1,552 results for "Convolutional neural network"
Search Results
2. A Combined CNN Architecture for Speech Emotion Recognition.
- Author
- Begazo, Rolinson, Aguilera, Ana, Dongo, Irvin, and Cardinale, Yudith
- Subjects
- *CONVOLUTIONAL neural networks, *EMOTION recognition, *SPEECH perception, *FEATURE selection, *EMOTIONS
- Abstract
Emotion recognition through speech is a technique employed in various scenarios of Human–Computer Interaction (HCI). Existing approaches have achieved significant results; however, limitations persist, with the quantity and diversity of data being more notable when deep learning techniques are used. The lack of a standard in feature selection leads to continuous development and experimentation. Choosing and designing the appropriate network architecture constitutes another challenge. This study addresses the challenge of recognizing emotions in the human voice using deep learning techniques, proposing a comprehensive approach, and developing preprocessing and feature selection stages while constructing a dataset called EmoDSc as a result of combining several available databases. The synergy between spectral features and spectrogram images is investigated. Independently, the weighted accuracy obtained using only spectral features was 89%, while using only spectrogram images, the weighted accuracy reached 90%. These results, although surpassing previous research, highlight the strengths and limitations when operating in isolation. Based on this exploration, a neural network architecture composed of a CNN1D, a CNN2D, and an MLP that fuses spectral features and spectrogram images is proposed. The model, supported by the unified dataset EmoDSc, demonstrates a remarkable accuracy of 96%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
3. New Fault Diagnosis Method for Rolling Bearings Based on Improved Residual Shrinkage Network Combined with Transfer Learning.
- Author
- Sun, Tieyang and Gao, Jianxiong
- Subjects
- *CONVOLUTIONAL neural networks, *ROLLER bearings, *RANDOM noise theory, *ROLLING friction, *SIGNAL processing, *FAULT diagnosis, *WHITE noise
- Abstract
The fault diagnosis of rolling bearings is faced with the problem of a lack of fault data. Currently, fault diagnosis based on traditional convolutional neural networks decreases the diagnosis rate. In this paper, the developed adaptive residual shrinkage network model is combined with transfer learning to solve the above problems. The model is trained on the Case Western Reserve dataset, and then the trained model is migrated to a small-sample dataset with a scaled-down sample size and the Jiangnan University bearing dataset to conduct the experiments. The experimental results show that the proposed method can efficiently learn from small-sample datasets, improving the accuracy of the fault diagnosis of bearings under variable loads and variable speeds. The adaptive parameter-rectified linear unit is utilized to adapt the nonlinear transformation. When rolling bearings are in operation, noise production is inevitable. In this paper, soft thresholding and an attention mechanism are added to the model, which can effectively process vibration signals with strong noise. In this paper, the real noise is simulated by adding Gaussian white noise in migration task experiments on small-sample datasets. The experimental results show that the algorithm has noise resistance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
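The soft thresholding mentioned in the abstract of result 3 can be illustrated with a minimal sketch. It shows only the element-wise shrinkage operator commonly used for denoising; the function name `soft_threshold` and the fixed threshold `tau` are assumptions made for illustration, and the paper's own network, which may well learn its thresholds, is not reproduced here.

```python
def soft_threshold(values, tau):
    """Shrink each value toward zero by tau; values within [-tau, tau] become 0.

    `tau` is a hypothetical fixed threshold chosen by the caller.
    """
    if tau < 0:
        raise ValueError("tau must be non-negative")
    shrunk = []
    for v in values:
        magnitude = abs(v) - tau
        if magnitude <= 0:
            # Small coefficients are treated as noise and zeroed out.
            shrunk.append(0.0)
        elif v > 0:
            shrunk.append(magnitude)
        else:
            shrunk.append(-magnitude)
    return shrunk
```

Calling `soft_threshold(signal, tau)` with a larger `tau` removes more low-amplitude content, at the cost of also shrinking the features that survive.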
4. Ultrasonic Assessment of Liver Fibrosis Using One-Dimensional Convolutional Neural Networks Based on Frequency Spectra of Radiofrequency Signals with Deep Learning Segmentation of Liver Regions in B-Mode Images: A Feasibility Study.
- Author
- Ai, Haiming, Huang, Yong, Tai, Dar-In, Tsui, Po-Hsiang, and Zhou, Zhuhuang
- Subjects
- *CONVOLUTIONAL neural networks, *HEPATIC fibrosis, *DEEP learning, *FREQUENCY spectra, *LIVER biopsy
- Abstract
The early detection of liver fibrosis is of significant importance. Deep learning analysis of ultrasound backscattered radiofrequency (RF) signals is emerging for tissue characterization as the RF signals carry abundant information related to tissue microstructures. However, the existing methods only used the time-domain information of the RF signals for liver fibrosis assessment, and the liver region of interest (ROI) is outlined manually. In this study, we proposed an approach for liver fibrosis assessment using deep learning models on ultrasound RF signals. The proposed method consisted of two-dimensional (2D) convolutional neural networks (CNNs) for automatic liver ROI segmentation from reconstructed B-mode ultrasound images and one-dimensional (1D) CNNs for liver fibrosis stage classification based on the frequency spectra (amplitude, phase, and power) of the segmented ROI signals. The Fourier transform was used to obtain the three kinds of frequency spectra. Two classical 2D CNNs were employed for liver ROI segmentation: U-Net and Attention U-Net. ROI spectrum signals were normalized and augmented using a sliding window technique. Ultrasound RF signals collected (with a 3-MHz transducer) from 613 participants (Group A) were included for liver ROI segmentation and those from 237 participants (Group B) for liver fibrosis stage classification, with a liver biopsy as the reference standard (Fibrosis stage: F0 = 27, F1 = 49, F2 = 51, F3 = 49, F4 = 61). In the test set of Group A, U-Net and Attention U-Net yielded Dice similarity coefficients of 95.05% and 94.68%, respectively. In the test set of Group B, the 1D CNN performed the best when using ROI phase spectrum signals to evaluate liver fibrosis stages ≥F1 (area under the receiver operating characteristic curve, AUC: 0.957; accuracy: 89.19%; sensitivity: 85.17%; specificity: 93.75%), ≥F2 (AUC: 0.808; accuracy: 83.34%; sensitivity: 87.50%; specificity: 78.57%), and ≥F4 (AUC: 0.876; accuracy: 85.71%; sensitivity: 77.78%; specificity: 94.12%), and when using the power spectrum signals to evaluate ≥F3 (AUC: 0.729; accuracy: 77.14%; sensitivity: 77.27%; specificity: 76.92%). The experimental results demonstrated the feasibility of both the 2D and 1D CNNs in liver parenchyma detection and liver fibrosis characterization. The proposed methods have provided a new strategy for liver fibrosis assessment based on ultrasound RF signals, especially for early fibrosis detection. The findings of this study shed light on deep learning analysis of ultrasound RF signals in the frequency domain with automatic ROI segmentation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
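The three frequency spectra named in the abstract of result 4 (amplitude, phase, and power) can be sketched with a plain discrete Fourier transform. This is a minimal illustration only: the function names `dft` and `spectra` are hypothetical, the power spectrum is assumed to be the squared magnitude without any normalization, and the naive O(n²) transform stands in for whatever FFT routine the authors used.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def spectra(signal):
    """Return (amplitude, phase, power) lists, one value per frequency bin."""
    coeffs = dft(signal)
    amplitude = [abs(c) for c in coeffs]
    phase = [cmath.phase(c) for c in coeffs]  # radians in [-pi, pi]
    power = [a * a for a in amplitude]        # assumed definition: |X[k]|^2
    return amplitude, phase, power
```

Each of the three returned lists could then serve as a one-dimensional input sequence, which is the kind of data a 1D CNN consumes.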
5. A Deep Learning Approach to Distance Map Generation Applied to Automatic Fiber Diameter Computation from Digital Micrographs.
- Author
- Alejo Huarachi, Alain M. and Beltrán Castañón, César A.
- Subjects
- *CONVOLUTIONAL neural networks, *TEXTILE fiber industry, *SYNTHETIC textiles, *COMPUTER vision, *ANIMAL fibers, *DEEP learning
- Abstract
Precise measurement of fiber diameter in animal and synthetic textiles is crucial for quality assessment and pricing; however, traditional methods often struggle with accuracy, particularly when fibers are densely packed or overlapping. Current computer vision techniques, while useful, have limitations in addressing these challenges. This paper introduces a novel deep-learning-based method to automatically generate distance maps of fiber micrographs, enabling more accurate fiber segmentation and diameter calculation. Our approach utilizes a modified U-Net architecture, trained on both real and simulated micrographs, to regress distance maps. This allows for the effective separation of individual fibers, even in complex scenarios. The model achieves a mean absolute error (MAE) of 0.1094 and a mean square error (MSE) of 0.0711, demonstrating its effectiveness in accurately measuring fiber diameters. This research highlights the potential of deep learning to revolutionize fiber analysis in the textile industry, offering a more precise and automated solution for quality control and pricing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Shape Classification Using a Single Seal-Whisker-Style Sensor Based on the Neural Network Method.
- Author
- Mao, Yitian, Lv, Yingxue, Wang, Yaohong, Yuan, Dekui, Liu, Luyao, Song, Ziyu, and Ji, Chunning
- Subjects
- *CONVOLUTIONAL neural networks, *HARBOR seal, *FOURIER analysis, *AQUATIC animals, *SIGNAL sampling
- Abstract
Seals, sea lions, and other aquatic animals rely on their whiskers to identify and track underwater targets, offering valuable inspiration for the development of low-power, portable, and environmentally friendly sensors. Here, we design a single seal-whisker-like cylinder and conduct experiments to measure the forces acting on it with nine different upstream targets. Using sample sets constructed from these force signals, a convolutional neural network (CNN) is trained and tested. The results demonstrate that combining the seal-whisker-style sensor with a CNN enables the identification of objects in the water in most cases, although there may be some confusion for certain targets. Increasing the length of the signal samples can enhance the results but may not eliminate these confusions. Our study reveals that high frequencies (greater than 5 Hz) are irrelevant in our model. Lift signals present more distinct and distinguishable features than drag signals, serving as the primary basis for the model to differentiate between various targets. Fourier analysis indicates that the model's efficacy in recognizing different targets relies heavily on the discrepancies in the spectral features of the lift signals. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Doppler Radar Sensor-Based Fall Detection Using a Convolutional Bidirectional Long Short-Term Memory Model.
- Author
- Li, Zhikun, Du, Jiajun, Zhu, Baofeng, Greenwald, Stephen E., Xu, Lisheng, Yao, Yudong, and Bao, Nan
- Subjects
- *CONVOLUTIONAL neural networks, *DOPPLER radar, *DEEP learning, *FREQUENCY spectra, *QUALITY of life
- Abstract
Falls among the elderly are a common and serious health risk that can lead to physical injuries and other complications. To promptly detect and respond to fall events, radar-based fall detection systems have gained widespread attention. In this paper, a deep learning model is proposed based on the frequency spectrum of radar signals, called the convolutional bidirectional long short-term memory (CB-LSTM) model. The introduction of the CB-LSTM model enables the fall detection system to capture both temporal sequential and spatial features simultaneously, thereby enhancing the accuracy and reliability of the detection. Extensive comparison experiments demonstrate that our model achieves an accuracy of 98.83% in detecting falls, surpassing other relevant methods currently available. In summary, this study provides effective technical support using the frequency spectrum and deep learning methods to monitor falls among the elderly through the design and experimental validation of a radar-based fall detection system, which has great potential for improving quality of life for the elderly and providing timely rescue measures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Exploring the Processing Paradigm of Input Data for End-to-End Deep Learning in Tool Condition Monitoring.
- Author
- Wang, Chengguan, Wang, Guangping, Wang, Tao, Xiong, Xiyao, Ouyang, Zhongchuan, and Gong, Tao
- Subjects
- *CONVOLUTIONAL neural networks, *MACHINE learning, *COMPUTER input design, *STANDARD deviations, *SIGNAL processing, *DEEP learning
- Abstract
Tool condition monitoring technology is an indispensable part of intelligent manufacturing. Most current research focuses on complex signal processing techniques or advanced deep learning algorithms to improve prediction performance without fully leveraging the end-to-end advantages of deep learning. The challenge lies in transforming multi-sensor raw data into input data suitable for direct model feeding, all while minimizing data scale and preserving sufficient temporal interpretation of tool wear. However, there is no clear reference standard for this so far. In light of this, this paper innovatively explores the processing methods that transform raw data into input data for deep learning models, a process known as an input paradigm. This paper introduces three new input paradigms: the downsampling paradigm, the periodic paradigm, and the subsequence paradigm. Then an improved hybrid model that combines a convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM) was employed to validate the model's performance. The subsequence paradigm demonstrated considerable superiority in prediction results based on the PHM2010 dataset, as the newly generated time series maintained the integrity of the raw data. Further investigation revealed that, with 120 subsequences and the temporal indicator being the maximum value, the model's mean absolute error (MAE) and root mean square error (RMSE) were the lowest after threefold cross-validation, outperforming several classical and contemporary methods. The methods explored in this paper provide references for designing input data for deep learning models, helping to enhance the end-to-end potential of deep learning models, and promoting the industrial deployment and practical application of tool condition monitoring systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Explainable Deep Learning-Based Feature Selection and Intrusion Detection Method on the Internet of Things.
- Author
- Chen, Xuejiao, Liu, Minyao, Wang, Zixuan, and Wang, Yun
- Subjects
- *CONVOLUTIONAL neural networks, *COMPUTER network traffic, *FEATURE selection, *DEEP learning, *TRAFFIC monitoring
- Abstract
With the rapid advancement of the Internet of Things, network security has garnered increasing attention from researchers. Applying deep learning (DL) has significantly enhanced the performance of Network Intrusion Detection Systems (NIDSs). However, due to its complexity and "black box" problem, deploying DL-based NIDS models in practical scenarios poses several challenges, including model interpretability and being lightweight. Feature selection (FS) in DL models plays a crucial role in minimizing model parameters and decreasing computational overheads while enhancing NIDS performance. Hence, selecting effective features remains a pivotal concern for NIDSs. In light of this, this paper proposes an interpretable feature selection method for encrypted traffic intrusion detection based on SHAP and causality principles. This approach utilizes the results of model interpretation for feature selection to reduce feature count while ensuring model reliability. We evaluate and validate our proposed method on two public network traffic datasets, CICIDS2017 and NSL-KDD, employing both a CNN and a random forest (RF). Experimental results demonstrate superior performance achieved by our proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Optimized Radio Frequency Footprint Identification Based on UAV Telemetry Radios.
- Author
- Tian, Yuan, Wen, Hong, Zhou, Jiaxin, Duan, Zhiqiang, and Li, Tao
- Subjects
- *CONVOLUTIONAL neural networks, *RADIO telemetry, *NO-fly zones, *RADIO frequency identification systems, *HUMAN fingerprints
- Abstract
With the widespread use of unmanned aerial vehicles (UAVs), the detection and identification of UAVs is a vital security issue for the safety of airspace and ground facilities in the no-fly zone. Telemetry radios are important wireless communication devices for UAVs, especially in UAVs beyond the visual line of sight (BVLOS) operating mode. This work focuses on the UAV identification approach using transient signals from UAV telemetry radios instead of the signals from UAV controllers that the former research work depended on. In our novel UAV Radio Frequency (RF) identification system framework based on telemetry radio signals, the EC-α algorithm is optimized to detect the starting point of the UAV transient signal and the detection accuracy at different signal-to-noise ratios (SNR) is evaluated. In the training stage, the Convolutional Neural Network (CNN) model is trained to extract features from raw I/Q data of the transient signals with different waveforms. Its architecture and hyperparameters are analyzed and optimized. In the identification stage, the extracted transient signals are clustered through the Self-Organizing Map (SOM) algorithm and the Clustering Signals Joint Identification (CSJI) algorithm is proposed to improve the accuracy of RF fingerprint identification. To evaluate the performance of our proposed approach, we design a testbed, including two UAVs as the flight platform, a Universal Software Radio Peripheral (USRP) as the receiver, and 20 telemetry radios with the same model as targets for identification. Indoor test results show that the optimized identification approach achieves an average accuracy of 92.3% at 30 dB. In comparison, the identification accuracy of SVM and KNN is 69.7% and 74.5%, respectively, at the same SNR condition. Extensive experiments are conducted outdoors to demonstrate the feasibility of this approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Fast, Zero-Reference Low-Light Image Enhancement with Camera Response Model.
- Author
- Wang, Xiaofeng, Huang, Liang, Li, Mingxuan, Han, Chengshan, Liu, Xin, and Nie, Ting
- Subjects
- *CONVOLUTIONAL neural networks, *IMAGE intensifiers, *CAMERAS, *RADIATION
- Abstract
Low-light images are prevalent in intelligent monitoring and many other applications, with low brightness hindering further processing. Although low-light image enhancement can reduce the influence of such problems, current methods often involve a complex network structure or many iterations, which are not conducive to their efficiency. This paper proposes a Zero-Reference Camera Response Network using a camera response model to achieve efficient enhancement for arbitrary low-light images. A double-layer parameter-generating network with a streamlined structure is established to extract the exposure ratio K from the radiation map, which is obtained by inverting the input through a camera response function. Then, K is used as the parameter of a brightness transformation function for one transformation on the low-light image to realize enhancement. In addition, a contrast-preserving brightness loss and an edge-preserving smoothness loss are designed without the requirement for references from the dataset. Both can further retain some key information in the inputs to improve precision. The enhancement is simplified and can reach more than twice the speed of similar methods. Extensive experiments on several LLIE datasets and the DARK FACE face detection dataset fully demonstrate our method's advantages, both subjectively and objectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. A Convolutional Neural Network with Multifrequency and Structural Similarity Loss Functions for Electromagnetic Imaging.
- Author
- Chiu, Chien-Ching, Lin, Che-Yu, Chi, Yu-Jen, Hsu, Hsiu-Hui, Chen, Po-Hsiang, and Jiang, Hao
- Subjects
- *CONVOLUTIONAL neural networks, *BACK propagation, *STANDARD deviations, *ARTIFICIAL intelligence, *MAGNETIC anomalies
- Abstract
In this paper, artificial intelligence (AI) technology is applied to the electromagnetic imaging of anisotropic objects. Advances in magnetic anomaly sensing systems and electromagnetic imaging use electromagnetic principles to detect and characterize subsurface or hidden objects. We use measured multifrequency scattered fields to calculate the initial dielectric constant distribution of anisotropic objects through the backpropagation scheme (BPS). Later, the estimated multifrequency permittivity distribution is input to a convolutional neural network (CNN) for the adaptive moment estimation (ADAM) method to reconstruct a more accurate image. In the meantime, we also improve the definition of loss function in the CNN. Numerical results show that the improved loss function unifying the structural similarity index measure (SSIM) and root mean square error (RMSE) can effectively enhance image quality. In our simulation environment, noise interference is considered for both TE (transverse electric) and TM (transverse magnetic) waves to reconstruct anisotropic scatterers. Lastly, we conclude that multifrequency reconstructions are more stable and precise than single-frequency reconstructions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Dune Morphology Classification and Dataset Construction Method Based on Unmanned Aerial Vehicle Orthoimagery.
- Author
- Li, Ming, Yang, Zekun, Yan, Jiehua, Li, Haoran, and Ye, Wangzhong
- Subjects
- *CONVOLUTIONAL neural networks, *IMAGE recognition (Computer vision), *DRONE aircraft, *DATA augmentation, *FIELD research, *SAND dunes
- Abstract
Dunes are the primary geomorphological type in deserts, and the distribution of dune morphologies is of significant importance for studying regional characteristics, formation mechanisms, and evolutionary processes. Traditional dune morphology classification methods rely on visual interpretation by humans, which is not only time-consuming and inefficient but also subjective in classification judgment. These issues have impeded the intelligent development of dune morphology classification. However, convolutional neural network (CNN) models exhibit robust feature representation capabilities for images and have achieved excellent results in image classification, providing a new method for studying dune morphology classification. Therefore, this paper summarizes five typical dune morphologies in the deserts of western Inner Mongolia, which can be used to define and describe most of the dune types in Chinese deserts. Subsequently, field surveys and the experimental collection of unmanned aerial vehicle (UAV) orthoimages for different dune types were conducted. Five different types of dune morphology datasets were constructed through manual segmentation, automatic rule segmentation, random screening, and data augmentation. Finally, the classification of dune morphologies and the exploration of dataset construction methods were conducted using the VGG16 and VGG19 CNN models. The classification results of dune morphologies were comprehensively analyzed using different evaluation metrics. The experimental results indicate that when the regular segmentation scale of UAV orthoimages is 1024 × 1024 pixels with an overlap of 100 pixels, the classification accuracy, precision, recall, and F1-Score of the VGG16 model reached 97.05%, 96.91%, 96.76%, and 96.82%, respectively. The method for constructing a dune morphology dataset from automatically segmented UAV orthoimages provides a reference value for the study of large-scale dune morphology classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
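The regular segmentation described in the abstract of result 13 (1024 × 1024 pixel tiles with an overlap of 100 pixels) can be sketched as a computation of tile coordinates. The function names `tile_origins` and `tile_boxes` are hypothetical, and the handling of the image border, where a final tile is shifted back so that it ends exactly at the edge, is an assumption rather than the authors' documented rule.

```python
def tile_origins(length, tile=1024, overlap=100):
    """Start offsets of tiles along one image axis of `length` pixels."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if length < tile:
        return []  # image too small for a single full tile
    stride = tile - overlap  # 924 pixels for the default settings
    origins = list(range(0, length - tile + 1, stride))
    # Assumed border rule: add one last tile flush with the image edge,
    # which overlaps its neighbour by more than `overlap` pixels.
    if origins[-1] + tile < length:
        origins.append(length - tile)
    return origins


def tile_boxes(width, height, tile=1024, overlap=100):
    """Return (left, top, right, bottom) boxes covering the image row by row."""
    return [
        (x, y, x + tile, y + tile)
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]
```

Each returned box can be passed to an image library's crop routine; the boxes use the exclusive right and bottom convention.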
14. An Empirical Study on the Effect of Training Data Perturbations on Neural Network Robustness.
- Author
- Wang, Jie, Wu, Zili, Lu, Minyan, and Ai, Jun
- Subjects
- *CONVOLUTIONAL neural networks, *DATA augmentation, *INSTRUCTIONAL systems, *EMPIRICAL research, *DATA modeling, *DEEP learning
- Abstract
The vulnerability of modern neural networks to random noise and deliberate attacks has raised concerns about their robustness, particularly as they are increasingly utilized in safety- and security-critical applications. Although recent research efforts were made to enhance robustness through retraining with adversarial examples or employing data augmentation techniques, a comprehensive investigation into the effects of training data perturbations on model robustness remains lacking. This paper presents the first extensive empirical study investigating the influence of data perturbations during model retraining. The experimental analysis focuses on both random and adversarial robustness, following established practices in the field of robustness analysis. Various types of perturbations in different aspects of the dataset are explored, including input, label, and sampling distribution. Single-factor and multi-factor experiments are conducted to assess individual perturbations and their combinations. The findings provide insights into constructing high-quality training datasets for optimizing robustness and recommend the appropriate degree of training set perturbations that balance robustness and correctness. They also contribute to understanding model robustness in deep learning and offer practical guidance for enhancing model performance through perturbed retraining, promoting the development of more reliable and trustworthy deep learning systems for safety-critical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Integrating the Capsule-like Smart Aggregate-Based EMI Technique with Deep Learning for Stress Assessment in Concrete.
- Author
- Ta, Quoc-Bao, Pham, Quang-Quang, Pham, Ngoc-Lan, and Kim, Jeong-Tae
- Subjects
- *DEEP learning, *CONVOLUTIONAL neural networks, *CONCRETE, *COMPRESSION loads, *DEGREES of freedom
- Abstract
This study presents a concrete stress monitoring method utilizing 1D CNN deep learning of raw electromechanical impedance (EMI) signals measured with a capsule-like smart aggregate (CSA) sensor. Firstly, the CSA-based EMI measurement technique is presented by depicting a prototype of the CSA sensor and a 2 degrees of freedom (2 DOFs) EMI model for the CSA sensor embedded in a concrete cylinder. Secondly, the 1D CNN deep regression model is designed to adapt raw EMI responses from the CSA sensor for estimating concrete stresses. Thirdly, a CSA-embedded cylindrical concrete structure is experimented with to acquire EMI responses under various compressive loading levels. Finally, the feasibility and robustness of the 1D CNN model are evaluated for noise-contaminated EMI data and untrained stress EMI cases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Deep Learning Soft-Decision GNSS Multipath Detection and Mitigation.
- Author
- Nunes, Fernando and Sousa, Fernando
- Subjects
- *CONVOLUTIONAL neural networks, *GLOBAL Positioning System, *DISCRETE Fourier transforms, *ARTIFICIAL satellites in navigation, *CORRELATORS, *DEEP learning
- Abstract
A technique is proposed to detect the presence of the multipath effect in Global Navigation Satellite System (GNSS) signals using a convolutional neural network (CNN) as the building block. The network is trained and validated, for a wide range of C/N0 values, with a realistic dataset constituted by the synthetic noisy outputs of a 2D grid of correlators associated with different Doppler frequencies and code delays (time-domain dataset). Multipath-disturbed signals are generated in agreement with the various scenarios encompassed by the adopted multipath model. It was found that pre-processing the outputs of the correlators grid with the two-dimensional Discrete Fourier Transform (frequency-domain dataset) enables the CNN to improve the accuracy relative to the time-domain dataset. Depending on the kind of CNN outputs, two strategies can then be devised to solve the equation of navigation: either remove the disturbed signal from the equation (hard decision) or process the pseudoranges with a weighted least-squares algorithm, where the entries of the weighting matrix are computed using the analog outputs of the neural network (soft decision). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
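The soft-decision strategy described in the abstract of result 16 can be sketched as a weighted least-squares solve. This is a minimal illustration under stated assumptions: the names `solve_linear` and `weighted_least_squares` are hypothetical, the system is taken to be already linearized, and the weight of each measurement is assumed to be one minus the network's multipath probability, which is not necessarily the mapping the authors use.

```python
def solve_linear(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def weighted_least_squares(design, observations, multipath_probs):
    """Minimize sum_i w_i * (y_i - a_i . x)^2 with assumed weights w_i = 1 - p_i."""
    weights = [1.0 - p for p in multipath_probs]
    n = len(design[0])
    # Normal equations: (A^T W A) x = A^T W y, with W diagonal.
    normal = [
        [sum(w * row[i] * row[j] for w, row in zip(weights, design)) for j in range(n)]
        for i in range(n)
    ]
    rhs = [
        sum(w * row[i] * y for w, row, y in zip(weights, design, observations))
        for i in range(n)
    ]
    return solve_linear(normal, rhs)
```

Under this weighting, a probability of 1.0 gives a measurement zero weight, which removes it from the solution in the same way as the hard decision.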
17. Multi-Directional Long-Term Recurrent Convolutional Network for Road Situation Recognition.
- Author
- Dofitas Jr., Cyreneo, Gil, Joon-Min, and Byun, Yung-Cheol
- Subjects
- *ARTIFICIAL neural networks, *RECURRENT neural networks, *PEDESTRIANS, *ROAD safety measures, *CONVOLUTIONAL neural networks, *DEEP learning
- Abstract
Understanding road conditions is essential for implementing effective road safety measures and driving solutions. Road situations encompass the day-to-day conditions of roads, including the presence of vehicles and pedestrians. Surveillance cameras strategically placed along streets have been instrumental in monitoring road situations and providing valuable information on pedestrians, moving vehicles, and objects within road environments. However, these video data and information are stored in large volumes, making analysis tedious and time-consuming. Deep learning models are increasingly utilized to monitor vehicles and identify and evaluate road and driving comfort situations. However, the current neural network model requires the recognition of situations using time-series video data. In this paper, we introduced a multi-directional detection model for road situations to uphold high accuracy. Deep learning methods often integrate long short-term memory (LSTM) into long-term recurrent network architectures. This approach effectively combines recurrent neural networks to capture temporal dependencies and convolutional neural networks (CNNs) to extract features from extensive video data. In our proposed method, we form a multi-directional long-term recurrent convolutional network approach with two groups equipped with CNN and two layers of LSTM. Additionally, we compare road situation recognition using convolutional neural networks, long short-term networks, and long-term recurrent convolutional networks. The paper presents a method for detecting and recognizing multi-directional road contexts using a modified LRCN. After balancing the dataset through data augmentation, the number of video files increased, resulting in our model achieving 91% accuracy, a significant improvement from the original dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Acoustic Signal-Based Defect Identification for Directed Energy Deposition-Arc Using Wavelet Time–Frequency Diagrams.
- Author
- Zhang, Hui, Wu, Qianru, Tang, Wenlai, and Yang, Jiquan
- Subjects
- *CONVOLUTIONAL neural networks, *MANUFACTURING defects, *ENERGY consumption, *MANUFACTURING processes, *IDENTIFICATION, *WAVELET transforms
- Abstract
Several advantages of directed energy deposition-arc (DED-arc) have garnered considerable research attention including high deposition rates and low costs. However, defects such as discontinuity and pores may occur during the manufacturing process. Defect identification is the key to monitoring and quality assessments of the additive manufacturing process. This study proposes a novel acoustic signal-based defect identification method for DED-arc via wavelet time–frequency diagrams. With the continuous wavelet transform, one-dimensional (1D) acoustic signals acquired in situ during manufacturing are converted into two-dimensional (2D) time–frequency diagrams to train, validate, and test the convolutional neural network (CNN) models. In this study, several CNN models were examined and compared, including AlexNet, ResNet-18, VGG-16, and MobileNetV3. The accuracy of the models was 96.35%, 97.92%, 97.01%, and 98.31%, respectively. The findings demonstrate that the energy distribution of normal and abnormal acoustic signals has significant differences in both the time and frequency domains. The proposed method is verified to identify defects effectively in the manufacturing process and advance the identification time. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. mmWave-RM: A Respiration Monitoring and Pattern Classification System Based on mmWave Radar.
- Author
- Hao, Zhanjun, Wang, Yue, Li, Fenfang, Ding, Guozhen, and Gao, Yifei
- Subjects
- *CONVOLUTIONAL neural networks, *RESPIRATION, *RADAR, *SUPPORT vector machines, *VENTILATION monitoring, *PATIENT monitoring, *FEATURE extraction
- Abstract
Breathing is one of the body's most basic functions and abnormal breathing can indicate underlying cardiopulmonary problems. Monitoring respiratory abnormalities can help with early detection and reduce the risk of cardiopulmonary diseases. In this study, a 77 GHz frequency-modulated continuous wave (FMCW) millimetre-wave (mmWave) radar was used to detect different types of respiratory signals from the human body in a non-contact manner for respiratory monitoring (RM). To solve the problem of noise interference in the daily environment on the recognition of different breathing patterns, the system utilised breathing signals captured by the millimetre-wave radar. Firstly, we filtered out most of the static noise using a signal superposition method and designed an elliptical filter to obtain a more accurate image of the breathing waveforms between 0.1 Hz and 0.5 Hz. Secondly, combined with the histogram of oriented gradient (HOG) feature extraction algorithm, K-nearest neighbours (KNN), convolutional neural network (CNN), and HOG support vector machine (G-SVM) were used to classify four breathing modes, namely, normal breathing, slow and deep breathing, quick breathing, and meningitic breathing. The overall accuracy reached up to 94.75%. Therefore, this study effectively supports daily medical monitoring. [ABSTRACT FROM AUTHOR]
- Published
- 2024
20. CSMC: A Secure and Efficient Visualized Malware Classification Method Inspired by Compressed Sensing.
- Author
-
Wu, Wei, Peng, Haipeng, Zhu, Haotian, and Zhang, Derun
- Subjects
- *
DEEP learning , *COMPRESSED sensing , *MALWARE , *INDUSTRIAL robots , *CONVOLUTIONAL neural networks , *INTELLIGENT sensors - Abstract
With the rapid development of the Internet of Things (IoT), the sophistication and intelligence of sensors are continually evolving, playing increasingly important roles in smart homes, industrial automation, and remote healthcare. However, these intelligent sensors face many security threats, particularly from malware attacks. Identifying and classifying malware is crucial for preventing such attacks. As the number of sensors and their applications grow, malware targeting sensors proliferates. Processing massive malware samples is challenging due to limited bandwidth and resources in IoT environments. Therefore, compressing malware samples before transmission and classification can improve efficiency. Additionally, sharing malware samples between classification participants poses security risks, necessitating methods that prevent sample exploitation. Moreover, the complex network environments also necessitate robust classification methods. To address these challenges, this paper proposes CSMC (Compressed Sensing Malware Classification), an efficient malware classification method based on compressed sensing. This method compresses malware samples before sharing and classification, thus facilitating more effective sharing and processing. By introducing deep learning, the method can extract malware family features during compression, which classical methods cannot achieve. Furthermore, the irreversibility of the method enhances security by preventing classification participants from exploiting malware samples. Experimental results demonstrate that for malware targeting Windows and Android operating systems, CSMC outperforms many existing methods based on compressed sensing and machine or deep learning. Additionally, experiments on sample reconstruction and noise demonstrate CSMC's capabilities in terms of security and robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2024
21. Real-Time Tool Localization for Laparoscopic Surgery Using Convolutional Neural Network.
- Author
-
Benavides, Diego, Cisnal, Ana, Fontúrbel, Carlos, de la Fuente, Eusebio, and Fraile, Juan Carlos
- Subjects
- *
CONVOLUTIONAL neural networks , *LAPAROSCOPIC surgery , *SURGICAL equipment , *SURGICAL robots , *OPERATIVE surgery , *ARTIFICIAL intelligence - Abstract
Partially automated robotic systems, such as camera holders, represent a pivotal step towards enhancing efficiency and precision in surgical procedures. Therefore, this paper introduces an approach for real-time tool localization in laparoscopy surgery using convolutional neural networks. The proposed model, based on two Hourglass modules in series, can localize up to two surgical tools simultaneously. This study utilized three datasets: the ITAP dataset, alongside two publicly available datasets, namely Atlas Dione and EndoVis Challenge. Three variations of the Hourglass-based models were proposed, with the best model achieving high accuracy (92.86%) and frame rates (27.64 FPS), suitable for integration into robotic systems. An evaluation on an independent test set yielded slightly lower accuracy, indicating limited generalizability. The model was further analyzed using the Grad-CAM technique to gain insights into its functionality. Overall, this work presents a promising solution for automating aspects of laparoscopic surgery, potentially enhancing surgical efficiency by reducing the need for manual endoscope manipulation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
22. E-Nose: Time–Frequency Attention Convolutional Neural Network for Gas Classification and Concentration Prediction.
- Author
-
Jiang, Minglv, Li, Na, Li, Mingyong, Wang, Zhou, Tian, Yuan, Peng, Kaiyan, Sheng, Haoran, Li, Haoyu, and Li, Qiang
- Subjects
- *
CONVOLUTIONAL neural networks , *PATTERN recognition systems , *ELECTRONIC noses , *DATA augmentation , *GAS detectors - Abstract
In electronic nose (E-nose) systems, gas type recognition and accurate concentration prediction are some of the most challenging issues. This study introduced an innovative pattern recognition method of time–frequency attention convolutional neural network (TFA-CNN). A time–frequency attention block was designed in the network, aiming to excavate and effectively integrate the temporal and frequency domain information in the E-nose signals to enhance the performance of gas classification and concentration prediction tasks. Additionally, a novel data augmentation strategy was developed, manipulating the feature channels and time dimensions to reduce the interference of sensor drift and redundant information, thereby enhancing the model's robustness and adaptability. Utilizing two types of metal-oxide-semiconductor gas sensors, this research conducted qualitative and quantitative analysis on five target gases. The evaluation results showed that the classification accuracy could reach 100%, and the coefficient of determination ($R^2$) score of the regression task was up to 0.99. The Pearson correlation coefficient (r) was 0.99, and the mean absolute error (MAE) was 1.54 ppm. The experimental test results were almost consistent with the system predictions, and the MAE was 1.39 ppm. This study provides a method of network learning that combines time–frequency domain information, exhibiting high performance in gas classification and concentration prediction within the E-nose system. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. Research on FBG Tactile Sensing Shape Recognition Based on Convolutional Neural Network.
- Author
-
Lu, Guan, Shen, Zhihui, Cai, Ting, and Xu, Yiming
- Subjects
- *
CONVOLUTIONAL neural networks , *SUPPORT vector machines , *K-nearest neighbor classification , *TACTILE sensors , *FORM perception - Abstract
Shape recognition plays a significant role in the field of robot perception. In view of the low efficiency and few types of shape recognition of the fiber tactile sensor applied to flexible skin, a convolutional-neural-network-based FBG tactile sensing array shape recognition method was proposed. Firstly, a sensing array was fabricated using flexible resin and 3D printing technology. Secondly, a shape recognition system based on the tactile sensing array was constructed to collect shape data. Finally, shape classification recognition was performed using convolutional neural network, random forest, support vector machine, and k-nearest neighbor. The results indicate that the tactile sensing array exhibits good sensitivity and perception capability. The shape recognition accuracy of convolutional neural network is 96.58%, which is 6.11%, 9.44%, and 12.01% higher than that of random forest, k-nearest neighbor, and support vector machine. Its F1 is 96.95%, which is 6.3%, 8.73%, and 11.94% higher than random forest, k-nearest neighbor, and support vector machine. The research of FBG shape sensing array based on convolutional neural network provides an experimental basis for shape perception of flexible tactile sensing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
24. DCFNet: Infrared and Visible Image Fusion Network Based on Discrete Wavelet Transform and Convolutional Neural Network.
- Author
-
Wu, Dan, Wang, Yanzhi, Wang, Haoran, Wang, Fei, and Gao, Guowang
- Subjects
- *
CONVOLUTIONAL neural networks , *DISCRETE wavelet transforms , *IMAGE fusion , *WAVELET transforms , *INFRARED imaging , *FEATURE extraction , *DATA mining - Abstract
Aiming to address the issues of missing detailed information, the blurring of significant target information, and poor visual effects in current image fusion algorithms, this paper proposes an infrared and visible-light image fusion algorithm based on discrete wavelet transform and convolutional neural networks. Our backbone network is an autoencoder. A DWT layer is embedded in the encoder to optimize frequency-domain feature extraction and prevent information loss, and a bottleneck residual block and a coordinate attention mechanism are introduced to enhance the ability to capture and characterize the low- and high-frequency feature information; an IDWT layer is embedded in the decoder to achieve the feature reconstruction of the fused frequencies; the fusion strategy adopts the $l_1$-norm fusion strategy to integrate the encoder's output frequency mapping features; a weighted loss containing pixel loss, gradient loss, and structural loss is constructed for optimizing network training. DWT decomposes the image into sub-bands at different scales, including low-frequency sub-bands and high-frequency sub-bands. The low-frequency sub-bands contain the structural information of the image, which corresponds to the important target information, while the high-frequency sub-bands contain the detail information, such as edge and texture information. Through IDWT, the low-frequency sub-bands that contain important target information are synthesized with the high-frequency sub-bands that enhance the details, ensuring that the important target information and texture details are clearly visible in the reconstructed image. The whole process is able to reconstruct the information of different frequency sub-bands back into the image non-destructively, so that the fused image appears natural and harmonious visually. Experimental results on public datasets show that the fusion algorithm performs well according to both subjective and objective evaluation criteria and that the fused image is clearer and contains more scene information, which verifies the effectiveness of the algorithm, and the results of the generalization experiments also show that our network has good generalization ability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
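The decomposition into low- and high-frequency sub-bands and the lossless reconstruction described in the DCFNet abstract can be illustrated with a minimal sketch. It assumes a single-level 1D Haar wavelet on an even-length list; the abstract does not name the wavelet, and the paper works on 2D images, so `haar_dwt`, `haar_idwt`, and the sample row are illustrative only.

```python
import math

def haar_dwt(signal):
    """Single-level Haar DWT of an even-length list (illustrative assumption)."""
    s = math.sqrt(2.0)
    # Low-frequency (approximation) band: scaled pairwise sums.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # High-frequency (detail) band: scaled pairwise differences.
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

row = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]  # hypothetical pixel row
low, high = haar_dwt(row)
reconstructed = haar_idwt(low, high)
```

Up to floating-point rounding, `reconstructed` equals `row`, which is the sense in which the sub-bands can be recombined without loss.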
25. A Novel Method for Rolling Bearing Fault Diagnosis Based on Gramian Angular Field and CNN-ViT.
- Author
-
Zhou, Zijun, Ai, Qingsong, Lou, Ping, Hu, Jianmin, and Yan, Junwei
- Subjects
- *
FAULT diagnosis , *CONVOLUTIONAL neural networks , *ROLLER bearings , *TRANSFORMER models , *EDGE computing , *DIAGNOSIS methods - Abstract
Fault diagnosis is one of the important applications of edge computing in the Industrial Internet of Things (IIoT). To address the issue that traditional fault diagnosis methods often struggle to effectively extract fault features, this paper proposes a novel rolling bearing fault diagnosis method that integrates Gramian Angular Field (GAF), Convolutional Neural Network (CNN), and Vision Transformer (ViT). First, GAF is used to convert one-dimensional vibration signals from sensors into two-dimensional images, effectively retaining the fault features of the vibration signal. Then, the CNN branch is used to extract the local features of the image, which are combined with the global features extracted by the ViT branch to diagnose the bearing fault. The effectiveness of this method is validated with two datasets. Experimental results show that the proposed method achieves average accuracies of 99.79% and 99.63% on the CWRU and XJTU-SY rolling bearing fault datasets, respectively. Compared with several widely used fault diagnosis methods, the proposed method achieves higher accuracy for different fault classifications, providing reliable technical support for performing complex fault diagnosis on edge devices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
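The conversion of a one-dimensional vibration signal into a two-dimensional image mentioned in the abstract above can be sketched as follows. This is the summation variant of the Gramian Angular Field written with the standard library only; the abstract does not say which variant the authors use, and `vibration` is made-up sample data.

```python
import math

def gramian_angular_field(signal):
    """Gramian Angular Summation Field of a 1D sequence (list of floats)."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("signal must not be constant")
    # Rescale to [-1, 1] so that arccos is defined for every sample.
    scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in signal]
    # Clamp against floating-point drift, then map each sample to an angle.
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    # Entry (i, j) encodes the pair of samples i and j as cos(phi_i + phi_j).
    return [[math.cos(a + b) for b in phi] for a in phi]

vibration = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]  # hypothetical samples
gaf_image = gramian_angular_field(vibration)
```

The result is a symmetric N x N matrix that can be treated as a single-channel image and passed to an image classifier.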
26. Forecasting a Short-Term Photovoltaic Power Model Based on Improved Snake Optimization, Convolutional Neural Network, and Bidirectional Long Short-Term Memory Network.
- Author
-
Wang, Yonggang, Yao, Yilin, Zou, Qiuying, Zhao, Kaixing, and Hao, Yue
- Subjects
- *
CONVOLUTIONAL neural networks , *OPTIMIZATION algorithms , *PHOTOVOLTAIC power systems , *PEARSON correlation (Statistics) , *K-means clustering , *SNAKES , *STATISTICAL power analysis - Abstract
The precision of short-term photovoltaic power forecasts is of utmost importance for the planning and operation of the electrical grid system. To enhance the precision of short-term output power prediction in photovoltaic systems, this paper proposes a method integrating K-means clustering: an improved snake optimization algorithm with a convolutional neural network–bidirectional long short-term memory network to predict short-term photovoltaic power. Firstly, K-means clustering is utilized to categorize weather scenarios into three categories: sunny, cloudy, and rainy. The Pearson correlation coefficient method is then utilized to determine the inputs of the model. Secondly, the snake optimization algorithm is improved by introducing Tent chaotic mapping, lens imaging backward learning, and an optimal individual adaptive perturbation strategy to enhance its optimization ability. Then, the multi-strategy improved snake optimization algorithm is employed to optimize the parameters of the convolutional neural network–bidirectional long short-term memory network model, thereby augmenting the predictive precision of the model. Finally, the model established in this paper is utilized to forecast photovoltaic power in diverse weather scenarios. The simulation findings indicate that the regression coefficients of this method can reach 0.99216, 0.95772, and 0.93163 on sunny, cloudy, and rainy days, which has better prediction precision and adaptability under various weather conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
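Tent chaotic mapping, which the abstract above lists as one of the improvements to the snake optimizer, is commonly used to spread an initial population over the search space. The sketch below shows that idea under stated assumptions: the skew parameter `a=0.7`, the seed `x0`, and the two search ranges are hypothetical and do not come from the paper.

```python
def tent_sequence(x0, n, a=0.7):
    """Iterate the skew tent map n times starting from x0 in (0, 1)."""
    values = []
    x = x0
    for _ in range(n):
        # Rising branch below a, falling branch at or above a; both stay in [0, 1].
        x = x / a if x < a else (1.0 - x) / (1.0 - a)
        values.append(x)
    return values

def init_population(pop_size, lower, upper, x0=0.37, a=0.7):
    """Map a tent-map sequence onto per-dimension search bounds."""
    dim = len(lower)
    chaos = tent_sequence(x0, pop_size * dim, a)
    return [
        [lower[d] + chaos[i * dim + d] * (upper[d] - lower[d]) for d in range(dim)]
        for i in range(pop_size)
    ]

# Hypothetical ranges for two tuned hyperparameters.
population = init_population(5, lower=[0.0, 16.0], upper=[0.1, 128.0])
```

A skew value other than 0.5 is chosen here because the symmetric tent map collapses to zero after a few dozen iterations in binary floating point.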
27. Ultra-Wide Band Radar Empowered Driver Drowsiness Detection with Convolutional Spatial Feature Engineering and Artificial Intelligence.
- Author
-
Siddiqui, Hafeez Ur Rehman, Akmal, Ambreen, Iqbal, Muhammad, Saleem, Adil Ali, Raza, Muhammad Amjad, Zafar, Kainat, Zaib, Aqsa, Dudley, Sandra, Arambarri, Jon, Castilla, Ángel Kuc, and Rustam, Furqan
- Subjects
- *
ARTIFICIAL intelligence , *CONVOLUTIONAL neural networks , *GENERATIVE adversarial networks , *ULTRA-wideband radar , *DROWSINESS , *SUPPORT vector machines - Abstract
Driving while drowsy poses significant risks, including reduced cognitive function and the potential for accidents, which can lead to severe consequences such as trauma, economic losses, injuries, or death. The use of artificial intelligence can enable effective detection of driver drowsiness, helping to prevent accidents and enhance driver performance. This research aims to address the crucial need for real-time and accurate drowsiness detection to mitigate the impact of fatigue-related accidents. Leveraging ultra-wideband radar data collected over five minutes, the dataset was segmented into one-minute chunks and transformed into grayscale images. Spatial features are retrieved from the images using a two-dimensional Convolutional Neural Network. Following that, these features were used to train and test multiple machine learning classifiers. The ensemble classifier RF-XGB-SVM, which combines Random Forest, XGBoost, and Support Vector Machine using a hard voting criterion, performed admirably with an accuracy of 96.6%. Additionally, the proposed approach was validated with a robust k-fold score of 97% and a standard deviation of 0.018, demonstrating significant results. The dataset is augmented using Generative Adversarial Networks, resulting in improved accuracies for all models. Among them, the RF-XGB-SVM model outperformed the rest with an accuracy score of 99.58%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
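The hard voting criterion used to combine the three classifiers in the abstract above amounts to a per-sample majority vote. A minimal sketch, with made-up prediction lists standing in for the outputs of the Random Forest, XGBoost, and Support Vector Machine models:

```python
from collections import Counter

def hard_vote(predictions_per_model):
    """Majority vote over equally long per-model label lists."""
    n_samples = len(predictions_per_model[0])
    voted = []
    for i in range(n_samples):
        votes = [preds[i] for preds in predictions_per_model]
        # most_common(1) yields the label that received the most votes.
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted

# Hypothetical predictions for four radar segments.
rf_pred = ["drowsy", "alert", "alert", "drowsy"]
xgb_pred = ["drowsy", "drowsy", "alert", "alert"]
svm_pred = ["alert", "drowsy", "alert", "drowsy"]
ensemble_pred = hard_vote([rf_pred, xgb_pred, svm_pred])
```

With three voters and two classes a tie cannot occur, so no tie-breaking rule is needed in this setting.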
28. Efficient Haze Removal from a Single Image Using a DCP-Based Lightweight U-Net Neural Network Model.
- Author
-
Han, Yunho, Kim, Jiyoung, Lee, Jinyoung, Nah, Jae-Ho, Ho, Yo-Sung, and Park, Woo-Chan
- Subjects
- *
ARTIFICIAL neural networks , *CONVOLUTIONAL neural networks , *HAZE , *SIGNAL-to-noise ratio , *COMPUTATIONAL complexity - Abstract
In this paper, we propose a lightweight U-net architecture neural network model based on Dark Channel Prior (DCP) for efficient haze (fog) removal with a single input. The existing DCP requires high computational complexity in its operation. These computations are challenging to accelerate, and the problem is exacerbated when dealing with high-resolution images (videos), making it very difficult to apply to general-purpose applications. Our proposed model addresses this issue by employing a two-stage neural network structure, replacing the computationally complex operations of the conventional DCP with easily accelerated convolution operations to achieve high-quality fog removal. Furthermore, our proposed model is designed with an intuitive structure using a relatively small number of parameters (2M), utilizing resources efficiently. These features demonstrate the effectiveness and efficiency of the proposed model for fog removal. The experimental results show that the proposed neural network model achieves an average Peak Signal-to-Noise Ratio (PSNR) of 26.65 dB and a Structural Similarity Index Measure (SSIM) of 0.88, indicating an improvement in the average PSNR of 11.5 dB and in SSIM of 0.22 compared to the conventional DCP. This shows that the proposed neural network achieves comparable results to CNN-based neural networks that have achieved SOTA-class performance, despite its intuitive structure with a relatively small number of parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2024
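For context on the conventional DCP operation that the abstract above describes as computationally heavy, the dark channel of an image is the minimum over the colour channels followed by a minimum over a local patch. The sketch below is a plain reference version of that classical step, not the authors' network; the 3 x 3 image and the patch sizes are illustrative.

```python
def dark_channel(image, patch=3):
    """image: H x W nested list of (r, g, b) tuples with values in [0, 1]."""
    h, w = len(image), len(image[0])
    r = patch // 2
    # Step 1: per-pixel minimum over the three colour channels.
    min_rgb = [[min(px) for px in row] for row in image]
    dark = [[0.0] * w for _ in range(h)]
    # Step 2: minimum over a patch centred on each pixel (clipped at borders).
    for y in range(h):
        for x in range(w):
            dark[y][x] = min(
                min_rgb[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            )
    return dark

hazy = [[(0.8, 0.7, 0.9)] * 3 for _ in range(3)]  # hypothetical hazy pixels
hazy[1][1] = (0.2, 0.5, 0.6)                       # one darker pixel
dark = dark_channel(hazy, patch=3)
```

The nested minimum costs on the order of H x W x patch² comparisons in this naive form, which gives a sense of why the operation becomes expensive on high-resolution frames.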
29. CIRF: Coupled Image Reconstruction and Fusion Strategy for Deep Learning Based Multi-Modal Image Fusion.
- Author
-
Zheng, Junze, Xiao, Junyan, Wang, Yaowei, and Zhang, Xuming
- Subjects
- *
IMAGE fusion , *IMAGE reconstruction , *CONVOLUTIONAL neural networks , *TRANSFORMER models , *MULTIMODAL user interfaces , *DEEP learning , *FEATURE extraction - Abstract
Multi-modal medical image fusion (MMIF) is crucial for disease diagnosis and treatment because the images reconstructed from signals collected by different sensors can provide complementary information. In recent years, deep learning (DL) based methods have been widely used in MMIF. However, these methods often adopt a serial fusion strategy without feature decomposition, causing error accumulation and confusion of characteristics across different scales. To address these issues, we have proposed the Coupled Image Reconstruction and Fusion (CIRF) strategy. Our method parallels the image fusion and reconstruction branches which are linked by a common encoder. Firstly, CIRF uses the lightweight encoder to extract base and detail features, respectively, through the Vision Transformer (ViT) and the Convolutional Neural Network (CNN) branches, where the two branches interact to supplement information. Then, two types of features are fused separately via different blocks and finally decoded into fusion results. In the loss function, both the supervised loss from the reconstruction branch and the unsupervised loss from the fusion branch are included. As a whole, CIRF increases its expressivity by adding multi-task learning and feature decomposition. Additionally, we have also explored the impact of image masking on the network's feature extraction ability and validated the generalization capability of the model. Through experiments on three datasets, it has been demonstrated both subjectively and objectively, that the images fused by CIRF exhibit appropriate brightness and smooth edge transition with more competitive evaluation metrics than those fused by several other traditional and DL-based methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
30. Transformers for Remote Sensing: A Systematic Review and Analysis.
- Author
-
Wang, Ruikun, Ma, Lei, He, Guangjun, Johnson, Brian Alan, Yan, Ziyun, Chang, Ming, and Liang, Ying
- Subjects
- *
TRANSFORMER models , *REMOTE sensing , *CONVOLUTIONAL neural networks , *OPTICAL remote sensing , *DATABASES , *RECURRENT neural networks , *OBJECT recognition (Computer vision) - Abstract
Research on transformers in remote sensing (RS), which started to increase after 2021, is facing the problem of a relative lack of review. To understand the trends of transformers in RS, we undertook a quantitative analysis of the major research on transformers over the past two years by dividing the application of transformers into eight domains: land use/land cover (LULC) classification, segmentation, fusion, change detection, object detection, object recognition, registration, and others. Quantitative results show that transformers achieve a higher accuracy in LULC classification and fusion, with more stable performance in segmentation and object detection. Combining the analysis results on LULC classification and segmentation, we have found that transformers need more parameters than convolutional neural networks (CNNs). Additionally, further research is also needed regarding inference speed to improve transformers' performance. It was determined that the most common application scenes for transformers in our database are urban, farmland, and water bodies. We also found that transformers are employed in the natural sciences such as agriculture and environmental protection rather than the humanities or economics. Finally, this work summarizes the analysis results of transformers in remote sensing obtained during the research process and provides a perspective on future directions of development. [ABSTRACT FROM AUTHOR]
- Published
- 2024
31. Deep Learning-Based Nystagmus Detection for BPPV Diagnosis.
- Author
-
Mun, Sae Byeol, Kim, Young Jae, Lee, Ju Hyoung, Han, Gyu Cheol, Cho, Sung Ho, Jin, Seok, and Kim, Kwang Gi
- Subjects
- *
DEEP learning , *BENIGN paroxysmal positional vertigo , *NYSTAGMUS , *CONVOLUTIONAL neural networks - Abstract
In this study, we propose a deep learning-based nystagmus detection algorithm using video oculography (VOG) data to diagnose benign paroxysmal positional vertigo (BPPV). Various deep learning architectures were utilized to develop and evaluate nystagmus detection models. Among the four deep learning architectures used in this study, the CNN1D model proposed as a nystagmus detection model demonstrated the best performance, exhibiting a sensitivity of 94.06 ± 0.78%, specificity of 86.39 ± 1.31%, precision of 91.34 ± 0.84%, accuracy of 91.02 ± 0.66%, and an F1-score of 92.68 ± 0.55%. These results indicate the high accuracy and generalizability of the proposed nystagmus diagnosis algorithm. In conclusion, this study validates the practicality of deep learning in diagnosing BPPV and offers avenues for numerous potential applications of deep learning in the medical diagnostic sector. The findings of this research underscore its importance in enhancing diagnostic accuracy and efficiency in healthcare. [ABSTRACT FROM AUTHOR]
- Published
- 2024
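The five figures reported in the abstract above (sensitivity, specificity, precision, accuracy, F1-score) all derive from the same binary confusion matrix. The sketch below shows the standard definitions; the counts are invented for illustration and are not the study's data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall on the positive class
    specificity = tn / (tn + fp)          # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }

# Hypothetical counts: 100 nystagmus windows, 50 non-nystagmus windows.
metrics = binary_metrics(tp=90, fp=10, tn=40, fn=10)
```

Because the positive class is larger in this made-up example, accuracy sits between sensitivity and specificity, the same ordering seen in the reported results.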
32. Detection Method of Epileptic Seizures Using a Neural Network Model Based on Multimodal Dual-Stream Networks.
- Author
-
Wang, Baiyang, Xu, Yidong, Peng, Siyu, Wang, Hongjun, and Li, Fang
- Subjects
- *
ARTIFICIAL neural networks , *CONVOLUTIONAL neural networks , *EPILEPSY , *ELECTROENCEPHALOGRAPHY , *FREQUENCY spectra , *NEUROLOGICAL disorders - Abstract
Epilepsy is a common neurological disorder, and its diagnosis mainly relies on the analysis of electroencephalogram (EEG) signals. However, the raw EEG signals contain limited recognizable features, and in order to increase the recognizable features in the input of the network, the differential features of the signals, the amplitude spectrum and the phase spectrum in the frequency domain are extracted to form a two-dimensional feature vector. In order to solve the problem of recognizing multimodal features, a neural network model based on a multimodal dual-stream network is proposed, which uses a mixture of one-dimensional convolution, two-dimensional convolution and LSTM neural networks to extract the spatial features of the EEG two-dimensional vectors and the temporal features of the signals, respectively, and combines the advantages of the two networks, using the hybrid neural network to extract both the temporal and spatial features of the signals at the same time. In addition, a channel attention module was used to focus the model on features related to seizures. Finally, multiple sets of experiments were conducted on the Bonn and New Delhi data sets, and the highest accuracy rates of 99.69% and 97.5% were obtained on the test set, respectively, verifying the superiority of the proposed model in the task of epileptic seizure detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
33. Computer-Vision- and Deep-Learning-Based Determination of Flow Regimes, Void Fraction, and Resistance Sensor Data in Microchannel Flow Boiling.
- Author
-
Schepperle, Mark, Junaid, Shayan, and Woias, Peter
- Subjects
- *
POROSITY , *DIGITAL image processing , *CONVOLUTIONAL neural networks , *DEEP learning , *ANNULAR flow , *EBULLITION , *MICROCHANNEL flow , *IMAGE processing - Abstract
The aim of this article is to introduce a novel approach to identifying flow regimes and void fractions in microchannel flow boiling, which is based on binary image segmentation using digital image processing and deep learning. The proposed image processing pipeline uses adaptive thresholding, blurring, gamma correction, contour detection, and histogram comparison to separate vapor from liquid areas, while the deep learning method uses a customized version of a convolutional neural network (CNN) called U-net to extract meaningful features from video frames. Both approaches enabled the automatic detection of flow boiling conditions, such as bubbly, slug, and annular flow, as well as automatic void fraction calculation. The CNN in particular demonstrated its ability to deliver fast and dependable results, presenting an appealing alternative to manual feature extraction. The U-net-based CNN was able to segment flow boiling images with a Dice score of 99.1% and classify the above flow regimes with an overall classification accuracy of 91%. In addition, the neural network was able to predict resistance sensor readings from image data and assign them to a flow state with a mean squared error (MSE) below $10^{-6}$. [ABSTRACT FROM AUTHOR]
- Published
- 2024
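Two quantities from the abstract above can be computed directly from binary segmentation masks: the Dice score between a predicted and a reference mask, and a void fraction. The sketch assumes masks are nested lists with 1 for vapor and 0 for liquid, and it takes the void fraction to be the vapor share of the frame area, which is a simplifying assumption rather than the authors' stated definition.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    inter = sum(p & t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * inter / total

def void_fraction(mask):
    """Share of pixels labelled as vapor (area-based assumption)."""
    pixels = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / pixels

# Hypothetical 2 x 4 masks: 1 = vapor, 0 = liquid.
predicted = [[1, 1, 0, 0], [1, 1, 0, 0]]
reference = [[1, 1, 1, 0], [1, 0, 0, 0]]
dice = dice_score(predicted, reference)
alpha = void_fraction(predicted)
```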
34. An Interference Mitigation Method for FMCW Radar Based on Time–Frequency Distribution and Dual-Domain Fusion Filtering.
- Author
-
Zhou, Yu, Cao, Ronggang, Zhang, Anqi, and Li, Ping
- Subjects
- *
CONVOLUTIONAL neural networks , *ARTIFICIAL neural networks , *RADIO interference , *RADAR interference , *RADAR , *IMAGE reconstruction , *BISTATIC radar - Abstract
Radio frequency interference (RFI) significantly hampers the target detection performance of frequency-modulated continuous-wave radar. To address the problem and maintain the target echo signal, this paper proposes a priori assumption on the interference component nature in the radar received signal, as well as a method for interference estimation and mitigation via time–frequency analysis. The solution employs Fourier synchrosqueezed transform to implement the radar's beat signal transformation from time domain to time–frequency domain, thus converting the interference mitigation to the task of time–frequency distribution image restoration. The solution proposes the use of image processing based on the dual-tree complex wavelet transform and combines it with the spatial domain-based approach, thereby establishing a dual-domain fusion interference filter for time–frequency distribution images. This paper also presents a convolutional neural network model of structurally improved UNet++, which serves as the interference estimator. The proposed solution demonstrated its capability against various forms of RFI through the simulation experiment and showed a superior interference mitigation performance over other CNN model-based approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2024
35. Denoising and Baseline Correction Methods for Raman Spectroscopy Based on Convolutional Autoencoder: A Unified Solution.
- Author
-
Han, Ming, Dang, Yu, and Han, Jianda
- Subjects
- *
RAMAN spectroscopy , *CONVOLUTIONAL neural networks , *IMAGE denoising , *NOISE control , *SIGNAL processing - Abstract
Preprocessing plays a key role in Raman spectral analysis. However, classical preprocessing algorithms often have issues with reducing Raman peak intensities and changing the peak shape when processing spectra. This paper introduces a unified solution for preprocessing based on a convolutional autoencoder to enhance Raman spectroscopy data. One is a denoising algorithm that uses a convolutional denoising autoencoder (CDAE model), and the other is a baseline correction algorithm based on a convolutional autoencoder (CAE+ model). The CDAE model incorporates two additional convolutional layers in its bottleneck layer for enhanced noise reduction. The CAE+ model not only adds convolutional layers at the bottleneck but also includes a comparison function after the decoding for effective baseline correction. The proposed models were validated using both simulated spectra and experimental spectra measured with a Raman spectrometer system. Comparing their performance with that of traditional signal processing techniques, the results of the CDAE-CAE+ model show improvements in noise reduction and Raman peak preservation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
36. Workout Classification Using a Convolutional Neural Network in Ensemble Learning.
- Author
-
Bang, Gi-Seung and Park, Seung-Bo
- Subjects
- *
CONVOLUTIONAL neural networks , *PHYSICAL therapy services , *PHYSICAL fitness , *ARM exercises , *CLASSIFICATION , *HUMAN body - Abstract
To meet the increased demand for home workouts owing to the COVID-19 pandemic, this study proposes a new approach to real-time exercise posture classification based on the convolutional neural network (CNN) in an ensemble learning system. By utilizing MediaPipe, the proposed system extracts the joint coordinates and angles of the human body, which the CNN uses to learn the complex patterns of various exercises. Additionally, this new approach enhances classification performance by combining predictions from multiple image frames using an ensemble learning method. Infinity AI's Fitness Basic Dataset is employed for validation, and the experiments demonstrate high accuracy in classifying exercises such as arm raises, squats, and overhead presses. The proposed model demonstrated its ability to effectively classify exercise postures in real time, achieving high rates in accuracy (92.12%), precision (91.62%), recall (91.64%), and F1 score (91.58%). This indicates its potential application in personalized fitness recommendations and physical therapy services, showcasing the possibility for beneficial use in these fields. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
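As an illustration of the frame-level ensemble step described in entry 36, the following is a minimal sketch that combines per-frame class predictions by majority vote. The abstract does not say which ensemble rule the authors use, so majority voting, the function name `majority_vote`, and the sample labels are all assumptions made for this example.

```python
from collections import Counter


def majority_vote(frame_predictions):
    """Return the label predicted most often across frames.

    Ties are broken in favour of the label that appears first in the input.
    This voting rule is an assumption, not the authors' documented method.
    """
    if not frame_predictions:
        raise ValueError("need at least one frame prediction")
    counts = Counter(frame_predictions)
    best = max(counts.values())
    # Walk the input in order so that ties resolve deterministically.
    for label in frame_predictions:
        if counts[label] == best:
            return label


# Hypothetical per-frame outputs of a posture classifier.
frames = ["squat", "squat", "arm_raise", "squat", "overhead_press"]
print(majority_vote(frames))
```

Voting over several frames can smooth out single-frame misclassifications, which is one plausible reason an ensemble over frames would help in a real-time setting.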
37. Differentiating Epileptic and Psychogenic Non-Epileptic Seizures Using Machine Learning Analysis of EEG Plot Images.
- Author
-
Fussner, Steven, Boyne, Aidan, Han, Albert, Nakhleh, Lauren A., and Haneef, Zulfi
- Subjects
- *
PSYCHOGENIC nonepileptic seizures , *ELECTROENCEPHALOGRAPHY , *CONVOLUTIONAL neural networks , *MACHINE learning , *HEALTH facilities , *EPILEPSY - Abstract
The treatment of epilepsy, the second most common chronic neurological disorder, is often complicated by the failure of patients to respond to medication. Treatment failure with anti-seizure medications is often due to the presence of non-epileptic seizures. Distinguishing non-epileptic from epileptic seizures requires an expensive and time-consuming analysis of electroencephalograms (EEGs) recorded in an epilepsy monitoring unit. Machine learning algorithms have been used to detect seizures from EEG, typically using EEG waveform analysis. We employed an alternative approach, using a convolutional neural network (CNN) with transfer learning using MobileNetV2 to emulate the real-world visual analysis of EEG images by epileptologists. A total of 5359 EEG waveform plot images from 107 adult subjects across two epilepsy monitoring units in separate medical facilities were divided into epileptic and non-epileptic groups for training and cross-validation of the CNN. The model achieved an accuracy of 86.9% (Area Under the Curve, AUC 0.92) at the site where training data were extracted and an accuracy of 87.3% (AUC 0.94) at the other site whose data were only used for validation. This investigation demonstrates the high accuracy achievable with CNN analysis of EEG plot images and the robustness of this approach across EEG visualization software, laying the groundwork for further subclassification of seizures using similar approaches in a clinical setting. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. The Classification of VOCs Based on Sensor Images Using a Lightweight Neural Network for Lung Cancer Diagnosis.
- Author
-
Zha, Chengyuan, Li, Lei, Zhu, Fangting, and Zhao, Yanzhe
- Subjects
- *
CONVOLUTIONAL neural networks , *LUNG cancer , *IMAGE sensors , *CANCER diagnosis , *SENSOR arrays , *VOLATILE organic compounds , *ACETONE - Abstract
The application of artificial intelligence to point-of-care testing (POCT) disease detection has become a hot research field. Within it, breath detection, which detects the patient's exhaled VOCs using sensor arrays combined with convolutional neural network (CNN) algorithms, is attracting growing attention as a new lung cancer detection method. However, the low accuracy, high-complexity computation and large number of parameters make the CNN algorithms difficult to transplant to the embedded system of POCT devices. This work proposes a lightweight neural network (LTNet) to address this problem while achieving high-precision classification of acetone and ethanol gases, which are respiratory markers for lung cancer patients. Compared to currently popular lightweight CNN models, such as EfficientNet, LTNet has fewer parameters (32 K) and its training weight size is only 0.155 MB. LTNet achieved an overall classification accuracy of 99.06% and 99.14% on the authors' own mixed gas dataset and the University of California, Irvine (UCI) dataset, which are both higher than the scores of the six existing models, and it also offers the shortest training (844.38 s and 584.67 s) and inference times (23 s and 14 s) on the same validation sets. Compared to the existing CNN models, LTNet is more suitable for resource-limited POCT devices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. A New Method for Bearing Fault Diagnosis across Machines Based on Envelope Spectrum and Conditional Metric Learning.
- Author
-
Yang, Xu, Yang, Junfeng, Jin, Yupeng, and Liu, Zhongchao
- Subjects
- *
CONVOLUTIONAL neural networks , *BEARINGS (Machinery) , *FAULT diagnosis , *MARGINAL distributions , *DATA distribution - Abstract
In recent years, most research on bearing fault diagnosis has assumed that the source domain and target domain data come from the same machine. The differences in equipment lead to a decrease in diagnostic accuracy. To address this issue, unsupervised domain adaptation techniques have been introduced. However, most cross-device fault diagnosis models overlook the discriminative information under the marginal distribution, which restricts the performance of the models. In this paper, we propose a bearing fault diagnosis method based on envelope spectrum and conditional metric learning. First, envelope spectral analysis is used to extract frequency domain features. Then, to fully utilize the discriminative information from the label distribution, we construct a deep Siamese convolutional neural network based on conditional metric learning to eliminate the data distribution differences and extract common features from the source and target domain data. Finally, dynamic weighting factors are employed to improve the convergence performance of the model and optimize the training process. Experimental analysis is conducted on 12 cross-device tasks and compared with other relevant methods. The results show that the proposed method achieves the best performance on all three evaluation metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Fault Diagnosis of Hydraulic Components Based on Multi-Sensor Information Fusion Using Improved TSO-CNN-BiLSTM.
- Author
-
Zhang, Da, Zheng, Kun, Liu, Fuqi, and Li, Beili
- Subjects
- *
MULTISENSOR data fusion , *FAULT diagnosis , *RANDOM forest algorithms , *FEATURE extraction , *CONVOLUTIONAL neural networks , *PRESSURE sensors - Abstract
In order to realize the accurate and reliable fault diagnosis of hydraulic systems, a diagnostic model based on improved tuna swarm optimization (ITSO), optimized convolutional neural networks (CNNs), and bi-directional long short-term memory (BiLSTM) networks is proposed. Firstly, sensor selection is implemented using the random forest algorithm to select useful signals from six kinds of physical or virtual sensors including pressure, temperature, flow rate, vibration, motor power, and motor efficiency coefficient. After that, fused features are extracted by CNN, and then, BiLSTM is applied to learn the forward and backward information contained in the data. The ITSO algorithm is adopted to adaptively optimize the learning rate, regularization coefficient, and node number to obtain the optimal CNN-BiLSTM network. Improved Chebyshev chaotic mapping and the nonlinear reduction strategy are adopted to improve population initialization and individual position updating, further promoting the optimization effect of TSO. The experimental results show that the proposed method can automatically extract fusion features and effectively utilize multi-sensor information. The diagnostic accuracies of the plunger pump, cooler, throttle valve, and accumulator are 99.07%, 99.4%, 98.81%, and 98.51%, respectively. The diagnostic results of noisy data with 0 dB, 5 dB, and 10 dB signal-to-noise ratios (SNRs) show that the ITSO-CNN-BiLSTM model has good robustness to noise interference. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Convolutional Neural Network-Based Pattern Recognition of Partial Discharge in High-Speed Electric-Multiple-Unit Cable Termination.
- Author
-
Sun, Chuanming, Wu, Guangning, Pan, Guixiang, Zhang, Tingyu, Li, Jiali, Jiao, Shibo, Liu, Yong-Chao, Chen, Kui, Liu, Kai, Xin, Dongli, and Gao, Guoqiang
- Subjects
- *
PATTERN recognition systems , *PARTIAL discharges , *CONVOLUTIONAL neural networks , *ELECTRIC multiple units , *RADIAL basis functions - Abstract
Partial discharge detection is considered a crucial technique for evaluating insulation performance and identifying defect types in cable terminals of high-speed electric multiple units (EMUs). In this study, terminal samples exhibiting four typical defects were prepared from high-speed EMUs. A cable discharge testing system, utilizing high-frequency current sensing, was developed to collect discharge signals, and datasets corresponding to these defects were established. This study proposes the use of the convolutional neural network (CNN) for the classification of discharge signals associated with specific defects, comparing this method with two existing neural network (NN)-based classification models that employ the back-propagation NN and the radial basis function NN, respectively. The comparative results demonstrate that the CNN-based model excels in accurately identifying signals from various defect types in the cable terminals of high-speed EMUs, surpassing the two existing NN-based classification models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Real-Time Ferrogram Segmentation of Wear Debris Using Multi-Level Feature Reused Unet.
- Author
-
You, Jie, Fan, Shibo, Yu, Qinghai, Wang, Lianfu, Zhang, Zhou, and Zong, Zheying
- Subjects
- *
CONVOLUTIONAL neural networks , *FAULT diagnosis , *LUBRICATING oils , *IMAGE segmentation - Abstract
The real-time monitoring and fault diagnosis of modern machinery and equipment impose higher demands on equipment maintenance, with the extraction of morphological characteristics of wear debris in lubricating oil emerging as a critical approach for real-time monitoring of wear, holding significant importance in the field. The online visual ferrograph (OLVF) technique serves as a representative method in this study. Various semantic segmentation approaches, such as DeepLabV3+, PSPNet, Segformer, Unet, and other models, are employed to process the oil wear particle image for conducting comparative experiments. In order to accurately segment the minute wear debris in oil abrasive images and mitigate the influence of reflection and bubbles, we propose a multi-level feature reused Unet (MFR Unet) that enhances the residual link strategy of Unet for improved identification of tiny wear debris in ferrograms, leading to superior segmentation results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. Image Filtering to Improve Maize Tassel Detection Accuracy Using Machine Learning Algorithms.
- Author
-
Rodene, Eric, Fernando, Gayara Demini, Piyush, Ved, Ge, Yufeng, Schnable, James C., Ghosh, Souparno, and Yang, Jinliang
- Subjects
- *
MACHINE learning , *PLANT breeding , *CONVOLUTIONAL neural networks , *IMAGE segmentation , *AERIAL photography - Abstract
Unmanned aerial vehicle (UAV)-based imagery has become widely used to collect time-series agronomic data, which are then incorporated into plant breeding programs to enhance crop improvements. To make efficient analysis possible, in this study, by leveraging an aerial photography dataset for a field trial of 233 different inbred lines from the maize diversity panel, we developed machine learning methods for obtaining automated tassel counts at the plot level. We employed both an object-based counting-by-detection (CBD) approach and a density-based counting-by-regression (CBR) approach. Using an image segmentation method that removes most of the pixels not associated with the plant tassels, the results showed a dramatic improvement in the accuracy of object-based (CBD) detection, with the cross-validation prediction accuracy (r²) peaking at 0.7033 on a detector trained with images with a filter threshold of 90. The CBR approach showed the greatest accuracy when using unfiltered images, with a mean absolute error (MAE) of 7.99. However, when using bootstrapping, images filtered at a threshold of 90 showed a slightly better MAE (8.65) than the unfiltered images (8.90). These methods will allow for accurate estimates of flowering-related traits and help to make breeding decisions for crop improvement. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
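Entry 43 reports two evaluation metrics, r² and mean absolute error (MAE). The sketch below shows one common way to compute them for predicted versus manually counted tassels. It assumes r² means the squared Pearson correlation, which the abstract does not confirm, and the function names and sample counts are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted counts."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r_squared(actual, predicted):
    """Squared Pearson correlation between actual and predicted counts.

    Assumed definition of r^2; other definitions (e.g. the coefficient of
    determination) can give different values.
    """
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return (cov * cov) / (var_a * var_p)


# Hypothetical plot-level tassel counts: manual counts vs. model output.
manual_counts = [10, 20, 30]
model_counts = [12, 18, 33]
print(mean_absolute_error(manual_counts, model_counts))
print(pearson_r_squared(manual_counts, model_counts))
```

MAE is expressed in the same units as the counts, so an MAE of 7.99 means the predictions are off by roughly eight tassels per plot on average, whereas r² only measures how well predictions track the true counts linearly.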
44. A Global Seawater Density Distribution Model Using a Convolutional Neural Network.
- Author
-
Qin Liu, Liyan Li, Yan Zhou, Shiwen Zhang, Yuliang Liu, and Xinwei Wang
- Abstract
Seawater density is an important physical property in oceanography that affects the accuracy of calculations such as gravity fields and tidal potentials and the calibration of acoustic and optical oceanographic sensors. In related studies, constant density values are frequently used, which can introduce significant errors. Therefore, this study employs a basic convolutional neural network model to construct a comprehensive model showing the seawater density distribution across the globe. The model takes into account depth, latitude, longitude, and month as inputs. Numerous real seawater datasets were used to train the model, and it has been shown that the model has an absolute mean error and root mean square error of less than 1 kg/m³ in 99% of the test set samples. The model effectively demonstrates the influence of input parameters on the distribution of seawater density. In this paper, we present a newly developed global model for distributing seawater density which is both comprehensive and accurate, surpassing previous models. The utilization of the model presented in this paper for estimating seawater density can minimize errors in theoretical ocean models and serve as a foundation for designing and analyzing ocean exploration systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. An Open-Circuit Fault Diagnosis Method for Three-Level Neutral Point Clamped Inverters Based on Multi-Scale Shuffled Convolutional Neural Network.
- Author
-
Yan, Yan, Wu, Jiaqi, Cao, Yanfei, Liu, Bo, Li, Chen, and Shi, Tingna
- Subjects
- *
CONVOLUTIONAL neural networks , *FAULT diagnosis , *DIAGNOSIS methods - Abstract
This study constructs a power switching device open-circuit fault diagnosis model for a three-level neutral point clamped inverter based on the multi-scale shuffled convolutional neural network (MSSCNN) and extracts and classifies the fault information contained in the output current of inverters. The model employs depthwise separable convolution and channel shuffle techniques to improve diagnostic accuracy and reduce model complexity. The experimental results show that the new model has lower model complexity, better noise resistance and higher average diagnostic accuracy compared with fault diagnosis models based on CNN, ResNet, ShuffleNet V2 and MobileNet V3 networks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
46. COVID-19 Infection Percentage Estimation from Computed Tomography Scans: Results and Insights from the International Per-COVID-19 Challenge.
- Author
-
Bougourzi, Fares, Distante, Cosimo, Dornaika, Fadi, Taleb-Ahmed, Abdelmalik, Hadid, Abdenour, Chaudhary, Suman, Yang, Wanting, Qiang, Yan, Anwar, Talha, Breaban, Mihaela Elena, Hsu, Chih-Chung, Tai, Shen-Chieh, Chen, Shao-Ning, Tricarico, Davide, Chaudhry, Hafiza Ayesha Hoor, Fiandrotti, Attilio, Grangetto, Marco, Spatafora, Maria Ausilia Napoli, Ortis, Alessandro, and Battiato, Sebastiano
- Subjects
- *
COVID-19 , *COMPUTED tomography , *COVID-19 pandemic , *CONVOLUTIONAL neural networks , *DEEP learning - Abstract
COVID-19 analysis from medical imaging is an important task that has been intensively studied in recent years due to the spread of the COVID-19 pandemic. In fact, medical imaging has often been used as a complementary or main tool to recognize infected persons. In addition, medical imaging can provide more details about COVID-19 infection, including its severity and spread, which makes it possible to evaluate the infection and follow up on the patient's state. CT scans are the most informative tool for COVID-19 infection, where the evaluation of COVID-19 infection is usually performed through infection segmentation. However, segmentation is a tedious task that requires much effort and time from expert radiologists. To deal with this limitation, an efficient framework for estimating COVID-19 infection as a regression task is proposed. The goal of the Per-COVID-19 challenge is to test the efficiency of modern deep learning methods on COVID-19 infection percentage estimation (CIPE) from CT scans. Participants had to develop an efficient deep learning approach that can learn from noisy data. In addition, participants had to cope with many challenges, including those related to COVID-19 infection complexity and cross-dataset scenarios. This paper provides an overview of the COVID-19 infection percentage estimation challenge (Per-COVID-19) held at MIA-COVID-2022. Details of the competition data, challenges, and evaluation metrics are presented. The best performing approaches and their results are described and discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Mask-Pyramid Network: A Novel Panoptic Segmentation Method.
- Author
-
Xian, Peng-Fei, Po, Lai-Man, Xiong, Jing-Jing, Zhao, Yu-Zhi, Yu, Wing-Yin, and Cheung, Kwok-Wai
- Subjects
- *
CONVOLUTIONAL neural networks , *LOGITS , *PIXELS - Abstract
In this paper, we introduce a novel panoptic segmentation method called the Mask-Pyramid Network. Existing Mask RCNN-based methods first generate a large number of box proposals and then filter them at each feature level, which requires a lot of computational resources, while most of the box proposals are suppressed and discarded in the Non-Maximum Suppression process. Additionally, for panoptic segmentation, it is a problem to properly fuse the semantic segmentation results with the Mask RCNN-produced instance segmentation results. To address these issues, we propose a new mask pyramid mechanism to distinguish objects and generate far fewer proposals by referring to existing segmented masks, so as to reduce computing resource consumption. The Mask-Pyramid Network generates object proposals and predicts masks from larger to smaller sizes. It records the pixel area occupied by the larger object masks, and then only generates proposals on the unoccupied areas. Each object mask is represented as an H × W × 1 logit, which fits well in format with the semantic segmentation logits. By applying SoftMax to the concatenated semantic and instance segmentation logits, it is easy and natural to fuse both segmentation results. We empirically demonstrate that the proposed Mask-Pyramid Network achieves comparable accuracy performance on the Cityscapes and COCO datasets. Furthermore, we demonstrate the computational efficiency of the proposed method and obtain competitive results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
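The fusion step described in entry 47 (applying SoftMax to concatenated semantic and instance logits) can be sketched roughly as follows. This is a pure-Python illustration on a tiny image, not the authors' implementation; the function names, the nested-list layout (one H × W map per channel), and the sample logit values are assumptions made for the example.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of logits."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_logits(semantic, instances):
    """Fuse semantic and instance logits per pixel.

    semantic:  list of C maps, each H x W (one per semantic class).
    instances: list of N maps, each H x W (one per object mask).
    Returns an H x W map of winning channel indices; indices >= C
    refer to instance masks, in the order they were passed in.
    """
    channels = semantic + instances  # concatenate along the channel axis
    height, width = len(channels[0]), len(channels[0][0])
    fused = []
    for y in range(height):
        row = []
        for x in range(width):
            probs = softmax([ch[y][x] for ch in channels])
            row.append(max(range(len(probs)), key=probs.__getitem__))
        fused.append(row)
    return fused


# Hypothetical 1 x 2 image with two semantic classes and one instance mask.
semantic_logits = [[[2.0, 0.1]], [[0.5, 0.3]]]
instance_logits = [[[-1.0, 3.0]]]
print(fuse_logits(semantic_logits, instance_logits))  # → [[0, 2]]
```

Because every mask shares the same H × W × 1 format as a semantic channel, no separate merging heuristic is needed in this sketch: each pixel simply goes to whichever channel has the highest probability.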
48. Benchmarking Artificial Neural Network Architectures for High-Performance Spiking Neural Networks.
- Author
-
Islam, Riadul, Majurski, Patrick, Kwon, Jun, Sharma, Anurag, and Tummala, Sri Ranga Sai Krishna
- Subjects
- *
ARTIFICIAL neural networks , *CONVOLUTIONAL neural networks , *GRAPH algorithms , *COMPUTER systems , *ENERGY consumption , *PARALLEL algorithms , *BIOLOGICALLY inspired computing - Abstract
Organizations managing high-performance computing systems face a multitude of challenges, including overarching concerns such as overall energy consumption, microprocessor clock frequency limitations, and the escalating costs associated with chip production. Evidently, processor speeds have plateaued over the last decade, persisting within the range of 2 GHz to 5 GHz. Scholars assert that brain-inspired computing holds substantial promise for mitigating these challenges. The spiking neural network (SNN) particularly stands out for its commendable power efficiency when juxtaposed with conventional design paradigms. Nevertheless, our scrutiny has brought to light several pivotal challenges impeding the seamless implementation of large-scale neural networks (NNs) on silicon. These challenges encompass the absence of automated tools, the need for multifaceted domain expertise, and the inadequacy of existing algorithms to efficiently partition and place extensive SNN computations onto hardware infrastructure. In this paper, we posit the development of an automated tool flow capable of transmuting any NN into an SNN. This undertaking involves the creation of a novel graph-partitioning algorithm designed to strategically place SNNs on a network-on-chip (NoC), thereby paving the way for future energy-efficient and high-performance computing paradigms. The presented methodology showcases its effectiveness by successfully transforming ANN architectures into SNNs with a marginal average error penalty of merely 2.65%. The proposed graph-partitioning algorithm enables a 14.22% decrease in inter-synaptic communication and an 87.58% reduction in intra-synaptic communication, on average, underscoring the effectiveness of the proposed algorithm in optimizing NN communication pathways. Compared to a baseline graph-partitioning algorithm, the proposed approach exhibits an average decrease of 79.74% in latency and a 14.67% reduction in energy consumption. Using existing NoC tools, the energy-latency product of SNN architectures is, on average, 82.71% lower than that of the baseline architectures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
49. Characterization of Partial Discharges in Dielectric Oils Using High-Resolution CMOS Image Sensor and Convolutional Neural Networks.
- Author
-
Monzón-Verona, José Miguel, González-Domínguez, Pablo, and García-Alonso, Santiago
- Subjects
- *
CMOS image sensors , *PARTIAL discharges , *CONVOLUTIONAL neural networks , *PARTIAL discharge measurement , *IMAGE recognition (Computer vision) - Abstract
In this work, an exhaustive analysis of the partial discharges that originate in the bubbles present in dielectric mineral oils is carried out. To achieve this, a low-cost, high-resolution CMOS image sensor is used. Partial discharge measurements using that image sensor are validated by a standard electrical detection system that uses a discharge capacitor. In order to accurately identify the images corresponding to partial discharges, a convolutional neural network is trained using a large set of images captured by the image sensor. An image classification model is also developed using deep learning with a convolutional network based on a TensorFlow and Keras model. The classification results of the experiments show that the accuracy achieved by our model is around 95% on the validation set and 82% on the test set. As a result of this work, a non-destructive diagnosis method has been developed that is based on the use of an image sensor and the design of a convolutional neural network. This approach allows us to obtain information about the state of mineral oils before breakdown occurs, providing a valuable tool for the evaluation and maintenance of these dielectric oils. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
50. Quantitative Detection of Pipeline Cracks Based on Ultrasonic Guided Waves and Convolutional Neural Network.
- Author
-
Shen, Yuchi, Wu, Jing, Chen, Junfeng, Zhang, Weiwei, Yang, Xiaolin, and Ma, Hongwei
- Subjects
- *
CONVOLUTIONAL neural networks , *MULTILAYER perceptrons , *ULTRASONIC waves , *SIGNAL detection - Abstract
In this study, a quantitative detection method of pipeline cracks based on a one-dimensional convolutional neural network (1D-CNN) was developed using the time-domain signal of ultrasonic guided waves and the crack size of the pipeline as the input and output, respectively. Pipeline ultrasonic guided wave detection signals under different crack defect conditions were obtained via numerical simulations and experiments, and these signals were input as features into a multi-layer perceptron and one-dimensional convolutional neural network (1D-CNN) for training. The results revealed that the 1D-CNN performed better in the quantitative analysis of pipeline crack defects, with an error of less than 2% in the simulated and experimental data, and it could effectively evaluate the size of crack defects from the echo signals under different frequency excitations. Thus, by combining the ultrasonic guided wave detection technology and CNN, a quantitative analysis of pipeline crack defects can be effectively realized. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF