1,584 results
Search Results
2. Amur Tiger Individual Identification Based on the Improved InceptionResNetV2.
- Author
- Wu, Ling, Jinma, Yongyi, Wang, Xinyang, Yang, Feng, Xu, Fu, Cui, Xiaohui, and Sun, Qiao
- Subjects
- ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, OBJECT recognition (Computer vision), RECOGNITION (Psychology), TIGERS
- Abstract
Simple Summary: Accurate identification of individual Amur tigers is vital for their conservation, as it helps us understand their population and distribution. Existing identification methods often fall short in accuracy, and our study focuses on creating a more accurate method for identifying individual Amur tigers using advanced deep learning techniques. We improved an existing neural network model called InceptionResNetV2 by adding features like dropout layers and dual-attention mechanisms to better capture the unique stripe patterns of each tiger and reduce errors during training. We tested our model on a large dataset of tiger images and found it to be highly effective, achieving an average recognition accuracy of over 95% for different body parts, with left stripes reaching the highest 99.37%. This method significantly outperforms previous models and provides a reliable tool for wildlife researchers and conservationists to monitor and protect Amur tigers. By improving the ability to track individual tigers, our research offers practical benefits for preserving this endangered species and enhancing wildlife management practices globally.
Accurate and intelligent identification of rare and endangered individuals of flagship wildlife species, such as Amur tiger (Panthera tigris altaica), is crucial for understanding population structure and distribution, thereby facilitating targeted conservation measures. However, many mathematical modeling methods, including deep learning models, often yield unsatisfactory results. This paper proposes an individual recognition method for Amur tigers based on an improved InceptionResNetV2 model. Initially, the YOLOv5 model is employed to automatically detect and segment facial, left stripe, and right stripe areas from images of 107 individual Amur tigers, achieving a high average classification accuracy of 97.3%. By introducing a dropout layer and a dual-attention mechanism, we enhance the InceptionResNetV2 model to better capture the stripe features of individual tigers at various granularities and reduce overfitting during training. Experimental results demonstrate that our model outperforms other classic models, offering optimal recognition accuracy and ideal loss changes. The average recognition accuracy for different body part features is 95.36%, with left stripes achieving a peak accuracy of 99.37%. These results highlight the model's excellent recognition capabilities. Our research provides a valuable and practical approach to the individual identification of rare and endangered animals, offering significant potential for improving conservation efforts. [ABSTRACT FROM AUTHOR]
- Published
- 2024
3. Special Issue: Design and Control of a Bio-Inspired Robot.
- Author
- Zhao, Mingguo and Hu, Biao
- Subjects
- ROBOT control systems, ARTIFICIAL neural networks, BIOENGINEERING, CONVOLUTIONAL neural networks, BIOLOGICALLY inspired computing, BIOMIMETICS
- Abstract
This document is a special issue of the journal Biomimetics, focusing on the design and control of bio-inspired robots. It explores various aspects of bionics in robotics, including robot design, perception, control, and decision-making, as well as incorporating neuroscience and brain science. The issue covers a wide range of topics, such as stiffness adjustment for continuum robots, biomimetic motor control, stroke rehabilitation, reinforcement learning for quadruped robots, improved spiking neural networks, energy-efficient image segmentation, kinematics analysis, synthetic nervous systems for robotic control, online running-gait generation, and bio-inspired perception and navigation for service robots. The document also discusses specific papers within the special issue that address challenges in robotic perception and navigation, legged robot control, and motion control of continuum robots and robotic arms. It concludes by announcing plans for a second special issue on related topics. [Extracted from the article]
- Published
- 2024
4. Special Issue on "Process Monitoring and Fault Diagnosis".
- Author
- Ji, Cheng and Sun, Wei
- Subjects
- ARTIFICIAL neural networks, REMAINING useful life, CONVOLUTIONAL neural networks, PATTERN recognition systems, TRANSFORMER models, DEEP learning, STATISTICAL learning, WATER pipelines
- Abstract
This document is a summary of a special issue of the journal Processes titled "Process Monitoring and Fault Diagnosis." The issue explores the application of data analytic techniques to enhance stable operation and safety in chemical processes and related industries. The collection of research papers covers various topics, including process fault detection, bearing fault diagnosis, remaining useful life prediction, and more. The papers introduce cutting-edge methodologies and demonstrate their reliability through validation. The issue aims to foster communication and the development of advanced process monitoring techniques. [Extracted from the article]
- Published
- 2024
5. Maize Leaf Disease Recognition Based on Improved Convolutional Neural Network ShuffleNetV2.
- Author
- Zhou, Hanmi, Su, Yumin, Chen, Jiageng, Li, Jichen, Ma, Linshuang, Liu, Xingyi, Lu, Sibo, and Wu, Qi
- Subjects
- CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, CORN diseases, CORN, PRECISION farming, AGRICULTURAL development
- Abstract
The occurrence of maize diseases is frequent but challenging to manage. Traditional identification methods have low accuracy and complex model structures with numerous parameters, making them difficult to implement on mobile devices. To address these challenges, this paper proposes a corn leaf disease recognition model SNMPF based on convolutional neural network ShuffleNetV2. In the down-sampling module of the ShuffleNet model, the max pooling layer replaces the deep convolutional layer to perform down-sampling. This improvement helps to extract key features from images, reduce the overfitting of the model, and improve the model's generalization ability. In addition, to enhance the model's ability to express features in complex backgrounds, the Sim AM attention mechanism was introduced. This mechanism enables the model to adaptively adjust focus and pay more attention to local discriminative features. The results on a maize disease image dataset demonstrate that the SNMPF model achieves a recognition accuracy of 98.40%, representing a 4.1 percentage point improvement over the original model, while its size is only 1.56 MB. Compared with existing convolutional neural network models such as EfficientNet, MobileViT, EfficientNetV2, RegNet, and DenseNet, this model offers higher accuracy and a more compact size. As a result, it can automatically detect and classify maize leaf diseases under natural field conditions, boasting high-precision recognition capabilities. Its accurate identification results provide scientific guidance for preventing corn leaf disease and promote the development of precision agriculture. [ABSTRACT FROM AUTHOR]
- Published
- 2024
6. Online Signature Biometrics for Mobile Devices.
- Author
- Roszczewska, Katarzyna and Niewiadomska-Szynkiewicz, Ewa
- Subjects
- ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, HANDWRITING recognition (Computer science), DATABASES, BIOMETRY, BIOMETRIC identification
- Abstract
This paper addresses issues concerning biometric authentication based on handwritten signatures. Our research aimed to check whether a handwritten signature acquired with a mobile device can effectively verify a user's identity. We present a novel online signature verification method using coordinates of points and pressure values at each point collected with a mobile device. Convolutional neural networks are used for signature verification. In this paper, three neural network models are investigated, i.e., two self-made light SigNet and SigNetExt models and the VGG-16 model commonly used in image processing. The convolutional neural networks aim to determine whether the acquired signature sample matches the class declared by the signer. Thus, the scenario of closed set verification is performed. The effectiveness of our method was tested on signatures acquired with mobile phones. We used the subset of the multimodal database, MobiBits, that was captured using a custom-made application and consists of samples acquired from 53 people of diverse ages. The experimental results on accurate data demonstrate that developed architectures of deep neural networks can be successfully used for online handwritten signature verification. We achieved an equal error rate (EER) of 0.63% for random forgeries and 6.66% for skilled forgeries. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Editorial for the Special Issue "Machine Learning in Computer Vision and Image Sensing: Theory and Applications".
- Author
- Chakraborty, Subrata and Pradhan, Biswajeet
- Subjects
- COMPUTER vision, MACHINE learning, ARTIFICIAL neural networks, DEEP learning, CONVOLUTIONAL neural networks, SIGNAL processing, GAIT in humans
- Abstract
This document is an editorial for a special issue titled "Machine Learning in Computer Vision and Image Sensing: Theory and Applications." The editorial highlights the diverse applications of machine learning (ML) models in various domains such as medical imaging, signal processing, remote sensing, and human activity detection. The special issue includes 11 papers that cover topics such as image segmentation, fluvial navigation, Alzheimer's disease classification, pneumothorax detection, lung cancer malignancy prediction, amniotic fluid volume detection, COVID-19 detection, and Parkinson's disease detection. The papers showcase the progress and potential of ML models in computer vision applications and provide valuable insights for future research. [Extracted from the article]
- Published
- 2024
8. Special Issue: Machine Learning and Data Analysis.
- Author
- Michalak, Marcin
- Subjects
- DEEP learning, CONVOLUTIONAL neural networks, PATTERN recognition systems, DATA analysis, ARTIFICIAL neural networks, CREDIT card fraud, MACHINE learning
- Abstract
This Special Issue contains 2 reviews and 17 research papers related to the following topics: Time series forecasting [1], [3], [5]; Image analysis [6]; Medical applications [7]; Knowledge graph analysis [9]; Cybersecurity [11], [13]; Traffic analysis [14]; Agriculture [16]; Environmental data analysis [17]. In [2], a time series analysis was applied in a different manner: their prediction of the high stock dividend (HSD) was based on a sequence of typical machine learning approaches instead of state-of-the-art methods such as ARIMA or SMA. The authors of [1] focused on short time series forecasting in the domain of crime data (thefts, shoplifting, vehicular crimes, and burglaries in Mexico). In this paper, the authors attempt to answer the questions "what is air traffic complexity?" and "which air traffic data variables have greater impacts on increases in complexity?" [Extracted from the article]
- Published
- 2023
9. Performance Improvement of Speech Emotion Recognition Systems by Combining 1D CNN and LSTM with Data Augmentation.
- Author
- Pan, Shing-Tai and Wu, Han-Jui
- Subjects
- AUTOMATIC speech recognition, DATA augmentation, ARTIFICIAL neural networks, MACHINE learning, EMOTION recognition, CONVOLUTIONAL neural networks
- Abstract
In recent years, the increasing popularity of smart mobile devices has made the interaction between devices and users, particularly through voice interaction, more crucial. By enabling smart devices to better understand users' emotional states through voice data, it becomes possible to provide more personalized services. This paper proposes a novel machine learning model for speech emotion recognition called CLDNN, which combines convolutional neural networks (CNN), long short-term memory neural networks (LSTM), and deep neural networks (DNN). To design a system that closely resembles the human auditory system in recognizing audio signals, this article uses the Mel-frequency cepstral coefficients (MFCCs) of audio data as the input of the machine learning model. First, the MFCCs of the voice signal are extracted as the input of the model. Local feature learning blocks (LFLBs) composed of one-dimensional CNNs are employed to calculate the feature values of the data. As audio signals are time-series data, the resulting feature values from LFLBs are then fed into the LSTM layer to enhance learning on the time-series level. Finally, fully connected layers are used for classification and prediction. The experimental evaluation of the proposed model utilizes three databases: RAVDESS, EMO-DB, and IEMOCAP. The results demonstrate that the LSTM model effectively models the features extracted from the 1D CNN due to the time-series characteristics of speech signals. Additionally, the data augmentation method applied in this paper proves beneficial in improving the recognition accuracy and stability of the systems for different databases. Furthermore, according to the experimental results, the proposed system achieves superior recognition rates compared to related research in speech emotion recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2023
10. A Convolutional Neural Network with Hyperparameter Tuning for Packet Payload-Based Network Intrusion Detection.
- Author
- Boulaiche, Ammar, Haddad, Sofiane, and Lemouari, Ali
- Subjects
- ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, PATTERN recognition systems, COMPUTER network traffic, METAHEURISTIC algorithms, INTRUSION detection systems (Computer security)
- Abstract
In the last few years, the use of convolutional neural networks (CNNs) in intrusion detection domains has attracted more and more attention. However, their results in this domain have not lived up to expectations compared to the results obtained in other domains, such as image classification and video analysis. This is mainly due to the datasets used, which contain preprocessed features that are not compatible with convolutional neural networks, as they do not allow a full exploit of all the information embedded in the original network traffic. With the aim of overcoming these issues, we propose in this paper a new efficient convolutional neural network model for network intrusion detection based on raw traffic data (pcap files) rather than preprocessed data stored in CSV files. The novelty of this paper lies in the proposal of a new method for adapting the raw network traffic data to the most suitable format for CNN models, which allows us to fully exploit the strengths of CNNs in terms of pattern recognition and spatial analysis, leading to more accurate and effective results. Additionally, to further improve its detection performance, the structure and hyperparameters of our proposed CNN-based model are automatically adjusted using the self-adaptive differential evolution (SADE) metaheuristic, in which symmetry plays an essential role in balancing the different phases of the algorithm, so that each phase can contribute in an equal and efficient way to finding optimal solutions. This helps to make the overall performance more robust and efficient when solving optimization problems. The experimental results on three datasets, KDD-99, UNSW-NB15, and CIC-IDS2017, show a strong symmetry between the frequency values implemented in the images built for each network traffic and the different attack classes. This was confirmed by a good predictive accuracy that goes well beyond similar competing models in the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. FN-GNN: A Novel Graph Embedding Approach for Enhancing Graph Neural Networks in Network Intrusion Detection Systems.
- Author
- Tran, Dinh-Hau and Park, Minho
- Subjects
- ARTIFICIAL neural networks, GRAPH neural networks, RECURRENT neural networks, CONVOLUTIONAL neural networks, DEEP learning, INTRUSION detection systems (Computer security)
- Abstract
With the proliferation of the Internet, network complexities for both commercial and state organizations have significantly increased, leading to more sophisticated and harder-to-detect network attacks. This evolution poses substantial challenges for intrusion detection systems, threatening the cybersecurity of organizations and national infrastructure alike. Although numerous deep learning techniques such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and graph neural networks (GNNs) have been applied to detect various network attacks, they face limitations due to the lack of standardized input data, affecting model accuracy and performance. This paper proposes a novel preprocessing method for flow data from network intrusion detection systems (NIDSs), enhancing the efficacy of a graph neural network model in malicious flow detection. Our approach initializes graph nodes with data derived from flow features and constructs graph edges through the analysis of IP relationships within the system. Additionally, we propose a new graph model based on the combination of the graph neural network (GCN) model and SAGEConv, a variant of the GraphSAGE model. The proposed model leverages the strengths while addressing the limitations encountered by the previous models. Evaluations on two IDS datasets, CICIDS-2017 and UNSW-NB15, demonstrate that our model outperforms existing methods, offering a significant advancement in the detection of network threats. This work not only addresses a critical gap in the standardization of input data for deep learning models in cybersecurity but also proposes a scalable solution for improving the intrusion detection accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
12. Human Recognition: The Utilization of Face, Voice, Name and Interactions—An Extended Editorial.
- Author
- Gainotti, Guido
- Subjects
- FACE perception, FACIAL expression & emotions (Psychology), VOICE disorders, CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, AUTOMATIC speech recognition, FUSIFORM gyrus
- Abstract
This document is an editorial that summarizes several research papers on the topic of human recognition and the interactions between different channels of recognition, such as voice and face processing. The papers explore various aspects of this topic, including voice processing in challenging listening conditions, the role of familiarity in voice and face recognition, and the functional reorganization of neural systems in blind individuals. The papers highlight the importance of familiarity factors in recognition, the role of brain asymmetries, and the early integration of voice and face processing. The findings suggest that voice and face recognition are closely linked and influenced by familiarity, and that the right temporal lobe may play a significant role in vocal processing abilities in blind individuals. [Extracted from the article]
- Published
- 2024
13. Application of Artificial Intelligence in Hydraulic Engineering.
- Author
- Ma, Chunhui, Cheng, Lin, and Yang, Jie
- Subjects
- ARTIFICIAL intelligence, HYDRAULIC engineering, ARTIFICIAL neural networks, NETWORK governance, WATER conservation projects, CONVOLUTIONAL neural networks, CONCRETE dams, SOIL permeability
- Abstract
This document provides a summary of a special issue on the application of artificial intelligence (AI) in hydraulic engineering. The papers in this issue cover a range of topics including safety monitoring, optimization algorithms, nondestructive sensors, numerical simulations, and experiments. The research presented in these papers is significant for analyzing and managing the safety of major structures in reservoir dams. The authors hope that these papers will be of interest to researchers, designers, and practitioners in the field and will inspire further research. [Extracted from the article]
- Published
- 2024
14. SE-VisionTransformer: Hybrid Network for Diagnosing Sugarcane Leaf Diseases Based on Attention Mechanism.
- Author
- Sun, Cuimin, Zhou, Xingzhi, Zhang, Menghua, and Qin, An
- Subjects
- LEAF anatomy, ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, SUGAR plantations, SUGARCANE, SUPPORT vector machines, HEBBIAN memory
- Abstract
Sugarcane is an important raw material for sugar and chemical production. However, in recent years, various sugarcane diseases have emerged, severely impacting the national economy. To address the issue of identifying diseases in sugarcane leaf sections, this paper proposes the SE-VIT hybrid network. Unlike traditional methods that directly use models for classification, this paper compares threshold, K-means, and support vector machine (SVM) algorithms for extracting leaf lesions from images. Due to SVM's ability to accurately segment these lesions, it is ultimately selected for the task. The paper introduces the SE attention module into ResNet-18 (CNN), enhancing the learning of inter-channel weights. After the pooling layer, multi-head self-attention (MHSA) is incorporated. Finally, with the inclusion of 2D relative positional encoding, the accuracy is improved by 5.1%, precision by 3.23%, and recall by 5.17%. The SE-VIT hybrid network model achieves an accuracy of 97.26% on the PlantVillage dataset. Additionally, when compared to four existing classical neural network models, SE-VIT demonstrates significantly higher accuracy and precision, reaching 89.57% accuracy. Therefore, the method proposed in this paper can provide technical support for intelligent management of sugarcane plantations and offer insights for addressing plant diseases with limited datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
15. Detecting Selected Instruments in the Sound Signal.
- Author
- Kostrzewa, Daniel, Szwajnoch, Paweł, Brzeski, Robert, and Mrozek, Dariusz
- Subjects
- CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, SOUND recording executives & producers, INFORMATION retrieval, DATABASES
- Abstract
Detecting instruments in a music signal is often used in database indexing, song annotation, and creating applications for musicians and music producers. Therefore, effective methods that automatically solve this issue need to be created. In this paper, the mentioned task is solved using mel-frequency cepstral coefficients (MFCC) and various architectures of artificial neural networks. The authors' contribution to the development of automatic instrument detection covers the methods used, particularly the neural network architectures and the voting committees created. All these methods were evaluated, and the results are presented and discussed in the paper. The proposed automatic instrument detection methods show that the best classification quality was obtained for an extensive model, which is the so-called committee of voting classifiers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Multi-Directional Long-Term Recurrent Convolutional Network for Road Situation Recognition.
- Author
- Dofitas Jr., Cyreneo, Gil, Joon-Min, and Byun, Yung-Cheol
- Subjects
- ARTIFICIAL neural networks, RECURRENT neural networks, PEDESTRIANS, ROAD safety measures, CONVOLUTIONAL neural networks, DEEP learning
- Abstract
Understanding road conditions is essential for implementing effective road safety measures and driving solutions. Road situations encompass the day-to-day conditions of roads, including the presence of vehicles and pedestrians. Surveillance cameras strategically placed along streets have been instrumental in monitoring road situations and providing valuable information on pedestrians, moving vehicles, and objects within road environments. However, these video data and information are stored in large volumes, making analysis tedious and time-consuming. Deep learning models are increasingly utilized to monitor vehicles and identify and evaluate road and driving comfort situations. However, the current neural network model requires the recognition of situations using time-series video data. In this paper, we introduced a multi-directional detection model for road situations to uphold high accuracy. Deep learning methods often integrate long short-term memory (LSTM) into long-term recurrent network architectures. This approach effectively combines recurrent neural networks to capture temporal dependencies and convolutional neural networks (CNNs) to extract features from extensive video data. In our proposed method, we form a multi-directional long-term recurrent convolutional network approach with two groups equipped with CNN and two layers of LSTM. Additionally, we compare road situation recognition using convolutional neural networks, long short-term networks, and long-term recurrent convolutional networks. The paper presents a method for detecting and recognizing multi-directional road contexts using a modified LRCN. After balancing the dataset through data augmentation, the number of video files increased, resulting in our model achieving 91% accuracy, a significant improvement from the original dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. A Deep-Learning-Based Method for Spectrum Sensing with Multiple Feature Combination.
- Author
- Zhang, Yixuan and Luo, Zhongqiang
- Subjects
- ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, DEEP learning, COGNITIVE radio, STATISTICAL learning, RADIO networks
- Abstract
Cognitive radio networks enable the detection and opportunistic access to an idle spectrum through spectrum-sensing technologies, thus providing services to secondary users. However, at a low signal-to-noise ratio (SNR), existing spectrum-sensing methods, such as energy statistics and cyclostationary detection, tend to fail or become overly complex, limiting their sensing accuracy in complex application scenarios. In recent years, the integration of deep learning with wireless communications has shown significant potential. Utilizing neural networks to learn the statistical characteristics of signals can effectively adapt to the changing communication environment. To enhance spectrum-sensing performance under low-SNR conditions, this paper proposes a deep-learning-based spectrum-sensing method that combines multiple signal features, including energy statistics, power spectrum, cyclostationarity, and I/Q components. The proposed method used these combined features to form a specific matrix, which was then efficiently learned and detected through the designed 'SenseNet' network. Experimental results showed that at an SNR of −20 dB, the SenseNet model achieved a 58.8% spectrum-sensing accuracy, which is a 3.3% improvement over the existing convolutional neural network model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. Reduced-Order Modeling of Steady and Unsteady Flows with Deep Neural Networks.
- Author
- Barraza, Bryan and Gross, Andreas
- Subjects
- ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, MACHINE learning, UNSTEADY flow, BOUNDARY layer (Aerodynamics)
- Abstract
Large-eddy and direct numerical simulations generate vast data sets that are challenging to interpret, even for simple geometries at low Reynolds numbers. This has increased the importance of automatic methods for extracting significant features to understand physical phenomena. Traditional techniques like the proper orthogonal decomposition (POD) have been widely used for this purpose. However, recent advancements in computational power have allowed for the development of data-driven modal reduction approaches. This paper discusses four applications of deep neural networks for aerodynamic applications, including a convolutional neural network autoencoder, to analyze unsteady flow fields around a circular cylinder at Re = 100 and a supersonic boundary layer with Tollmien–Schlichting waves. The autoencoder results are comparable to those obtained with POD and spectral POD. Additionally, it is demonstrated that the autoencoder can compress steady hypersonic boundary-layer profiles into a low-dimensional vector space that is spanned by the pressure gradient and wall-temperature ratio. This paper also proposes a convolutional neural network model to estimate velocity and temperature profiles across different hypersonic flow conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
19. Fault Diagnosis of Hydropower Units Based on Gramian Angular Summation Field and Parallel CNN.
- Author
- Li, Xiang, Zhang, Jianbo, Xiao, Boyi, Zeng, Yun, Lv, Shunli, Qian, Jing, and Du, Zhaorui
- Subjects
- CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, FAULT diagnosis, SUPPORT vector machines, FEATURE extraction
- Abstract
To enhance the operational efficiency and fault detection accuracy of hydroelectric units, this paper proposes a parallel convolutional neural network model that integrates the Gramian angular summation field (GASF) with an Improved coati optimization algorithm–parallel convolutional neural network (ICOA-PCNN). Additionally, to further improve the model's accuracy in fault identification, a multi-head self-attention mechanism (MSA) and support vector machine (SVM) are introduced for a secondary optimization of the model. Initially, the GASF technique converts one-dimensional time series signals into two-dimensional images, and a COA-CNN dual-branch model is established for feature extraction. To address the issues of uneven population distribution and susceptibility to local optima in the COA algorithm, various optimization strategies are implemented to improve its global search capability. Experimental results indicate that the accuracy of this model reaches 100%, significantly surpassing other nonoptimized models. This research provides a valuable addition to fault diagnosis technology for modern hydroelectric units. [ABSTRACT FROM AUTHOR]
- Published
- 2024
20. Non-Invasive Prediction of Choledocholithiasis Using 1D Convolutional Neural Networks and Clinical Data.
- Author
- Mena-Camilo, Enrique, Salazar-Colores, Sebastián, Aceves-Fernández, Marco Antonio, Lozada-Hernández, Edgard Efrén, and Ramos-Arreguín, Juan Manuel
- Subjects
- ENDOSCOPIC retrograde cholangiopancreatography, CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, FISHER discriminant analysis, GALLSTONES, MACHINE learning
- Abstract
This paper introduces a novel one-dimensional convolutional neural network that utilizes clinical data to accurately detect choledocholithiasis, where gallstones obstruct the common bile duct. Swift and precise detection of this condition is critical to preventing severe complications, such as biliary colic, jaundice, and pancreatitis. This cutting-edge model was rigorously compared with other machine learning methods commonly used in similar problems, such as logistic regression, linear discriminant analysis, and a state-of-the-art random forest, using a dataset derived from endoscopic retrograde cholangiopancreatography scans performed at Olive View–University of California, Los Angeles Medical Center. The one-dimensional convolutional neural network model demonstrated exceptional performance, achieving 90.77% accuracy and 92.86% specificity, with an area under the curve of 0.9270. While the paper acknowledges potential areas for improvement, it emphasizes the effectiveness of the one-dimensional convolutional neural network architecture. The results suggest that this one-dimensional convolutional neural network approach could serve as a plausible alternative to endoscopic retrograde cholangiopancreatography, considering its disadvantages, such as the need for specialized equipment and skilled personnel and the risk of postoperative complications. The potential of the one-dimensional convolutional neural network model to significantly advance the clinical diagnosis of this gallstone-related condition is notable, offering a less invasive, potentially safer, and more accessible alternative. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. A Bio-Inspired Retinal Model as a Prefiltering Step Applied to Letter and Number Recognition on Chilean Vehicle License Plates.
- Author
-
Kern, John, Urrea, Claudio, Cubillos, Francisco, and Navarrete, Ricardo
- Subjects
AUTOMOBILE license plates ,ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,BIOLOGICALLY inspired computing ,OPTICAL character recognition ,PATTERN recognition systems ,ERROR rates - Abstract
This paper presents a novel use of a bio-inspired retina model as a scene preprocessing stage for the recognition of letters and numbers on Chilean vehicle license plates. The goal is to improve the effectiveness and ease of pattern recognition. Inspired by the responses of mammalian retinas, this retinal model reproduces both the natural adjustment of contrast and the enhancement of object contours by parvocellular cells. Among other contributions, this paper provides an in-depth exploration of the architecture, advantages, and limitations of the model; investigates the tuning parameters of the model; and evaluates its performance when integrating a convolutional neural network and a spiking neural network into an optical character recognition (OCR) algorithm, using 40 different genuine license plate images as a case study and for testing. The results obtained demonstrate the reduction of error rates in character recognition based on convolutional neural networks (CNNs), spiking neural networks (SNNs), and OCR. It is concluded that this bio-inspired retina model offers a wide spectrum of potential applications to further explore, including motion detection, pattern recognition, and improvement of dynamic range in images, among others. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. INT-FUP: Intuitionistic Fuzzy Pooling.
- Author
-
Rajafillah, Chaymae, El Moutaouakil, Karim, Patriciu, Alina-Mihaela, Yahyaouy, Ali, and Riffi, Jamal
- Subjects
ARTIFICIAL neural networks ,IMAGE recognition (Computer vision) ,CONVOLUTIONAL neural networks ,FUZZY sets ,SET theory - Abstract
Convolutional Neural Networks (CNNs) are a kind of artificial neural network designed to extract features and discover patterns for tasks such as segmentation, object recognition, and classification. Within a CNN architecture, pooling operations are used to reduce the number of parameters and the computational complexity. Numerous papers have focused on investigating the impact of pooling on the performance of Convolutional Neural Networks (CNNs), leading to the development of various pooling models. Recently, a fuzzy pooling operation based on type-1 fuzzy sets was introduced to cope with the local imprecision of the feature maps. However, in fuzzy set theory, it is not always accurate to assume that the degree of non-membership of an element in a fuzzy set is simply the complement of the degree of membership. This is due to the potential existence of a hesitation degree, which implies a certain level of uncertainty. To overcome this limitation, intuitionistic fuzzy sets (IFS) were introduced to incorporate the concept of a degree of hesitation. In this paper, we introduce a novel pooling operation based on intuitionistic fuzzy sets to incorporate the degree of hesitation heretofore neglected by a fuzzy pooling operation based on classical fuzzy sets, and we investigate its performance in the context of image classification. Intuitionistic pooling is performed in four steps: bifuzzification (by the transformation of data through the use of membership and non-membership maps), first aggregation (through the transformation of the IFS into a standard fuzzy set), second aggregation (through the transformation and use of a sum operator), and the defuzzification of feature map neighborhoods by using a max operator. IFS pooling is used for the construction of an intuitionistic pooling layer that can be applied as a drop-in replacement for the current, fuzzy (type-1) and crisp, pooling layers of CNN architectures. Various experiments involving multiple datasets demonstrate that an IFS-based pooling can enhance the classification performance of a CNN. A benchmarking study reveals that this significantly outperforms even the most recent pooling models, especially in stochastic environments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
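Note on record 22: the hesitation degree mentioned in the abstract is, in the standard definition of intuitionistic fuzzy sets, the quantity pi = 1 - mu - nu, where mu is the membership degree and nu the non-membership degree. The minimal Python sketch below illustrates only that definition; the function name is hypothetical, and it does not reproduce the paper's membership maps or its pooling layer.

```python
def hesitation_degree(membership: float, non_membership: float) -> float:
    """Return pi = 1 - mu - nu for one element of an intuitionistic fuzzy set."""
    if not (0.0 <= membership <= 1.0 and 0.0 <= non_membership <= 1.0):
        raise ValueError("degrees must lie in [0, 1]")
    if membership + non_membership > 1.0 + 1e-12:
        raise ValueError("membership + non_membership must not exceed 1")
    # Clamp at zero to absorb floating-point round-off.
    return max(0.0, 1.0 - membership - non_membership)


# Classical (type-1) fuzzy set: non-membership is the complement of
# membership, so nothing is left undecided.
print(hesitation_degree(0.6, 0.4))

# Intuitionistic fuzzy set: part of the unit mass stays undecided.
print(hesitation_degree(0.6, 0.3))
```

In the classical case the hesitation is zero by construction, which is the simplification the abstract says intuitionistic pooling is meant to relax.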
23. An Interference Mitigation Method for FMCW Radar Based on Time–Frequency Distribution and Dual-Domain Fusion Filtering.
- Author
-
Zhou, Yu, Cao, Ronggang, Zhang, Anqi, and Li, Ping
- Subjects
CONVOLUTIONAL neural networks ,ARTIFICIAL neural networks ,RADIO interference ,RADAR interference ,RADAR ,IMAGE reconstruction ,BISTATIC radar - Abstract
Radio frequency interference (RFI) significantly hampers the target detection performance of frequency-modulated continuous-wave radar. To address the problem and maintain the target echo signal, this paper proposes a priori assumption on the interference component nature in the radar received signal, as well as a method for interference estimation and mitigation via time–frequency analysis. The solution employs Fourier synchrosqueezed transform to implement the radar's beat signal transformation from time domain to time–frequency domain, thus converting the interference mitigation to the task of time–frequency distribution image restoration. The solution proposes the use of image processing based on the dual-tree complex wavelet transform and combines it with the spatial domain-based approach, thereby establishing a dual-domain fusion interference filter for time–frequency distribution images. This paper also presents a convolutional neural network model of structurally improved UNet++, which serves as the interference estimator. The proposed solution demonstrated its capability against various forms of RFI through the simulation experiment and showed a superior interference mitigation performance over other CNN model-based approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Deep Time Series Forecasting Models: A Comprehensive Survey.
- Author
-
Liu, Xinhe and Wang, Wenmin
- Subjects
DEEP learning ,ARTIFICIAL neural networks ,TIME series analysis ,CONVOLUTIONAL neural networks ,ARTIFICIAL intelligence ,LANGUAGE models - Abstract
Deep learning, a crucial technique for achieving artificial intelligence (AI), has been successfully applied in many fields. The gradual application of the latest architectures of deep learning in the field of time series forecasting (TSF), such as Transformers, has shown excellent performance and results compared to traditional statistical methods. These applications are widely present in academia and in our daily lives, covering many areas including forecasting electricity consumption in power systems, meteorological rainfall, traffic flow, quantitative trading, risk control in finance, sales operations and price predictions for commercial companies, and pandemic prediction in the medical field. Deep learning-based TSF tasks stand out as one of the most valuable AI scenarios for research, playing an important role in explaining complex real-world phenomena. However, deep learning models still face challenges: they need to deal with the challenge of large-scale data in the information age, achieve longer forecasting ranges, reduce excessively high computational complexity, etc. Therefore, novel methods and more effective solutions are essential. In this paper, we review the latest developments in deep learning for TSF. We begin by introducing the recent development trends in the field of TSF and then propose a new taxonomy from the perspective of deep neural network models, comprehensively covering articles published over the past five years. We also organize commonly used experimental evaluation metrics and datasets. Finally, we point out current issues with the existing solutions and suggest promising future directions in the field of deep learning combined with TSF. This paper is the most comprehensive review related to TSF in recent years and will provide a detailed index for researchers in this field and those who are just starting out. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Deep Learning-Based Design Method for Acoustic Metasurface Dual-Feature Fusion.
- Author
-
Lv, Qiang, Zhao, Huanlong, Huang, Zhen, Hao, Guoqiang, and Chen, Wei
- Subjects
DEEP learning ,CONVOLUTIONAL neural networks ,ARTIFICIAL neural networks ,SOUND design ,ACOUSTIC field ,GENETIC algorithms - Abstract
Existing research in metasurface design was based on trial-and-error high-intensity iterations and requires deep acoustic expertise from the researcher, which severely hampered the development of the metasurface field. Using deep learning enabled the fast and accurate design of hypersurfaces. Based on this, in this paper, an integrated learning approach was first utilized to construct a model of the forward mapping relationship between the hypersurface physical structure parameters and the acoustic field, which was intended to be used for data enhancement. Then a dual-feature fusion model (DFCNN) based on a convolutional neural network was proposed, in which the first feature was the high-dimensional nonlinear features extracted using a data-driven approach, and the second feature was the physical feature information of the acoustic field mined using the model. A convolutional neural network was used for feature fusion. A genetic algorithm was used for network parameter optimization. Finally, generalization ability verification was performed to prove the validity of the network model. The results showed that 90% of the integrated learning models had an error of less than 3 dB between the real and predicted sound field data, and 93% of the DFCNN models could achieve an error of less than 5 dB in the local sound field intensity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. Mapping Method of Human Arm Motion Based on Surface Electromyography Signals.
- Author
-
Zheng, Yuanyuan, Zheng, Gang, Zhang, Hanqi, Zhao, Bochen, and Sun, Peng
- Subjects
ARTIFICIAL neural networks ,MACHINE learning ,CONVOLUTIONAL neural networks ,DEEP learning ,SENSOR placement ,ARM ,FINGER joint - Abstract
This paper investigates a method for precise mapping of human arm movements using sEMG signals. A multi-channel approach captures the sEMG signals, which, combined with the accurately calculated joint angles from an Inertial Measurement Unit, allows for action recognition and mapping through deep learning algorithms. Firstly, signal acquisition and processing were carried out, which involved acquiring data from various movements (hand gestures, single-degree-of-freedom joint movements, and continuous joint actions) and sensor placement. Then, interference signals were filtered out through filters, and the signals were preprocessed using normalization and moving averages to obtain sEMG signals with obvious features. Additionally, this paper constructs a hybrid network model, combining Convolutional Neural Networks and Artificial Neural Networks, and employs a multi-feature fusion algorithm to enhance the accuracy of gesture recognition. Furthermore, a nonlinear fitting between sEMG signals and joint angles was established based on a backpropagation neural network, incorporating momentum term and adaptive learning rate adjustments. Finally, based on the gesture recognition and joint angle prediction model, prosthetic arm control experiments were conducted, achieving highly accurate arm movement prediction and execution. This paper not only validates the potential application of sEMG signals in the precise control of robotic arms but also lays a solid foundation for the development of more intuitive and responsive prostheses and assistive devices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Fault Diagnosis Strategy Based on BOA-ResNet18 Method for Motor Bearing Signals with Simulated Hydrogen Refueling Station Operating Noise.
- Author
-
Liu, Shuyi, Chen, Shengtao, Chen, Zuzhi, and Gong, Yongjun
- Subjects
FUELING ,FAULT diagnosis ,CONVOLUTIONAL neural networks ,ARTIFICIAL neural networks ,OPTIMIZATION algorithms ,FAST Fourier transforms - Abstract
Featured Application: In this paper, the vibration data of CWRU motor bearings is used as the original data to simulate the real working conditions of motor bearings in the working environment of hydrogen station by simulating the noise signal of hydrogen station. Then, the BOA-ResNet18 model is used to diagnose the faults of the added noise signals, and the accuracy of the fault diagnosis is very high, and the experimental results show that this model can be applied in the fault diagnosis of hydrogen station equipment under the working conditions of hydrogen station. The harsh working environment of hydrogen refueling stations often causes equipment failure and is vulnerable to mechanical noise during monitoring. This limits the accuracy of equipment monitoring, ultimately decreasing efficiency. To address this issue, this paper presents a motor bearing vibration signal diagnosis method that employs a Bayesian optimization (BOA) residual neural network (ResNet). The industrial noise signal of the hydrogenation station is simulated and then combined with the motor bearing signal. The resulting one-dimensional bearing signal is processed and transformed into a two-dimensional signal using Fast Fourier Transform (FFT). Afterwards, the signal is segmented using the sliding window translation method to enhance the data volume. After comparing signal feature extraction and classification results from various convolutional neural network models, ResNet18 yields the best classification accuracy, achieving a training accuracy of 89.50% with the shortest computation time. Afterwards, the hyperparameters of ResNet18 such as InitialLearnRate, Momentum, and L2Regularization Parameter are optimized using the Bayesian optimization algorithm. The experiment findings demonstrate a diagnostic accuracy of 99.31% for the original signal model, while the accuracy for the bearing signal, with simulated industrial noise from the hydrogenation station, can reach over 92%. 
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
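Note on record 27: the abstract describes segmenting the signal with a sliding window to increase the amount of training data. A minimal Python sketch of overlapping-window segmentation follows; the function name, window length, and step are illustrative assumptions, not values taken from the paper.

```python
def sliding_windows(signal, window, step):
    """Split a 1-D sequence into overlapping segments.

    A step smaller than the window makes neighbouring segments overlap,
    which yields more training samples from the same recording.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(signal) - window
    return [signal[i:i + window] for i in range(0, last_start + 1, step)]


# Hypothetical toy signal of 10 samples, window of 4, step of 2.
segments = sliding_windows(list(range(10)), window=4, step=2)
for segment in segments:
    print(segment)
```

A signal shorter than the window produces no segments, so short recordings would need padding or a smaller window.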
28. A Deep Neural Network-Based Optimal Scheduling Decision-Making Method for Microgrids.
- Author
-
Chen, Fei, Wang, Zhiyang, and He, Yu
- Subjects
MICROGRIDS ,CONVOLUTIONAL neural networks ,ARTIFICIAL neural networks ,MIXED integer linear programming ,MATHEMATICAL optimization ,DECISION making - Abstract
With the rapid growth in the proportion of renewable energy access and the structural complexity of distributed energy systems, traditional microgrid (MG) scheduling methods that rely on mathematical optimization models and expert experience are facing significant challenges. Therefore, it is essential to present a novel scheduling technique with high intelligence and fast decision-making capacity to realize MGs' automatic operation and regulation. This paper proposes an optimal scheduling decision-making method for MGs based on deep neural networks (DNN). Firstly, a typical mathematical scheduling model used for MG operation is introduced, and the limitations of current methods are analyzed. Then, a two-stage optimal scheduling framework comprising day-ahead and intra-day stages is presented. The day-ahead part is solved by mixed integer linear programming (MILP), and the intra-day part uses a convolutional neural network (CNN)—bidirectional long short-term memory (Bi LSTM) for high-speed rolling decision making, with the outputs adjusted by a power correction balance algorithm. Finally, the validity of the model and algorithm of this paper are verified by arithmetic case analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. Machine Learning Applications in Surface Transportation Systems: A Literature Review.
- Author
-
Behrooz, Hojat and Hayeri, Yeganeh M.
- Subjects
ARTIFICIAL neural networks ,MACHINE learning ,INTELLIGENT transportation systems ,LITERATURE reviews ,CONVOLUTIONAL neural networks ,DEEP learning ,SUPPORT vector machines - Abstract
Surface transportation has evolved through technology advancements using parallel knowledge areas such as machine learning (ML). However, the transportation industry has not yet taken full advantage of ML. To evaluate this gap, we utilized a literature review approach to locate, categorize, and synthesize the principal concepts of research papers regarding surface transportation systems using ML algorithms, and we then decomposed them into their fundamental elements. We explored more than 100 articles, literature review papers, and books. The results show that 74% of the papers concentrate on forecasting, while multilayer perceptrons, long short-term memory, random forest, support vector machine, XGBoost, and deep convolutional neural networks are the most preferred ML algorithms. However, sophisticated ML algorithms have been minimally used. The root-cause analysis revealed a lack of effective collaboration between the ML and transportation experts, resulting in the most accessible transportation applications being used as a case study to test or enhance a given ML algorithm and not necessarily to enhance a mobility or safety issue. Additionally, the transportation community does not define transportation issues clearly and does not provide publicly available transportation datasets. The transportation sector must offer an open-source platform to showcase the sector's concerns and build spatiotemporal datasets for ML experts to accelerate technology advancements. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
30. RISOPA: Rapid Imperceptible Strong One-Pixel Attacks in Deep Neural Networks.
- Author
-
Nam, Wonhong, Kim, Kunha, Moon, Hyunwoo, Noh, Hyeongmin, Park, Jiyeon, and Kil, Hyunyoung
- Subjects
ARTIFICIAL neural networks ,DIFFERENTIAL evolution ,CONVOLUTIONAL neural networks ,RANDOM walks ,MACHINE learning - Abstract
Recent research has revealed that subtle imperceptible perturbations can deceive well-trained neural network models, leading to inaccurate outcomes. These instances, known as adversarial examples, pose significant threats to the secure application of machine learning techniques in safety-critical systems. In this paper, we delve into the study of one-pixel attacks in deep neural networks, recently reported as a kind of adversarial examples. To identify such one-pixel attacks, most existing methodologies rely on the differential evolution method, which utilizes random selection from the current population to escape local optima. However, the differential evolution technique might waste search time and overlook good solutions if the number of iterations is insufficient. Hence, in this paper, we propose a gradient ascent with momentum approach to efficiently discover good solutions for the one-pixel attack problem. As our method takes a more direct route to the goal compared to existing methods relying on blind random walks, it can effectively identify one-pixel attacks. Our experiments conducted on popular CNNs demonstrate that, in comparison with existing methodologies, our technique can detect one-pixel attacks significantly faster. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
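Note on record 30: the abstract proposes gradient ascent with momentum as the search procedure. The Python sketch below shows only the generic update rule (velocity = beta * velocity + lr * gradient, then point = point + velocity) on a toy two-variable objective. The objective, learning rate, and momentum coefficient are assumptions for illustration, and this is not the paper's attack on image classifiers.

```python
def gradient_ascent_momentum(grad, start, lr=0.1, beta=0.9, steps=500):
    """Maximise a function given its gradient, using a momentum term."""
    point = list(start)
    velocity = [0.0] * len(point)
    for _ in range(steps):
        g = grad(point)
        # The velocity accumulates past gradients, smoothing the search path.
        velocity = [beta * v + lr * gi for v, gi in zip(velocity, g)]
        point = [p + v for p, v in zip(point, velocity)]
    return point


def toy_gradient(p):
    """Gradient of f(x, y) = -(x - 1)^2 - (y + 2)^2, maximised at (1, -2)."""
    x, y = p
    return [-2.0 * (x - 1.0), -2.0 * (y + 2.0)]


best = gradient_ascent_momentum(toy_gradient, start=[0.0, 0.0])
print(best)  # should land very close to [1.0, -2.0]
```

Unlike a population-based random search, each step here follows the local gradient, which is the "more direct route" the abstract contrasts with differential evolution.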
31. Advances in Facial Expression Recognition: A Survey of Methods, Benchmarks, Models, and Datasets.
- Author
-
Kopalidis, Thomas, Solachidis, Vassilios, Vretos, Nicholas, and Daras, Petros
- Subjects
DEEP learning ,FACIAL expression ,ARTIFICIAL neural networks ,COMPUTER vision ,CONVOLUTIONAL neural networks ,FEATURE extraction - Abstract
Recent technological developments have enabled computers to identify and categorize facial expressions to determine a person's emotional state in an image or a video. This process, called "Facial Expression Recognition (FER)", has become one of the most popular research areas in computer vision. In recent times, deep FER systems have primarily concentrated on addressing two significant challenges: the problem of overfitting due to limited training data availability, and the presence of expression-unrelated variations, including illumination, head pose, image resolution, and identity bias. In this paper, a comprehensive survey is provided on deep FER, encompassing algorithms and datasets that offer insights into these intrinsic problems. Initially, this paper presents a detailed timeline showcasing the evolution of methods and datasets in deep facial expression recognition (FER). This timeline illustrates the progression and development of the techniques and data resources used in FER. Then, a comprehensive review of FER methods is introduced, including the basic principles of FER (components such as preprocessing, feature extraction and classification, and methods, etc.) from the pre-deep learning era (traditional methods using handcrafted features, e.g., SVM and HOG) to the deep learning era. Moreover, a brief introduction is provided related to the benchmark datasets (there are two categories: controlled environments (lab) and uncontrolled environments (in the wild)) used to evaluate different FER methods and a comparison of different FER models. Existing deep neural networks and related training strategies designed for FER, based on static images and dynamic image sequences, are discussed. The remaining challenges and corresponding opportunities in FER and the future directions for designing robust deep FER systems are also pinpointed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. An Unsupervised Character Recognition Method for Tibetan Historical Document Images Based on Deep Learning.
- Author
-
Wang, Xiaojuan and Wang, Weilan
- Subjects
DEEP learning ,PATTERN recognition systems ,ARTIFICIAL neural networks ,HISTORICAL source material ,CONVOLUTIONAL neural networks ,TIBETANS - Abstract
As there is currently a lack of public labeled samples of Tibetan historical document image characters, this paper proposes an unsupervised Tibetan historical document character recognition method based on deep learning (UD-CNN). Firstly, using the Tibetan historical document character component, the Tibetan historical document character sample data set is constructed for model-aided training. Then, the character baseline information is introduced, and a fine-grained feature learning strategy is proposed. For the samples above and below the baseline, the Up-CNN recognition model and Down-CNN recognition model are established. The convolution neural network model is trained and optimized for the samples above and below the baseline, respectively, to improve the recognition accuracy. The experimental results show that the proposed method is clearly effective for the classification and recognition of unlabeled characters in real Tibetan historical document images. The recognition rate of Top5 can reach 92.94%, and the recognition rate of Top1 can be increased from 82.25% to 87.27% using the CNN model only. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
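Note on record 32: the abstract reports Top1 and Top5 recognition rates. As a reminder of what these measure, the Python sketch below computes top-k accuracy, the fraction of samples whose true class is among the k highest-scoring classes. The scores and labels are made-up illustration data, and the function name is hypothetical.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k best-scored classes."""
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equally long")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical classifier scores for 3 samples over 3 classes.
scores = [[0.1, 0.7, 0.2],
          [0.5, 0.3, 0.2],
          [0.2, 0.3, 0.5]]
labels = [1, 1, 0]

print(top_k_accuracy(scores, labels, k=1))
print(top_k_accuracy(scores, labels, k=2))
```

Top-k accuracy can never decrease as k grows, which is why a Top5 rate is always at least as high as the Top1 rate on the same data.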
33. Model-Based 3D Gaze Estimation Using a TOF Camera.
- Author
-
Shen, Kuanxin, Li, Yingshun, Guo, Zhannan, Gao, Jintao, and Wu, Yingjian
- Subjects
GAZE ,ARTIFICIAL neural networks ,STANDARD deviations ,CONVOLUTIONAL neural networks ,INFRARED imaging ,FACE ,RETINAL blood vessels - Abstract
Among the numerous gaze-estimation methods currently available, appearance-based methods predominantly use RGB images as input and employ convolutional neural networks (CNNs) to detect facial images to regressively obtain gaze angles or gaze points. Model-based methods require high-resolution images to obtain a clear eyeball geometric model. These methods face significant challenges in outdoor environments and practical application scenarios. This paper proposes a model-based gaze-estimation algorithm using a low-resolution 3D TOF camera. This study uses infrared images instead of RGB images as input to overcome the impact of varying illumination intensity in the environment on gaze estimation. We utilized a trained YOLOv8 neural network model to detect eye landmarks in captured facial images. Combined with the depth map from a time-of-flight (TOF) camera, we calculated the 3D coordinates of the canthus points of a single eye of the subject. Based on this, we fitted a 3D geometric model of the eyeball to determine the subject's gaze angle. Experimental validation showed that our method achieved a root mean square error of 6.03° and 4.83° in the horizontal and vertical directions, respectively, for the detection of the subject's gaze angle. We also tested the proposed method in a real car driving environment, achieving stable driver gaze detection at various locations inside the car, such as the dashboard, driver mirror, and the in-vehicle screen. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Object Detection for Industrial Applications: Training Strategies for AI-Based Depalletizer.
- Author
-
Buongiorno, Domenico, Caramia, Donato, Di Ruscio, Luca, Longo, Nicola, Panicucci, Simone, Di Stefano, Giovanni, Bevilacqua, Vitoantonio, and Brunetti, Antonio
- Subjects
OBJECT recognition (Computer vision) ,ARTIFICIAL neural networks ,ARTIFICIAL intelligence ,STOCK-keeping unit ,CONVOLUTIONAL neural networks ,STEREO vision (Computer science) - Abstract
In the last 10 years, the demand for robot-based depalletization systems has constantly increased due to the growth of sectors such as logistics, storage, and supply chains. Since the scenarios are becoming more and more unstructured, characterized by unknown pallet layouts and stock-keeping unit shapes, the classical depalletization systems based on the knowledge of predefined positions within the pallet frame are going to be substituted by innovative and robust solutions based on 2D/3D vision and Deep Learning (DL) methods. In particular, the Convolutional Neural Networks (CNNs) are deep networks that have proven to be effective in processing 2D/3D images, for example in the automatic object detection task, and robust to the possible variability among the data. However, deep neural networks need a large amount of data to be trained. In this context, whenever deep networks are involved in object detection for supporting depalletization systems, the dataset collection represents one of the main bottlenecks during the commissioning phase. The present work aims at comparing different training strategies to customize an object detection model aiming at minimizing the number of images required for model fitting, while ensuring reliable and robust performances. Different approaches based on a CNN for object detection are proposed, evaluated, and compared in terms of the F1-score. The study was conducted considering different starting conditions in terms of the neural network's weights, the datasets, and the training set sizes. The proposed approaches were evaluated on the detection of different kinds of paper boxes placed on an industrial pallet. The outcome of the work validates that the best strategy is based on fine-tuning of a CNN-based model already trained on the detection of paper boxes, with a median F1-score greater than 85.0%. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
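Note on record 34: the abstract compares training strategies by F1-score. The Python sketch below computes F1 from counts of true positives, false positives, and false negatives. How a detection is matched to a ground-truth box (for example, by an overlap threshold) is not stated in the abstract and is left outside this sketch; the counts used are made up.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, computed from detection counts."""
    detected = true_pos + false_pos   # everything the detector reported
    actual = true_pos + false_neg     # everything that was really there
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical counts: 8 boxes found correctly, 2 spurious, 2 missed.
print(f1_score(true_pos=8, false_pos=2, false_neg=2))
```

Because F1 penalizes both spurious and missed detections, it suits a depalletizing task where either kind of error can disrupt the pick sequence.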
35. GNSS-IR Soil Moisture Retrieval Using Multi-Satellite Data Fusion Based on Random Forest.
- Author
-
Jiang, Yao, Zhang, Rui, Sun, Bo, Wang, Tianyu, Zhang, Bo, Tu, Jinsheng, Nie, Shihai, Jiang, Hang, and Chen, Kangyi
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,GLOBAL Positioning System ,STANDARD deviations ,RADIAL basis functions - Abstract
The accuracy and reliability of soil moisture retrieval based on Global Positioning System (GPS) single-star Signal-to-Noise Ratio (SNR) data is low due to the influence of spatial and temporal differences of different satellites. Therefore, this paper proposes a Random Forest (RF)-based multi-satellite data fusion Global Navigation Satellite System Interferometric Reflectometry (GNSS-IR) soil moisture retrieval method, which utilizes the RF Model's Mean Decrease Impurity (MDI) algorithm to adaptively assign arc weights to fuse all available satellite data to obtain accurate retrieval results. Subsequently, the effectiveness of the proposed method was validated using GPS data from the Plate Boundary Observatory (PBO) network sites P041 and P037, as well as data collected in Lamasquere, France. A Support Vector Machine model (SVM), Radial Basis Function (RBF) neural network model, and Convolutional Neural Network model (CNN) are introduced for the comparison of accuracy. The results indicated that the proposed method had the best retrieval performance, with Root Mean Square Error (RMSE) values of 0.032, 0.028, and 0.003 cm³/cm³, Mean Absolute Error (MAE) values of 0.025, 0.022, and 0.002 cm³/cm³, and correlation coefficients (R) of 0.94, 0.95, and 0.98, respectively, at the three sites. Therefore, the proposed soil moisture retrieval model demonstrates strong robustness and generalization capabilities, providing a reference for achieving high-precision, real-time monitoring of soil moisture. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
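Note on record 35: the abstract reports RMSE, MAE, and the correlation coefficient R. The Python sketch below shows how these three metrics are conventionally computed from paired observed and predicted values. The soil moisture numbers are invented for illustration and are not data from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_r(observed, predicted):
    """Pearson correlation coefficient; undefined for a constant series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0.0 or var_p == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical volumetric soil moisture values in cm^3/cm^3.
observed = [0.10, 0.20, 0.30, 0.25]
predicted = [0.12, 0.18, 0.33, 0.24]

print(rmse(observed, predicted))
print(mae(observed, predicted))
print(pearson_r(observed, predicted))
```

RMSE and MAE carry the units of the measured quantity, while R is dimensionless, which is why the abstract attaches units only to the first two.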
36. Underwater Acoustic Orthogonal Frequency-Division Multiplexing Communication Using Deep Neural Network-Based Receiver: River Trial Results.
- Author
-
Thenginthody Hassan, Sabna, Chen, Peng, Rong, Yue, and Chan, Kit Yan
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,UNDERWATER acoustic communication ,DOPPLER effect ,MULTIPATH channels ,CHANNEL estimation - Abstract
In this article, a deep neural network (DNN)-based underwater acoustic (UA) communication receiver is proposed. Conventional orthogonal frequency-division multiplexing (OFDM) receivers perform channel estimation using linear interpolation. However, due to the significant delay spread in multipath UA channels, the frequency response often exhibits strong non-linearity between pilot subcarriers. Since the channel delay profile is generally unknown, this non-linearity cannot be modeled precisely. A neural network (NN)-based receiver effectively tackles this challenge by learning and compensating for the non-linearity through NN training. The performance of the DNN-based UA communication receiver was tested recently in river trials in Western Australia. The results obtained from the trials prove that the DNN-based receiver performs better than the conventional least-squares (LS) estimator-based receiver. This paper suggests that UA communication using DNN receivers holds great potential for revolutionizing underwater communication systems, enabling higher data rates, improved reliability, and enhanced adaptability to changing underwater conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
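Note on record 36: the abstract states that conventional OFDM receivers estimate the channel by linear interpolation between pilot subcarriers. The Python sketch below illustrates that conventional baseline only, under assumptions of our own: pilot indices are sorted in ascending order, and the nearest pilot value is held outside the pilot range. The function name and the numbers are hypothetical, and the DNN-based receiver is not reproduced.

```python
from bisect import bisect_right


def interpolate_channel(pilot_indices, pilot_estimates, num_subcarriers):
    """Piecewise-linear interpolation of complex channel estimates.

    pilot_indices must be sorted ascending; subcarriers outside the pilot
    range reuse the nearest pilot estimate.
    """
    if not pilot_indices or len(pilot_indices) != len(pilot_estimates):
        raise ValueError("need one estimate per pilot index")
    channel = []
    for k in range(num_subcarriers):
        j = bisect_right(pilot_indices, k)  # number of pilots at or below k
        if j == 0:
            channel.append(pilot_estimates[0])
        elif j == len(pilot_indices):
            channel.append(pilot_estimates[-1])
        else:
            k0, k1 = pilot_indices[j - 1], pilot_indices[j]
            t = (k - k0) / (k1 - k0)
            channel.append((1 - t) * pilot_estimates[j - 1] + t * pilot_estimates[j])
    return channel


# Hypothetical least-squares estimates at pilot subcarriers 0 and 4.
estimates = interpolate_channel([0, 4], [1 + 0j, 3 + 4j], num_subcarriers=6)
for k, h in enumerate(estimates):
    print(k, h)
```

A straight line between pilots cannot follow a frequency response that bends sharply between them, which is the limitation the abstract attributes to long delay spreads in multipath channels.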
37. Lightweight Network Bearing Intelligent Fault Diagnosis Based on VMD-FK-ShuffleNetV2.
- Author
-
Jiang, Wanlu, Qi, Zhiqian, Jiang, Anqi, Chang, Shangteng, and Xia, Xudong
- Subjects
CONVOLUTIONAL neural networks ,ARTIFICIAL neural networks ,FAULT diagnosis ,ROLLER bearings ,INTELLIGENT networks - Abstract
With the increasing complexity of mechanical equipment and diversification of deep learning models, vibration signals collected from such equipment are susceptible to noise interference. Moreover, traditional neural network models struggle to be effectively deployed in production environments with limited computational resources, severely impacting the accurate extraction and effective diagnosis of FK fault characteristics. In response to this challenge, this study proposes a fault diagnosis method for rolling bearings, integrating a lightweight ShuffleNetV2 network with variational mode decomposition (VMD) and the fast kurtogram (FK) algorithm. Initially, this paper introduces an enhanced FK method where the VMD algorithm is employed for data denoising, extracting FK post-denoising. These feature maps not only preserve critical signal information but also simplify data complexity. Subsequently, these feature maps are utilized to train and test the ShuffleNetV2 model, facilitating effective fault identification and classification. Ultimately, by conducting experimental comparisons with several mainstream lightweight network models, such as MobileNet and SqueezeNet, as well as traditional convolutional neural network models, this study validates the effectiveness of the proposed method in extracting fault characteristics from vibration signals, demonstrating superior diagnostic accuracy and computational efficiency. This provides a novel technical approach for health monitoring and fault diagnosis of industrial bearings and offers theoretical and experimental support for the deployment of lightweight networks in industrial applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. An Ensemble Deep Neural Network-Based Method for Person Identification Using Electrocardiogram Signals Acquired on Different Days.
- Author
-
Byeon, Yeong-Hyeon and Kwak, Keun-Chang
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,SENSOR placement ,DATA augmentation ,HEART beat - Abstract
Electrocardiogram (ECG) signals are a measure of the minute electrical signals generated during the cardiac cycle, a biometric signal that occurs during vital human activity. ECG signals are susceptible to various types of noise depending on the data acquisition conditions, with factors such as sensor placement and the physiological and mental states of the subject contributing to the diverse shapes of these signals. When the data are acquired in a single session, the environmental variables are relatively similar, resulting in similar ECG signals; however, in subsequent sessions, even for the same person, changes in the environmental variables can alter the signal shape. This phenomenon poses challenges for person identification using ECG signals acquired on different days. To improve the performance of individual identification, even when ECG data are acquired on different days, this paper proposes an ensemble deep neural network for person identification by comparing and analyzing the ECG recognition performance under various conditions. The proposed ensemble deep neural network comprises three streams that incorporate two well-known pretrained models. Each network receives the time-frequency representation of ECG signals as input, and a stream reuses the same network structure under different learning conditions with or without data augmentation. The proposed ensemble deep neural network was validated on the Physikalisch-Technische Bundesanstalt dataset, and the results confirmed a 3.39% improvement in accuracy compared to existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Short-Term Campus Load Forecasting Using CNN-Based Encoder–Decoder Network with Attention.
- Author
-
Ahmed, Zain, Jamil, Mohsin, and Khan, Ashraf Ali
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,MACHINE learning ,LONG-term memory ,RECURRENT neural networks ,DEEP learning - Abstract
Short-term load forecasting is a challenging research problem and has a tremendous impact on electricity generation, transmission, and distribution. A robust forecasting algorithm can help power system operators to better tackle the ever-changing electric power demand. This paper presents a novel deep neural network for short-term electric load forecasting for the St. John's campus of Memorial University of Newfoundland (MUN). The electric load data are obtained from the Memorial University of Newfoundland and combined with meteorological data from St. John's. This dataset is used to formulate a multivariate time-series forecasting problem. A novel deep learning algorithm is presented, consisting of a 1D Convolutional Neural Network, which is followed by an encoder–decoder-based network with attention. The input used for this model is the electric load consumption and meteorological data, while the output is the hourly prediction of the next day. The model is compared with Gated Recurrent Unit (GRU) and Long Short Term Memory (LSTM)-based Recurrent Neural Network. A CNN-based encoder–decoder model without attention is also tested. The proposed model shows a lower mean absolute error (MAE), mean squared error (MSE), mean absolute percentage error (MAPE), and higher R² score. These evaluation metrics show an improved performance compared to GRU and LSTM-based RNNs as well as the CNN encoder–decoder model without attention. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. 1D-CNN-Transformer for Radar Emitter Identification and Implemented on FPGA.
- Author
-
Gao, Xiangang, Wu, Bin, Li, Peng, and Jing, Zehuan
- Subjects
ARTIFICIAL neural networks ,MACHINE learning ,FIELD programmable gate arrays ,CONVOLUTIONAL neural networks ,ENERGY consumption - Abstract
Deep learning has brought great development to radar emitter identification technology. In addition, specific emitter identification (SEI), as a branch of radar emitter identification, has also benefited from it. However, the complexity of most deep learning algorithms makes it difficult to adapt to the requirements of the low power consumption and high-performance processing of SEI on embedded devices, so this article proposes solutions from the aspects of software and hardware. From the software side, we design a Transformer variant network, lightweight convolutional Transformer (LW-CT) that supports parameter sharing. Then, we cascade convolutional neural networks (CNNs) and the LW-CT to construct a one-dimensional-CNN-Transformer(1D-CNN-Transformer) lightweight neural network model that can capture the long-range dependencies of radar emitter signals and extract signal spatial domain features meanwhile. In terms of hardware, we design a low-power neural network accelerator based on an FPGA to complete the real-time recognition of radar emitter signals. The accelerator not only designs high-efficiency computing engines for the network, but also devises a reconfigurable buffer called "Ping-pong CBUF" and two-level pipeline architecture for the convolution layer for alleviating the bottleneck caused by the off-chip storage access bandwidth. Experimental results show that the algorithm can achieve a high recognition performance of SEI with a low calculation overhead. In addition, the hardware acceleration platform not only perfectly meets the requirements of the radar emitter recognition system for low power consumption and high-performance processing, but also outperforms the accelerators in other papers in terms of the energy efficiency ratio of Transformer layer processing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Refined Land Use Classification for Urban Core Area from Remote Sensing Imagery by the EfficientNetV2 Model.
- Author
-
Wang, Zhenbao, Liang, Yuqi, He, Yanfang, Cui, Yidan, and Zhang, Xiaoxian
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,ZONING ,URBAN land use ,IMAGE recognition (Computer vision) ,DEEP learning - Abstract
In the context of accelerated urbanization, assessing the quality of the existing built environment plays a crucial role in urban renewal. In the existing research and use of deep learning models, most categories are urban construction areas, forest land, farmland, and other categories. These categories are not conducive to a more accurate analysis of the spatial distribution characteristics of urban green space, parking space, blue space, and square. A small sample of refined land use classification data for urban built-up areas was produced using remote sensing images. The large-scale remote sensing images were classified using deep learning models, with the objective of inferring the fine land category of each tile image. In this study, satellite remote sensing images of four cities, Handan, Shijiazhuang, Xingtai, and Tangshan, were acquired by Google Class 19 RGB three-channel satellite remote sensing images to establish a data set containing fourteen urban land use classifications. The convolutional neural network model EfficientNetV2 is used to train and validate the network framework that performs well on computer vision tasks and enables intelligent image classification of urban remote sensing images. The model classification effect is compared and analyzed through accuracy, precision, recall, and F1-score. The results show that the EfficientNetV2 model has a classification recognition accuracy of 84.56% on the constructed data set. The testing set accuracy increases sequentially after transfer learning. This paper verifies that the proposed research framework has good practicality and that the results of the land use classification are conducive to the fine-grained quantitative analysis of built-up environmental quality. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Comparison of the Accuracy of Ground Reaction Force Component Estimation between Supervised Machine Learning and Deep Learning Methods Using Pressure Insoles.
- Author
-
Kammoun, Amal, Ravier, Philippe, and Buttelli, Olivier
- Subjects
ARTIFICIAL neural networks ,SUPERVISED learning ,CONVOLUTIONAL neural networks ,GROUND reaction forces (Biomechanics) ,DEEP learning - Abstract
The three Ground Reaction Force (GRF) components can be estimated using pressure insole sensors. In this paper, we compare the accuracy of estimating GRF components for both feet using six methods: three Deep Learning (DL) methods (Artificial Neural Network, Long Short-Term Memory, and Convolutional Neural Network) and three Supervised Machine Learning (SML) methods (Least Squares, Support Vector Regression, and Random Forest (RF)). Data were collected from nine subjects across six activities: normal and slow walking, static with and without carrying a load, and two Manual Material Handling activities. This study has two main contributions: first, the estimation of GRF components (Fx, Fy, and Fz) during the six activities, two of which have never been studied; second, the comparison of the accuracy of GRF component estimation between the six methods for each activity. RF provided the most accurate estimation for static situations, with mean RMSE values of RMSE_Fx = 1.65 N, RMSE_Fy = 1.35 N, and RMSE_Fz = 7.97 N for the mean absolute values measured by the force plate (reference) RMSE_Fx = 14.10 N, RMSE_Fy = 3.83 N, and RMSE_Fz = 397.45 N. In our study, we found that RF, an SML method, surpassed the experimented DL methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. Influence Maximization Based on Adaptive Graph Convolution Neural Network in Social Networks.
- Author
-
Liu, Wei, Wang, Saiwei, and Ding, Jiayi
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,SOCIAL networks ,INFORMATION networks ,ALGORITHMS ,DEEP learning - Abstract
The influence maximization problem is a hot issue in the research on social networks due to its wide application. The problem aims to find a small subset of influential nodes to maximize the influence spread. To tackle the challenge of striking a balance between efficiency and effectiveness in traditional influence maximization algorithms, deep learning-based influence maximization algorithms have been introduced and have achieved advancement. However, these algorithms still encounter two key problems: (1) Traditional deep learning models are not well-equipped to capture the latent topological information of networks with varying sizes and structures. (2) Many deep learning-based methods use the influence spread of individual nodes as labels to train a model, which can result in an overlap of influence among the seed nodes selected by the model. In this paper, we reframe the influence maximization problem as a regression task and introduce an innovative approach to influence maximization. The method adopts an adaptive graph convolution neural network which can explore the latent topology information of the network and can greatly improve the performance of the algorithm. In our approach, firstly, we integrate several network-level attributes and some centrality metrics into a vector as the presentation vector of nodes in the social network. Next, we propose a new label generation method to measure the influence of nodes by neighborhood discount strategy, which takes full account of the influence overlapping problem. Subsequently, labels and presentation vectors are fed into an adaptive graph convolution neural network model. Finally, we use the well-trained model to predict the importance of nodes and select top-K nodes as a seed set. Abundant experiments conducted on various real-world datasets have confirmed that the performance of our proposed algorithm surpasses that of several current state-of-the-art algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Comparative Analysis of Machine-Learning Models for Soil Moisture Estimation Using High-Resolution Remote-Sensing Data.
- Author
-
Li, Ming and Yan, Yueguan
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,OPTICAL remote sensing ,REMOTE sensing ,SYNTHETIC aperture radar - Abstract
Soil moisture is an important component of the hydrologic cycle and ecosystem functioning, and it has a significant impact on agricultural production, climate change and natural disasters. Despite the availability of machine-learning techniques for estimating soil moisture from high-resolution remote-sensing imagery, including synthetic aperture radar (SAR) data and optical remote sensing, comprehensive comparative studies of these techniques remain limited. This paper addresses this gap by systematically comparing the performance of four tree-based ensemble-learning models (random forest (RF), extreme gradient boosting (XGBoost), light gradient-boosting machine (LightGBM), and category boosting (CatBoost)) and three deep-learning models (deep neural network (DNN), convolutional neural network (CNN), and gated recurrent unit (GRU)) in terms of soil moisture estimation. Additionally, we introduce and evaluate the effectiveness of four different stacking methods for model fusion, an approach that is relatively novel in this context. Moreover, Sentinel-1 C-band dual-polarization SAR and Sentinel-2 multispectral data, as well as NASADEM and geographical code and temporal code features, are used as input variables to retrieve the soil moisture in the ShanDian River Basin in China. Our findings reveal that the tree-based ensemble-learning models outperform the deep-learning models, with LightGBM being the best individual model, while the stacking approach can further enhance the accuracy and robustness of soil moisture estimation. Moreover, the stacking all boosting classes ensemble-learning model (SABM), which integrates only boosting-type models, demonstrates superior accuracy and robustness in soil moisture estimation. The SHAP value analysis reveals that ensemble learning can utilize more complex features than deep learning. 
This study provides an effective method for retrieving soil moisture using machine-learning and high-resolution remote-sensing data, demonstrating the application value of SAR data and high-resolution optical remote-sensing data in soil moisture monitoring. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. An Interpretable Breast Ultrasound Image Classification Algorithm Based on Convolutional Neural Network and Transformer.
- Author
-
Meng, Xiangjia, Ma, Jun, Liu, Feng, Chen, Zhihua, and Zhang, Tingting
- Subjects
CONVOLUTIONAL neural networks ,ARTIFICIAL neural networks ,TRANSFORMER models ,COMPUTER-aided diagnosis ,IMAGE recognition (Computer vision) ,BREAST - Abstract
Breast cancer is one of the most common causes of death in women. Early signs of breast cancer can be an abnormality depicted on breast images like breast ultrasonography. Unfortunately, ultrasound images contain a lot of noise, which greatly increases the difficulty for doctors to interpret them. In recent years, computer-aided diagnosis (CAD) has been widely used in medical images, reducing the workload of doctors and the probability of misdiagnosis. However, it still faces the following challenges in clinical practice: one is the lack of interpretability, and another is that the accuracy is not high enough. In this paper, we propose a classification model of breast ultrasound images that leverages tumor boundaries as prior knowledge and strengthens the model to guide classification. Furthermore, we employ the advantages of convolutional neural network (CNN) to extract local features and Transformer to extract global features to achieve information balance and complementarity between the two neural network models which increase the recognition performance of the model. Additionally, an explanation method is used to generate visual results, thereby improving the poor interpretability of deep learning models. Finally, we evaluate the model on the BUSI dataset and compare it with other CNN and Transformer models. Experimental results show that the proposed model obtains an accuracy of 0.9870 and an F1 score of 0.9872, achieving state-of-the-art performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
46. Smartphone Contact Imaging and 1-D CNN for Leaf Chlorophyll Estimation in Agriculture.
- Author
-
Barman, Utpal and Saikia, Manob Jyoti
- Subjects
CONVOLUTIONAL neural networks ,ARTIFICIAL neural networks ,STANDARD deviations ,SUSTAINABLE agriculture ,K-nearest neighbor classification - Abstract
Traditional leaf chlorophyll estimation using Soil Plant Analysis Development (SPAD) devices and spectrophotometers is a high-cost mechanism in agriculture. Recently, research on chlorophyll estimation using leaf camera images and machine learning has been seen. However, these techniques use self-defined image color combinations where the system performance varies, and the potential utility has not been well explored. This paper proposes a new method that combines an improved contact imaging technique, the images' original color parameters, and a 1-D Convolutional Neural Network (CNN) specifically for tea leaves' chlorophyll estimation. This method utilizes a smartphone and flashlight to capture tea leaf contact images at multiple locations on the front and backside of the leaves. It extracts 12 different original color features, such as the mean of RGB, the standard deviation of RGB and HSV, kurtosis, skewness, and variance from images for 1-D CNN input. We captured 15,000 contact images of tea leaves, collected from different tea gardens across Assam, India to create a dataset. SPAD chlorophyll measurements of the leaves are included as true values. Other models based on Linear Regression (LR), Artificial Neural Networks (ANN), Support Vector Regression (SVR), and K-Nearest Neighbor (KNN) were also trained, evaluated, and tested. The 1-D CNN outperformed them with a Mean Absolute Error (MAE) of 2.96, Mean Square Error (MSE) of 15.4, Root Mean Square Error (RMSE) of 3.92, and Coefficient of Regression (R²) of 0.82. These results show that the method is a digital replication of the traditional method, while also being non-destructive, affordable, less prone to performance variations, and simple to utilize for sustainable agriculture. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Deep Learning Methods to Analyze the Forces and Torques in Joints Motion.
- Author
-
Guo, Rui, Chen, Baoyi, and Li, Yonghui
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,MECHANICS (Physics) ,HUMAN mechanics ,KNEE joint ,KNEE - Abstract
This paper proposes a composite model that combines convolutional neural network models and mechanical analysis to determine the forces acting on an object. First, we establish a model using Newtonian mechanics to analyze the forces experienced by the human body during movement, particularly the forces on joints. The model calculates the mapping relationship between the object's movement and the forces on the joints. Then, by analyzing a large number of fencing competition videos using a deep learning model, we extract video features to study the torques and forces on human joints. Our analysis of numerous images reveals that, in certain movement patterns, the peak pressure on the knee joint can be two to three times higher than in a normal state, while the driving knee can withstand peak torques of 400–600 Nm. This straightforward model can effectively capture the forces and torques on the human body during movement using a deep neural network. Furthermore, this model can also be applied to problems involving non-rigid body motion. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. Power Equipment Fault Diagnosis Method Based on Energy Spectrogram and Deep Learning.
- Author
-
Liu, Yiyang, Li, Fei, Guan, Qingbo, Zhao, Yang, and Yan, Shuaihua
- Subjects
DEEP learning ,FAULT diagnosis ,ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,DIAGNOSIS methods ,ROLLER bearings - Abstract
With the development of industrial manufacturing intelligence, the role of rotating machinery in industrial production and life is more and more important. Aiming at the problems of the complex and changeable working environment of rolling bearings and limited computing ability, fault feature information cannot be effectively extracted, and the current deep learning model is difficult to be compatible with lightweight and high efficiency. Therefore, this paper proposes a fault detection method for power equipment based on an energy spectrum diagram and deep learning. Firstly, a novel two-dimensional time-frequency feature representation method and energy spectrum feature map based on wavelet packet transform is proposed, and an energy spectrum feature map dataset is made for subsequent diagnosis. This method can realize multi-resolution analysis, fully extract the feature information contained in the fault signal, and accelerate the convergence of the subsequent diagnosis model. Secondly, a lightweight residual dense convolutional neural network model (LR-DenseNet) is proposed. This model combines the advantages of residual learning and a dense connection, and can not only extract deep features more easily, but can also effectively use shallow features. Then, based on the lightweight residual dense convolutional neural network model, an LR-DenseSENet model is proposed. By introducing the transfer learning strategy and adding the channel domain, an attention mechanism is added to the channel feature fusion layer, with the accuracy of detection up to 99.4%, and the amount of parameter calculation greatly reduced to one-fifth of that of VGG. Finally, through an experimental analysis, it is verified that the fault detection model designed in this paper based on the combination of an energy spectrum feature map and LR-DenseSENet achieves a satisfactory detection effect. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
49. Calligraphy Character Detection Based on Deep Convolutional Neural Network.
- Author
-
Peng, Xianlin, Kang, Jian, Wu, Yinjie, and Feng, Xiaoyi
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,CALLIGRAPHY ,FEATURE extraction - Abstract
Calligraphy (the special art of drawing characters with a brush specially made by the Chinese) is an integral part of Chinese culture, and detecting Chinese calligraphy characters is highly significant. At present, there are still some challenges in the detection of ancient calligraphy. In this paper, we are interested in the calligraphy character detection problem focusing on the calligraphy character boundary. We chose High-Resolution Net (HRNet) as the calligraphy character feature extraction backbone network to learn reliable high-resolution representations. Then, we used the scale prediction branch and the spatial information prediction branch to detect the calligraphy character region and categorize the calligraphy character and its boundaries. We used the channel attention mechanism and the feature fusion method to improve the detection effectiveness in this process. Finally, we pre-trained with a self-generated calligraphy database and fine-tuned with a real calligraphy database. We set up two groups of ablation studies for comparison, and the comparison results proved the superiority of our method. This paper found that the classification of characters and character boundaries has a certain auxiliary effect on single character detection. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
50. A Study on Gear Defect Detection via Frequency Analysis Based on DNN.
- Author
-
Kim, Jeonghyeon, Kim, Jonghoek, and Kim, Hyuntai
- Subjects
ARTIFICIAL neural networks ,DEEP learning ,GEARING machinery vibration ,CONVOLUTIONAL neural networks - Abstract
In this paper, we introduce a gear defect detection system using frequency analysis based on deep learning. The existing defect diagnosis systems using acoustic analysis use spectrogram, scalogram, and MFCC (Mel-Frequency Cepstral Coefficient) images as inputs to the convolutional neural network (CNN) model to diagnose defects. However, using visualized acoustic data as input to the CNN models requires a lot of computation time. Although computing power has improved, there is a situation in which a processor with low performance is used for reasons such as cost-effectiveness. In this paper, only the sums of frequency bands are used as input to the deep neural network (DNN) model to diagnose the gear fault. This system diagnoses the defects using only a few specific frequency bands, so it ignores unnecessary data and does not require high performance when diagnosing defects because it uses a relatively simple deep learning model for classification. We evaluate the performance of the proposed system through experiments and verify that real-time diagnosis of gears is possible compared to the CNN model. The result showed 95.5% accuracy for 1000 test data, and it took 18.48 ms, so that verified the capability of real-time diagnosis in a low-spec environment. The proposed system is expected to be effectively used to diagnose defects in various sound-based facilities at a low cost. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library