239 results
Search Results
2. Survey on the research direction of EEG-based signal processing.
- Author
-
Congzhong Sun and Chaozhou Mou
- Subjects
ARTIFICIAL neural networks ,SIGNAL processing ,DATA augmentation ,GENERATIVE adversarial networks ,MACHINE learning - Abstract
Electroencephalography (EEG) is increasingly important in Brain-Computer Interface (BCI) systems due to its portability and simplicity. In this paper, we provide a comprehensive review of research on EEG signal processing techniques since 2021, with a focus on preprocessing, feature extraction, and classification methods. We analyzed 61 research articles retrieved from academic search engines, including CNKI, PubMed, Nature, IEEE Xplore, and Science Direct. For preprocessing, we focus on innovatively proposed preprocessing methods, channel selection, and data augmentation. Data augmentation is classified into conventional methods (sliding windows, segmentation and recombination, and noise injection) and deep learning methods [Generative Adversarial Networks (GAN) and Variational AutoEncoder (VAE)]. We also examine the application of deep learning and multi-method fusion approaches, including both fusion among conventional algorithms and fusion between conventional algorithms and deep learning. Our analysis identifies 35 (57.4%), 18 (29.5%), and 37 (60.7%) studies in the directions of preprocessing, feature extraction, and classification, respectively. We find that preprocessing methods have become widely used in EEG classification (96.7% of reviewed papers), and some studies have conducted comparative experiments to validate their preprocessing. We also discuss the adoption of channel selection and data augmentation and summarize several noteworthy points about data augmentation. Furthermore, deep learning methods have shown great promise in EEG classification, with Convolutional Neural Networks (CNNs) being the main structure of deep neural networks (92.3% of deep learning papers). We summarize and analyze several innovative neural networks, including CNNs and multi-structure fusion. 
However, we also identified several problems and limitations of current deep learning techniques in EEG classification, including inappropriate input, low cross-subject accuracy, an imbalance between parameter counts and time costs, and a lack of interpretability. Finally, we highlight the emerging trend of multi-method fusion approaches (49.2% of reviewed papers) and analyze the data and some examples. We also provide insights into some challenges of multi-method fusion. Our review lays a foundation for future studies to improve EEG classification performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
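As a concrete illustration of the conventional sliding-window augmentation this survey classifies, the sketch below cuts one trial into overlapping windows that each inherit the trial's label. The window and step sizes are illustrative assumptions, not values from the reviewed papers.

```python
# Hypothetical sketch of sliding-window EEG data augmentation: one trial is
# split into overlapping windows, each treated as a new labeled sample.
def sliding_windows(signal, window, step):
    """Return overlapping windows of `signal` (a 1-D list of samples)."""
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, step)]

trial = list(range(10))            # stand-in for one EEG channel recording
augmented = sliding_windows(trial, window=4, step=2)
```

With a step smaller than the window, each trial yields several partially overlapping training samples, which is the augmentation effect the survey describes.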
3. Utilization of Multi-Channel Hybrid Deep Neural Networks for Avocado Ripeness Classification.
- Author
-
Nuanmeesri, Sumitra
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,FEATURE extraction ,FRUIT ,CLASSIFICATION ,AVOCADO - Abstract
Ripeness classification is crucial in ensuring the quality and marketability of avocados. This paper aims to develop the Multi-Channel Hybrid Deep Neural Networks (MCHDNN) model between Visual Geometry Group 16 (VGG16) and EfficientNetB0 architectures, tailored explicitly for avocado ripeness classification in five classes: firm, breaking, ripe, overripe, and rotten. The features extracted by each branch are concatenated in an early-fusion manner to classify ripeness. The image dataset for each avocado fruit was captured from six sides: front, back, left, right, bottom, and pedicel, providing a multi-channel input image to a Convolutional Neural Network (CNN). The results showed that the developed fine-tuned MCHDNN had an accuracy of 94.10% in training, 90.13% in validation, and 90.18% in testing. In addition, when considering individual class classification in the confusion matrix of the training set, it was found that the 'ripe' class had the highest accuracy of 94.58%, followed by the 'firm' and 'rotten' classes with 94.50% and 93.75% accuracy, respectively. Moreover, compared with the single-channel model, the fine-tuned MCHDNN model performs 7.70% more accurately than the fine-tuned VGG16 model and 7.77% more accurately than the fine-tuned EfficientNetB0 model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
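The early-fusion step named in this abstract amounts to concatenating the per-branch feature vectors before the classifier. A minimal sketch, with stand-in values in place of real VGG16/EfficientNetB0 outputs:

```python
# Hypothetical early-fusion sketch: feature vectors from separate backbone
# branches are concatenated into a single vector for the final classifier.
def early_fusion(*branch_features):
    fused = []
    for feats in branch_features:
        fused.extend(feats)
    return fused

vgg_feats = [0.1, 0.4, 0.2]        # stand-in VGG16 branch output
eff_feats = [0.7, 0.3]             # stand-in EfficientNetB0 branch output
fused = early_fusion(vgg_feats, eff_feats)
```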
4. Classified VPN Network Traffic Flow Using Time Related to Artificial Neural Network.
- Author
-
Mohamed, Saad Abdalla Agaili and Kurnaz, Sefer
- Subjects
ARTIFICIAL neural networks ,COMPUTER network traffic ,COMPUTER network security ,TRAFFIC flow ,FEATURE extraction ,VIRTUAL private networks - Abstract
VPNs are vital for safeguarding communication routes in the continually changing cybersecurity world. However, increasing network attack complexity and variety require increasingly advanced algorithms to recognize and categorize VPN network data. We present a novel VPN network traffic flow classification method utilizing Artificial Neural Networks (ANN). This paper aims to provide a reliable system that can distinguish legitimate virtual private network (VPN) traffic from intrusion attempts, data exfiltration, and denial-of-service assaults. We compile a broad dataset of labeled VPN traffic flows from various apps and usage patterns. Next, we create an ANN architecture that can handle encrypted communication and distinguish benign from dangerous actions. To effectively process and categorize encrypted packets, the neural network model has input, hidden, and output layers. We use advanced feature extraction approaches to improve the ANN's classification accuracy by leveraging network traffic's statistical and behavioral properties. We also use cutting-edge optimization methods to optimize network characteristics and performance. The suggested ANN-based categorization method is extensively tested and analyzed. Results show the model effectively classifies VPN traffic types. We also show that our ANN-based technique outperforms other approaches in precision, recall, and F1-score with 98.79% accuracy. This study improves VPN security and protects against new cyberthreats. Classifying VPN traffic flows effectively helps enterprises protect sensitive data, maintain network integrity, and respond quickly to security problems. This study advances network security and lays the groundwork for ANN-based cybersecurity solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
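The input/hidden/output layer structure described above can be sketched as a minimal feedforward pass. Layer sizes, weights, and the traffic features below are illustrative assumptions, not the paper's trained network:

```python
import math

# Minimal sketch of an input -> hidden -> output ANN forward pass with
# sigmoid activations; all weights and inputs are illustrative stand-ins.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(row, x)))
              for row in w_hidden]
    return [sigmoid(sum(wi * hi for wi, hi in zip(row, hidden)))
            for row in w_out]

x = [0.5, -1.0, 0.25]                          # stand-in traffic-flow features
w_hidden = [[0.2, -0.1, 0.4], [0.3, 0.3, -0.2]]
w_out = [[1.0, -1.0]]                          # one output: P(malicious)
prob = forward(x, w_hidden, w_out)[0]
```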
5. Semi-Supervised Autoencoder for Chemical Gas Classification with FTIR Spectrum.
- Author
-
Jang, Hee-Deok, Kwon, Seokjoon, Nam, Hyunwoo, and Chang, Dong Eui
- Subjects
ARTIFICIAL neural networks ,CHEMICAL warfare agents ,CLASSIFICATION ,FEATURE extraction ,MATERIALS analysis - Abstract
Chemical warfare agents pose a serious threat due to their extreme toxicity, necessitating the swift identification of chemical gases and individual responses to the identified threats. Fourier transform infrared (FTIR) spectroscopy offers a method for remote material analysis, particularly in detecting colorless and odorless chemical agents. In this paper, we propose a deep neural network utilizing a semi-supervised autoencoder (SSAE) for the classification of chemical gases based on FTIR spectra. In contrast to traditional methods, the SSAE concurrently trains an autoencoder and a classifier attached to a latent vector of the autoencoder, enhancing feature extraction for classification. The SSAE was evaluated on laboratory-collected FTIR spectra, demonstrating a superior classification performance compared to existing methods. The efficacy of the SSAE lies in its ability to generate denser cluster distributions in latent vectors, thereby enhancing gas classification. This study established a consistent experimental environment for hyperparameter optimization, offering valuable insights into the influence of latent vectors on classification performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
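Training an autoencoder and a latent-vector classifier concurrently, as the SSAE abstract describes, implies a joint objective. A hedged sketch: reconstruction error plus weighted classification cross-entropy, where the weighting `alpha` and all numeric values are illustrative assumptions.

```python
import math

# Sketch of a joint semi-supervised autoencoder objective: reconstruction
# loss plus cross-entropy from a classifier on the latent vector.
def mse(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def cross_entropy(probs, label):
    return -math.log(probs[label])

def ssae_loss(x, x_hat, class_probs, label, alpha=0.5):
    return mse(x, x_hat) + alpha * cross_entropy(class_probs, label)

spectrum = [0.2, 0.8, 0.5]          # stand-in FTIR spectrum
recon = [0.25, 0.75, 0.55]          # stand-in decoder reconstruction
probs = [0.1, 0.7, 0.2]             # stand-in classifier output over 3 gases
loss = ssae_loss(spectrum, recon, probs, label=1)
```

Minimizing the two terms together is what pushes the latent vectors toward the denser, more separable clusters the abstract credits for the improved classification.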
6. Brain tumor segmentation and classification with hybrid clustering, probabilistic neural networks.
- Author
-
Javeed, M.D., Nagaraju, Regonda, Chandrasekaran, Raja, Rajulu, Govinda, Tumuluru, Praveen, Ramesh, M., Suman, Sanjay Kumar, and Shrivastava, Rajeev
- Subjects
ARTIFICIAL neural networks ,DEEP learning ,BRAIN tumors ,TUMOR classification ,ARTIFICIAL intelligence ,MAGNETIC resonance imaging - Abstract
Segmentation is the process of partitioning an image into its different objects. The segmentation process is very important for finding objects in images in major fields such as face tracking, satellite imagery, object identification, and remote sensing, and especially in the medical field. Magnetic resonance imaging (MRI) is used in radiology to investigate the functions and processes of the human body, and the technique is widely used in hospitals for diagnosis, including determining the stage of a particular disease. In this paper, we propose a new method for detecting tumors with enhanced performance over traditional techniques such as K-Means clustering and Fuzzy C-Means (FCM). Researchers have proposed various methods to detect tumors in the brain. To classify normal and abnormal brain images, this paper discusses a screening system developed within an artificial intelligence framework using deep learning probabilistic neural networks, focusing on hybrid clustering for brain image segmentation and crystal contrast enhancement. Feature extraction and classification are included in the development process. Simulation of the proposed design has shown superior results compared to traditional methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
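The K-Means baseline named in this abstract boils down to repeatedly assigning each pixel to its nearest cluster center. One assignment step, with stand-in intensities in place of a real MRI slice:

```python
# Sketch of a single K-Means assignment step on pixel intensities; the
# intensities and cluster centers are illustrative stand-ins for an MRI slice.
def assign_clusters(pixels, centers):
    return [min(range(len(centers)), key=lambda c: abs(p - centers[c]))
            for p in pixels]

pixels = [10, 12, 200, 198, 11]     # dark background vs. bright region
centers = [11.0, 199.0]             # current cluster centers
labels = assign_clusters(pixels, centers)
```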
7. Early dementia detection with speech analysis and machine learning techniques.
- Author
-
Jahan, Zerin, Khan, Surbhi Bhatia, and Saraee, Mo
- Subjects
NATURAL language processing ,MACHINE learning ,SPEECH ,ARTIFICIAL neural networks ,DEMENTIA - Abstract
This in-depth study explores natural language processing and text analysis for dementia detection, revealing their importance in a variety of fields, beginning with an examination of the widespread use and influence of text data. The dataset utilised in this study is from TalkBank's DementiaBank, a vast database of multimedia interactions built with the goal of examining communication patterns in the context of dementia. This dataset offers a unique perspective on the various communication styles dementia patients exhibit when communicating with others. Thorough data preprocessing procedures, including cleansing, tokenization, and structuring, are undertaken, with a focus on improving prediction capabilities by combining textual and non-textual information during feature engineering. In the subsequent phase, the precision, recall, and F1-score metrics of Support Vector Machines (SVM), K-Nearest Neighbours (KNN), Random Forest, and Artificial Neural Networks (ANN) are assessed. Empirical findings are synthesized using text analysis methods and models to formulate a coherent conclusion. This synthesis highlights the significance of text data analysis, the revolutionary potential of natural language processing, and directions for future research. Throughout this paper, readers are encouraged to leverage text data to embark on their own adventures in the evolving, data-centric world of dementia detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
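The precision, recall, and F1-score metrics this study reports for SVM, KNN, Random Forest, and ANN are computed as below; the tiny label lists are illustrative stand-ins, not the study's data.

```python
# Standard precision / recall / F1 computation for a binary classifier.
def precision_recall_f1(y_true, y_pred, positive=1):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

y_true = [1, 1, 0, 1, 0]            # 1 = dementia, 0 = control (stand-ins)
y_pred = [1, 0, 0, 1, 1]
p, r, f = precision_recall_f1(y_true, y_pred)
```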
8. Mental arithmetic task detection using geometric features extraction of EEG signal based on machine learning.
- Author
-
Hoda Edris, ABADI, Mohammad Karimi, MORIDANI, and Mahshid, MIRZAKHANI
- Subjects
MENTAL arithmetic ,FEATURE extraction ,MACHINE learning ,ATTENTION-deficit hyperactivity disorder ,ARTIFICIAL neural networks - Abstract
BACKGROUND: Mental arithmetic analysis based on electroencephalogram (EEG) signals can help in understanding disorders in which learning is difficult, such as attention deficit hyperactivity disorder, arithmetic disorder, or autism spectrum disorder. Most mental computation detection and classification systems rely on the characteristics of a single channel; however, the understanding of the connections between EEG channels, which certainly contain valuable information, is still evolving. The methods presented in this paper are the result of a research project that introduces an alternative method for better and faster extraction of information from the EEG signals of individuals, which are generally complex and nonlinear. METHODS: The EEGs of 66 healthy individuals were recorded in two modes, rest and a designed mental task, with a sampling frequency of 500 Hz. To classify these two modes, we extracted features from our recordings to differentiate the EEG signals of the two groups in single channels as well as combinations of channels. The proposed new method extracts several geometric features from Poincaré plot analysis, using t-tests (significance level below 0.05) to determine brain differences between the mental calculation and rest states. An artificial neural network (ANN) was also used for automatic learning and diagnosis in the two mentioned modes. RESULTS: The results of this paper show that a combination of geometric properties (sides, angles, shortest distance, slope, and coefficients of the third-degree equation) over selected channels (FP1, F7, C4, O1) can achieve 100 % accuracy. The sensitivity and specificity both reached 100 %. CONCLUSIONS: With the help of mental calculation, it is possible to diagnose, treat, and rehabilitate people who have lost the function of a part of their brain due to a disease in this field (Tab. 6, Fig. 15, Ref. 45). [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
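The Poincaré analysis referenced above pairs each EEG sample with its successor to form points (x[n], x[n+1]) and reads geometric features off the resulting plot. The sketch below computes two such features, the slope and length of the segment between consecutive points; the feature choices and signal values are illustrative, not the paper's full set.

```python
import math

# Hedged sketch of Poincaré-plot geometric features for a 1-D signal.
def poincare_points(x):
    """Pair each sample with its successor: (x[n], x[n+1])."""
    return list(zip(x[:-1], x[1:]))

def segment_features(p, q):
    """Slope and length of the segment joining two Poincaré points."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    slope = dy / dx if dx else float("inf")
    length = math.hypot(dx, dy)
    return slope, length

pts = poincare_points([1.0, 2.0, 4.0, 3.0])   # stand-in EEG samples
slope, length = segment_features(pts[0], pts[1])
```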
9. Visual Attention-Driven Hyperspectral Image Classification.
- Author
-
Haut, Juan Mario, Paoletti, Mercedes E., Plaza, Javier, Plaza, Antonio, and Li, Jun
- Subjects
ARTIFICIAL neural networks ,CLASSIFICATION ,SYSTEM identification - Abstract
Deep neural networks (DNNs), including convolutional neural networks (CNNs) and residual networks (ResNets) models, are able to learn abstract representations from the input data by considering a deep hierarchy of layers that perform advanced feature extraction. The combination of these models with visual attention techniques can assist with the identification of the most representative parts of the data from a visual standpoint, obtained through more detailed filtering of the features extracted by the operational layers of the network. This is of significant interest for analyzing remotely sensed hyperspectral images (HSIs), characterized by their very high spectral dimensionality. However, few efforts have been conducted in the literature in order to adapt visual attention methods to remotely sensed HSI data analysis. In this paper, we introduce a new visual attention-driven technique for HSI classification. Specifically, we incorporate attention mechanisms into a ResNet in order to better characterize the spectral–spatial information contained in the data. Our newly proposed method calculates a mask that is applied to the features obtained by the network in order to identify the most desirable ones for classification purposes. Our experiments, conducted using four widely used HSI data sets, reveal that the proposed deep attention model provides competitive advantages in terms of classification accuracy when compared to other state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
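The mask idea in this abstract can be sketched in miniature: a softmax over learned relevance scores produces per-feature weights that are multiplied into the feature vector, emphasizing the most informative responses. The scores and features below are illustrative stand-ins for the network's learned quantities.

```python
import math

# Minimal attention-mask sketch: softmax weights reweight a feature vector.
def softmax(scores):
    m = max(scores)                      # subtract max for numeric stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def apply_attention(features, scores):
    return [f * w for f, w in zip(features, softmax(scores))]

features = [1.0, 2.0, 3.0]               # stand-in spectral-spatial features
scores = [0.1, 0.1, 2.0]                 # stand-in learned relevance scores
attended = apply_attention(features, scores)
```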
10. LSTM Neural Network for Beat Classification in ECG Identity Recognition.
- Author
-
Xin Liu, Yujuan Si, and Di Wang
- Subjects
DEEP learning ,ELECTROCARDIOGRAPHY ,LYAPUNOV exponents ,CLASSIFICATION ,FEATURE extraction ,ARTIFICIAL neural networks - Abstract
As a biological signal existing in the living human body, the electrocardiogram (ECG) contains abundant personal information and fulfils the basic requirements of identity recognition; it has been widely used in individual identification research in recent years. The common identity recognition process includes three steps: ECG signal preprocessing, feature extraction and processing, and beat classification. However, existing ECG classification models are sensitive to the type of database and the dimension of the extracted features, which makes classification accuracy difficult to improve and unable to meet the needs of practical applications. To tackle this problem, this paper proposes an ECG individual recognition model based on a deep Long Short-Term Memory (LSTM) neural network. The LSTM network has a memory cell and is therefore well suited to handling long ECG time series. With deeper learning, the nonlinear expression ability of the ECG beat classification model is gradually enhanced. The paper adopts two stacked LSTM models as hidden layers in the neural network; a Softmax layer is used as the classification layer to identify an individual. Then, low-level morphological features and deep-level chaotic features (Lyapunov exponents) are extracted to verify the feasibility of the deep LSTM network for classification. The model is applied to a database of healthy humans and a database of humans with heart disease. Experimental results show that extracting both simple low-level features and chaotic features achieves good classification performance, verifying the robustness of the LSTM classification model. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
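The memory cell that makes LSTMs suitable for long ECG sequences follows the standard gate equations, sketched below with scalar gates. Real layers are vector-valued with trained weights; the weights and samples here are illustrative stand-ins.

```python
import math

# One scalar LSTM cell step: forget/input/output gates plus a memory cell.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev)    # forget gate
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev)    # input gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev)  # candidate cell value
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev)    # output gate
    c = f * c_prev + i * g                             # updated memory cell
    h = o * math.tanh(c)                               # new hidden state
    return h, c

w = {"f": (0.5, 0.1), "i": (0.4, 0.2), "g": (0.9, 0.3), "o": (0.6, 0.1)}
h, c = 0.0, 0.0
for beat_sample in [0.2, -0.1, 0.4]:                   # stand-in ECG samples
    h, c = lstm_step(beat_sample, h, c, w)
```

The cell state `c` carries information across arbitrarily long sequences, which is what the abstract relies on for long ECG recordings.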
11. Extraction of Significant Features by Fixed-Weight Layer of Processing Elements for the Development of an Efficient Spiking Neural Network Classifier.
- Author
-
Sboev, Alexander, Rybka, Roman, Kunitsyn, Dmitry, Serenko, Alexey, Ilyin, Vyacheslav, and Putrolaynen, Vadim
- Subjects
ARTIFICIAL neural networks ,FEATURE extraction ,DISTRIBUTION (Probability theory) ,LOGISTIC regression analysis ,BREAST cancer ,ENERGY consumption - Abstract
In this paper, we demonstrate that fixed-weight layers generated from random distributions or logistic functions can effectively extract significant features from input data, resulting in high accuracy on a variety of tasks, including Fisher's Iris, Wisconsin Breast Cancer, and MNIST datasets. We have observed that logistic functions yield high accuracy with less dispersion in results. We have also assessed the precision of our approach under conditions of minimizing the number of spikes generated in the network. This is practically useful for reducing energy consumption in spiking neural networks. Our findings reveal that the proposed method demonstrates the highest accuracy on Fisher's Iris and MNIST datasets with decoding using logistic regression. Furthermore, it surpasses the accuracy of the conventional (non-spiking) approach using only logistic regression in the case of Wisconsin Breast Cancer. We have also investigated the impact of non-stochastic spike generation on accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
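A hedged sketch of the fixed-weight idea above: inputs are projected through weights drawn once from a random distribution and never trained, and the responses are thresholded into binary spike events for a downstream decoder. The thresholding scheme and seed are illustrative assumptions, not the paper's encoding.

```python
import random

# Fixed-weight feature layer: untrained random projection + spike threshold.
def fixed_weight_layer(x, weights, threshold=0.5):
    responses = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return [1 if r > threshold else 0 for r in responses]  # spike / no spike

rng = random.Random(0)               # fixed seed: weights generated once only
weights = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
spikes = fixed_weight_layer([0.5, 0.1, 0.9, 0.3], weights)
```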
12. Detection of Glaucoma from Fundus Images Using Novel Evolutionary-Based Deep Neural Network.
- Author
-
Madhumalini, M. and Devi, T. Meera
- Subjects
GLAUCOMA diagnosis ,EXPERIMENTAL design ,COMPUTERS in medicine ,RETINA ,PREDICTIVE tests ,DIAGNOSTIC imaging ,EYE ,OPTIC nerve ,OPTICAL coherence tomography ,ARTIFICIAL neural networks ,SENSITIVITY & specificity (Statistics) ,STATISTICAL models ,EYE examination - Abstract
Glaucoma is an asymptomatic condition that damages the optic nerve of the human eye. Glaucoma is frequently caused by abnormally high pressure in the eye that leads to permanent blindness. Detecting glaucoma at an initial phase offers the possibility of treating the disease, but diagnosing it accurately is considered a challenging task. Therefore, this paper proposes a novel glaucoma detection system that performs the diagnosis of glaucoma by exploiting the prescribed characteristics. The principal intention of this paper is diagnosing the glaucoma present at the optic nerve head of the human eye. The proposed glaucoma detection system uses four different phases, namely a data preprocessing or enhancement phase, a segmentation phase, a feature extraction phase, and a classification phase. Here, a novel classifier named fractional gravitational search-based hybrid deep neural network (FGSA-HDNN) is developed for the effective classification of glaucoma-infected images from normal images. Finally, experimental analysis of the proposed approach and various other techniques is performed, and the accuracy achieved while diagnosing glaucoma is 98.75%. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
13. A CNN With Multiscale Convolution and Diversified Metric for Hyperspectral Image Classification.
- Author
-
Gong, Zhiqiang, Zhong, Ping, Yu, Yang, Hu, Weidong, and Li, Shutao
- Subjects
POINT processes ,ARTIFICIAL neural networks ,IMAGE ,CLASSIFICATION ,TASK analysis ,MATHEMATICAL convolutions - Abstract
Recently, researchers have shown the powerful ability of deep methods with multilayers to extract high-level features and to obtain better performance for hyperspectral image classification. However, a common problem of traditional deep models is that the learned deep models might be suboptimal because of the limited number of training samples, especially for images with large intraclass variance and low interclass variance. In this paper, novel convolutional neural networks (CNNs) with multiscale convolution (MS-CNNs) are proposed to address this problem by extracting deep multiscale features from the hyperspectral image. Moreover, deep metrics usually accompany MS-CNNs to improve the representational ability for the hyperspectral image. However, usual metric learning makes the metric parameters in the learned model tend to behave similarly. This similarity leads to obvious model redundancy and thus has negative effects on the description ability of the deep metrics. Traditionally, determinantal point process (DPP) priors, which encourage the learned factors to repulse from one another, can be imposed over these factors to diversify them. Taking advantage of both the MS-CNNs and DPP-based diversity-promoting deep metrics, this paper develops a CNN with multiscale convolution and diversified metric to obtain discriminative features for hyperspectral image classification. Experiments are conducted over four real-world hyperspectral image data sets to show the effectiveness and applicability of the proposed method. Experimental results show that our method is better than original deep models and can produce comparable or even better classification performance in different hyperspectral image data sets with respect to spectral and spectral–spatial features. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
14. Improved Siamese Palmprint Authentication Using Pre-Trained VGG16-Palmprint and Element-Wise Absolute Difference.
- Author
-
Ezz, Mohamed, Alanazi, Waad, Mostafa, Ayman Mohamed, Hamouda, Eslam, Elbashir, Murtada K., and Alruily, Meshrif
- Subjects
PALMPRINT recognition ,BIOMETRIC identification ,DEEP learning ,MACHINE learning ,ARTIFICIAL neural networks - Abstract
Palmprint identification has been conducted over the last two decades in many biometric systems. High-dimensional data with many uncorrelated and duplicated features remains difficult to handle due to several computational complexity issues. This paper presents an interactive authentication approach based on deep learning and feature selection that supports Palmprint authentication. The proposed model has two stages of learning: the first stage transfers an ImageNet pre-trained VGG-16 to a task-specific feature extraction model. The second stage uses the VGG-16 Palmprint feature extractor in a Siamese network to learn Palmprint similarity. The proposed model achieves robust and reliable end-to-end Palmprint authentication by extracting the convolutional features using VGG-16 Palmprint and computing the similarity of two input Palmprints using the Siamese network. The second stage uses the CASIA dataset to train and test the Siamese network. The suggested model outperforms comparable studies based on the deep learning approach, achieving accuracy and EER of 91.8% and 0.082%, respectively, on the CASIA left-hand images and accuracy and EER of 91.7% and 0.084%, respectively, on the CASIA right-hand images. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
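The element-wise absolute difference named in this title compares the two embeddings produced by the shared branch via |f1 - f2|, which then feeds a similarity classifier. A minimal sketch with stand-in embeddings and an illustrative decision threshold:

```python
# Element-wise absolute difference head of a Siamese comparison: the smaller
# the per-dimension differences, the more likely the two prints match.
def abs_difference(f1, f2):
    return [abs(a - b) for a, b in zip(f1, f2)]

emb_a = [0.9, 0.1, 0.4]             # stand-in embedding of palmprint A
emb_b = [0.7, 0.3, 0.4]             # stand-in embedding of palmprint B
diff = abs_difference(emb_a, emb_b)
same_palm = sum(diff) / len(diff) < 0.2   # illustrative decision threshold
```

In the actual model a learned layer, not a fixed threshold, maps the difference vector to a match score.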
15. SensorNet: A Scalable and Low-Power Deep Convolutional Neural Network for Multimodal Data Classification.
- Author
-
Jafari, Ali, Ganesan, Ashwinkumar, Thalisetty, Chetan Sai Kumar, Sivasubramanian, Varun, Oates, Tim, and Mohsenin, Tinoosh
- Subjects
ARTIFICIAL neural networks ,SIGNAL processing ,TIME series analysis - Abstract
This paper presents SensorNet, a scalable and low-power embedded deep convolutional neural network (DCNN) designed to classify multimodal time series signals. Time series signals generated by different sensor modalities with different sampling rates are first converted to images (2-D signals), and then the DCNN is utilized to automatically learn shared features in the images and perform the classification. SensorNet: 1) is scalable, as it can process different types of time series data with a variety of input channels and sampling rates; 2) does not need to employ separate signal processing techniques for processing the data generated by each sensor modality; 3) does not require expert knowledge for extracting features from each sensor's data; 4) makes it easy and fast to adapt to new sensor modalities with a different sampling rate; 5) achieves very high detection accuracy for different case studies; and 6) has a very efficient architecture which makes it suitable for deployment at Internet of Things and wearable devices. A custom low-power hardware architecture is also designed for the efficient deployment of SensorNet in embedded real-time systems. SensorNet performance is evaluated using three different case studies, including physical activity monitoring, a stand-alone tongue drive system, and stress detection, achieving average detection accuracies of 98%, 96.2%, and 94%, respectively. We implement SensorNet using our custom hardware architecture on a Xilinx FPGA (Artix-7), which on average consumes 246 μJ of energy. To further reduce the power consumption, SensorNet is implemented as an application-specific integrated circuit at the post-layout level in 65-nm CMOS technology, which consumes approximately 8× lower power compared to the FPGA implementation. In addition, SensorNet is implemented on an NVIDIA Jetson TX2 SoC (CPU + GPU); compared to the TX2 single-core CPU and GPU implementations, FPGA-based SensorNet obtains 15× and 4× improvements in energy consumption. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
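SensorNet's first step, converting multimodal time series into a 2-D "image", can be sketched as stacking channels into a channels-by-time array. Padding shorter channels with zeros is an illustrative choice for handling unequal sampling rates, not necessarily the paper's scheme.

```python
# Sketch of multimodal signal-to-image conversion: channels from different
# sensors (possibly different lengths) are stacked into one 2-D array.
def stack_to_image(channels, pad_value=0.0):
    width = max(len(c) for c in channels)
    return [c + [pad_value] * (width - len(c)) for c in channels]

accel = [0.1, 0.2, 0.3, 0.4]        # 4 samples from a fast modality
heart = [72.0, 71.0]                # 2 samples from a slower modality
image = stack_to_image([accel, heart])
```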
16. The short texts classification based on neural network topic model.
- Author
-
Shao, Dangguo, Li, Chengyao, Huang, Chusheng, An, Qing, Xiang, Yan, Guo, Junjun, and He, Jianfeng
- Subjects
ARTIFICIAL neural networks ,GAUSSIAN mixture models ,FEATURE extraction ,GAUSSIAN distribution ,CLASSIFICATION - Abstract
To address the low effectiveness of short-text feature extraction, this paper proposes a short-text classification model based on an improved Wasserstein-Latent Dirichlet Allocation (W-LDA), a neural network topic model based on the Wasserstein Auto-Encoder (WAE) framework. The improvements to W-LDA are as follows: first, the Bag of Words (BOW) input to W-LDA is preprocessed with Term Frequency–Inverse Document Frequency (TF-IDF); subsequently, the prior distribution of potential topics in W-LDA is replaced from the Dirichlet distribution with a Gaussian mixture distribution, based on Variational Bayesian inference; then, a sparsemax function layer is introduced after the hidden layer inferred by the encoder network to generate a sparse document-topic distribution with better topic relevance. The improved W-LDA is named the Sparse Wasserstein-Variational Bayesian Gaussian mixture model (SW-VBGMM). Finally, the document-topic distribution generated by SW-VBGMM is input to a BiGRU (Bidirectional Gated Recurrent Unit) for deep feature extraction and short-text classification. Experiments on three Chinese short-text datasets and one English dataset show that our model outperforms some common topic models and neural network models on the four evaluation indexes of text classification (accuracy, precision, recall, and F1 value). [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
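The TF-IDF preprocessing applied to the BOW input can be sketched directly: term frequency within each document, weighted by the log inverse document frequency across the corpus. The toy corpus is illustrative, and real implementations usually add smoothing.

```python
import math

# Plain TF-IDF over tokenized documents: tf(w, d) * log(N / df(w)).
def tf_idf(docs):
    n = len(docs)
    vocab = sorted({w for d in docs for w in d})
    df = {w: sum(1 for d in docs if w in d) for w in vocab}
    out = []
    for d in docs:
        tf = {w: d.count(w) / len(d) for w in vocab}
        out.append([tf[w] * math.log(n / df[w]) for w in vocab])
    return vocab, out

docs = [["short", "text", "topic"], ["topic", "model"], ["short", "text"]]
vocab, weights = tf_idf(docs)
```

Terms appearing in every document get weight zero, which is exactly why TF-IDF sharpens the topic signal in very short texts.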
17. A Comprehensive survey on ear recognition: Databases, approaches, comparative analysis, and open challenges.
- Author
-
Benzaoui, Amir, Khaldi, Yacine, Bouaouina, Rafik, Amrouni, Nadia, Alshazly, Hammam, and Ouahabi, Abdeldjalil
- Subjects
ARTIFICIAL neural networks ,EAR ,FEATURE extraction ,DEEP learning ,COMPARATIVE studies - Abstract
Automatic identity recognition from ear images is an active research topic in the biometric community. The ability to secretly acquire images of the ear remotely and the stability of the ear shape over time make this technology a promising alternative for surveillance, authentication, and forensic applications. In recent years, significant research has been conducted in this area. Nevertheless, challenges remain that limit the commercial use of this technology. Several phases of the ear recognition system have been studied in the literature, from ear detection, normalization, and feature extraction to classification. This paper reviews the most recent methods used to describe and classify biometric features of the ear. We propose a first taxonomy to group existing approaches to ear recognition, including 2D, 3D, and combined 2D and 3D methods, as well as an overview of historical advances in this field. It is well known that data and algorithms are the essential components in biometrics, particularly in ear recognition. However, early ear recognition datasets were very limited and collected in laboratories under controlled conditions. With the wider use of deep neural networks, a considerable amount of training data has become necessary if acceptable ear recognition performance is to be achieved. As a consequence, current ear recognition datasets have increased significantly in size. This paper gives an overview of the chronological evolution of ear recognition datasets and compares the performance of conventional vs. deep learning methods on several datasets. We propose a second taxonomy to classify the existing databases, including 2D, 3D, and video ear datasets. Finally, some open challenges and trends are debated for future research. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
18. White Blood Cells Classification Using Entropy-Controlled Deep Features Optimization.
- Author
-
Ahmad, Riaz, Awais, Muhammad, Kausar, Nabeela, and Akram, Tallha
- Subjects
ARTIFICIAL neural networks ,LEUCOCYTES ,CONVOLUTIONAL neural networks ,FEATURE selection ,FEATURE extraction - Abstract
White blood cells (WBCs) constitute an essential part of the human immune system. The correct identification of WBC subtypes is critical in the diagnosis of leukemia, a kind of blood cancer defined by the aberrant proliferation of malignant leukocytes in the bone marrow. The traditional approach of classifying WBCs, which involves the visual analysis of blood smear images, is labor-intensive and error-prone. Modern approaches based on deep convolutional neural networks provide significant results for this type of image categorization, but have high processing and implementation costs owing to very large feature sets. This paper presents an improved hybrid approach for efficient WBC subtype classification. First, optimum deep features are extracted from enhanced and segmented WBC images using transfer learning on pre-trained deep neural networks, i.e., DenseNet201 and Darknet53. The serially fused feature vector is then filtered using an entropy-controlled marine predator algorithm (ECMPA). This nature-inspired meta-heuristic optimization algorithm selects the most dominant features while discarding the weak ones. The reduced feature vector is classified with multiple baseline classifiers with various kernel settings. The proposed methodology is validated on a public dataset of 5000 synthetic images that correspond to five different subtypes of WBCs. The system achieves an overall average accuracy of 99.9 % with more than 95 % reduction in the size of the feature vector. The feature selection algorithm also demonstrates better convergence performance as compared to classical meta-heuristic algorithms. The proposed method also demonstrates a comparable performance with several existing works on WBC classification. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
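The entropy-controlled selection step described in the abstract above can be illustrated with a simplified filter (a sketch only: the paper's actual ECMPA couples entropy with a marine-predator meta-heuristic, which is not reproduced here). Feature columns whose histogram entropy is low carry little discriminative information and are discarded:

```python
import math

def column_entropy(values, bins=8):
    """Shannon entropy of one feature column, estimated from a histogram."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # constant feature: zero entropy
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def entropy_filter(features, keep=2):
    """Return the indices of the `keep` highest-entropy feature columns."""
    cols = list(zip(*features))  # transpose: rows -> columns
    ranked = sorted(range(len(cols)),
                    key=lambda j: column_entropy(list(cols[j])),
                    reverse=True)
    return sorted(ranked[:keep])

# Toy fused feature matrix: 4 samples x 3 features; column 1 is constant.
X = [[0.1, 5.0, 0.9],
     [0.7, 5.0, 0.2],
     [0.3, 5.0, 0.8],
     [0.9, 5.0, 0.1]]
print(entropy_filter(X, keep=2))  # → [0, 2]: the constant column is dropped
```

Function names and the histogram-based entropy estimate are illustrative assumptions, not the paper's implementation.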
19. Skip-Connected Covariance Network for Remote Sensing Scene Classification.
- Author
-
He, Nanjun, Fang, Leyuan, Li, Shutao, Plaza, Javier, and Plaza, Antonio
- Subjects
REMOTE sensing ,ARTIFICIAL neural networks ,CLASSIFICATION ,GENE regulatory networks - Abstract
This paper proposes a novel end-to-end learning model, called skip-connected covariance (SCCov) network, for remote sensing scene classification (RSSC). The innovative contribution of this paper is to embed two novel modules into the traditional convolutional neural network (CNN) model, i.e., skip connections and covariance pooling. The advantages of newly developed SCCov are twofold. First, by means of the skip connections, the multi-resolution feature maps produced by the CNN are combined together, which provides important benefits to address the presence of large-scale variance in RSSC data sets. Second, by using covariance pooling, we can fully exploit the second-order information contained in such multi-resolution feature maps. This allows the CNN to achieve more representative feature learning when dealing with RSSC problems. Experimental results, conducted using three large-scale benchmark data sets, demonstrate that our newly proposed SCCov network exhibits very competitive or superior classification performance when compared with the current state-of-the-art RSSC techniques, using a much lower amount of parameters. Specifically, our SCCov only needs 10% of the parameters used by its counterparts. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
20. Multiscale Convolutional Neural Networks for Fault Diagnosis of Wind Turbine Gearbox.
- Author
-
Jiang, Guoqian, He, Haibo, Yan, Jun, and Xie, Ping
- Subjects
ARTIFICIAL neural networks ,SIGNAL convolution ,MULTISCALE modeling ,ARTIFICIAL intelligence ,FAULT diagnosis ,DEEP learning ,WIND turbines ,GEARBOXES - Abstract
This paper proposes a novel intelligent fault diagnosis method to automatically identify different health conditions of wind turbine (WT) gearbox. Unlike traditional approaches, where feature extraction and classification are separately designed and performed, this paper aims to automatically learn effective fault features directly from raw vibration signals while classifying the types of faults in a single framework, thus providing an end-to-end learning-based fault diagnosis system for WT gearbox without additional signal processing and diagnostic expertise. Considering the multiscale characteristics inherent in vibration signals of a gearbox, a new multiscale convolutional neural network (MSCNN) architecture is proposed to perform multiscale feature extraction and classification simultaneously. The proposed MSCNN incorporates multiscale learning into the traditional CNN architecture, which has two merits: 1) high-level fault features can be effectively learned by the hierarchical learning structure with multiple pairs of convolutional and pooling layers; and 2) the multiscale learning scheme can capture complementary and rich diagnosis information at different scales. This greatly improves the feature learning ability and enables better diagnosis performance. The proposed MSCNN approach is evaluated through experiments on a WT gearbox test rig. Experimental results and comprehensive comparison analysis with respect to the traditional CNN and traditional multiscale feature extractors have demonstrated the superiority of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
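The multiscale idea behind the MSCNN above can be sketched without a deep-learning framework (a minimal illustration only, not the paper's architecture): the same raw vibration signal is viewed at several smoothing scales, standing in for the parallel convolution/pooling branches, and features from each scale are concatenated.

```python
def moving_average(signal, w):
    """Smooth a 1-D signal with a window of width w (valid region only)."""
    return [sum(signal[i:i + w]) / w for i in range(len(signal) - w + 1)]

def multiscale_features(signal, scales=(1, 2, 4)):
    """Concatenate simple statistics of the signal at each scale.

    Each scale sees a differently smoothed version of the raw signal,
    mimicking the complementary views of a multiscale CNN's branches.
    """
    feats = []
    for w in scales:
        s = moving_average(signal, w)
        mean = sum(s) / len(s)
        energy = sum(x * x for x in s) / len(s)
        feats.extend([mean, energy])
    return feats

# Toy vibration snippet: an alternating oscillation.
vibration = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
f = multiscale_features(vibration)
print(len(f))  # → 6: 3 scales x 2 statistics per scale
```

The statistics chosen (mean, energy) and the window widths are arbitrary placeholders for the features a trained CNN branch would learn.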
21. Survey and Evaluation of Neural 3D Shape Classification Approaches.
- Author
-
Mirbauer, Martin, Krabec, Miroslav, Krivanek, Jaroslav, and Sikudova, Elena
- Subjects
ARTIFICIAL neural networks ,MACHINE learning ,OBJECT recognition (Computer vision) ,CLASSIFICATION ,GEOMETRIC shapes ,COMPUTER graphics ,CLASSIFICATION algorithms - Abstract
Classification of 3D objects – the selection of a category in which each object belongs – is of great interest in the field of machine learning. Numerous researchers use deep neural networks to address this problem, altering the network architecture and representation of the 3D shape used as an input. To investigate the effectiveness of their approaches, we conduct an extensive survey of existing methods and identify common ideas by which we categorize them into a taxonomy. Second, we evaluate 11 selected classification networks on two 3D object datasets, extending the evaluation to a larger dataset on which most of the selected approaches have not been tested yet. For this, we provide a framework for converting shapes from common 3D mesh formats into formats native to each network, and for training and evaluating different classification approaches on this data. Despite being partially unable to reach the accuracies reported in the original papers, we compare the relative performance of the approaches as well as their performance when changing datasets as the only variable to provide valuable insights into performance on different kinds of data. We make our code available to simplify running training experiments with multiple neural networks with different prerequisites. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
22. Classification of vein pattern recognition using hybrid deep learning.
- Author
-
Gopinath, P. and Shivakumar, R.
- Subjects
DEEP learning ,ARTIFICIAL neural networks ,BLENDED learning ,FINGERS ,PATTERN recognition systems ,RECURRENT neural networks ,CONVOLUTIONAL neural networks ,VEINS - Abstract
Recognition of finger vein patterns is an essential technique that analyses the finger vein patterns to enable accurate authentication of an individual. Proper, accurate and quick learning of patterns is required for improving classification, and an intelligent algorithm is essential to effectively study and classify the patterns. In this paper, we develop an improved deep learning hybrid model for feature extraction and classification. A dimensional reduction deep neural network (DR-DNN) model includes a dimensional reduction model for extracting the essential features by reducing the dimensionality of feature datasets. A convolutional neural network (CNN) helps in classifying the benign vein patterns from the malignant vein patterns. The effectiveness is compared against existing deep learning classifiers to measure how effective the deep learning model is for classifying finger vein patterns for biometric authentication. The results show that the proposed method achieves an accuracy rate of 97.16%, whereas the existing methods, including CNN, Recurrent Neural Network (RNN) and Deep Neural Network (DNN), achieve accuracy rates of 86%, 80.66% and 88.31%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
23. Robust Visual Lips Feature Extraction Method for Improved Visual Speech Recognition System.
- Author
-
Mahmmed, Mahmuod.H., Saeed, Thamir.R., and Wissam, H. Ali
- Subjects
SPEECH perception ,HUMAN-computer interaction ,PATTERN recognition systems ,ARTIFICIAL neural networks ,AUTOMATIC speech recognition - Abstract
Recently, automatic lip reading (ALR) has acquired significant interest among researchers due to its adoption in many applications. One such application is speech recognition in noisy environments, where visual cues containing integral information are added to the audio signal, reflecting the way a person merges audio-visual stimuli to identify an utterance. The unsolved part of this problem is utterance classification using only visual cues, without the acoustic signal of the talker's speech. Given a set of frames from a recorded video of a person uttering a word, a robust image processing technique is used to isolate the lips region, and suitable features are then extracted that represent the mouth shape variation during speech. These features are used by the classification stage to identify the uttered word. This paper addresses the problem by introducing a new segmentation technique to isolate the lips region, together with a set of visual features based on the extracted lips boundary, which is able to perform lip reading with significant results. A special laboratory was designed to collect the utterances of the twenty-six English letters from multiple speakers, which are adopted in this paper (the UOTEletters corpus). Moreover, two types of classifier, Numeral Virtual Generalization RAM (NVG RAM) and K-nearest neighbor (KNN), were adopted to identify the talker's utterance. The recognition performance for the input visual utterance is 94.679% when using NVG RAM, which is utilized for the first time in this work, and 92.628% when KNN is utilized. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
24. Deep Learning-Based Classification and Reconstruction of Residential Scenes From Large-Scale Point Clouds.
- Author
-
Zhang, Liqiang and Zhang, Liang
- Subjects
REMOTE-sensing images ,IMAGE reconstruction ,DEEP learning ,ARTIFICIAL neural networks ,REINFORCEMENT learning ,AUTOMATICITY (Learning process) ,METROPOLITAN areas - Abstract
The reconstruction of urban buildings from large-scale airborne laser scanning point clouds is an important research topic in the geoscience field. Large-scale urban scenes usually contain a large number of object categories and many overlapped or closely neighboring objects, which poses great challenges for classifying and modeling buildings from these data sets. In this paper, we propose a deep reinforcement learning framework that integrates a 3-D convolutional neural network, a deep Q-network, and a residual recurrent neural network for the efficient semantic parsing of large-scale 3-D point clouds. The proposed framework provides an end-to-end automatic processing method that maps the raw point cloud to the classification results of the given categories. After obtaining the building classes, we utilize an edge-aware resampling algorithm to consolidate the point set with noise-free normals and clean preservation of sharp features. Finally, 2.5-D dual contouring, which is a data-driven approach, is introduced to generate urban building models from the consolidated point clouds. Our method can generate lightweight building models with arbitrarily shaped roofs while preserving the verticality of connecting walls. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
25. Deep learning and evolutionary intelligence with fusion-based feature extraction for detection of COVID-19 from chest X-ray images.
- Author
-
Shankar, K., Perumal, Eswaran, Tiwari, Prayag, Shorfuzzaman, Mohammad, and Gupta, Deepak
- Subjects
X-ray imaging ,FEATURE extraction ,DEEP learning ,METAHEURISTIC algorithms ,ARTIFICIAL neural networks ,COVID-19 ,COVID-19 testing - Abstract
In recent times, COVID-19 infections have increased exponentially while only a restricted number of rapid testing kits exist. Several studies have reported COVID-19 diagnosis models based on chest X-ray images. But the diagnosis of COVID-19 patients from chest X-ray images is a tedious process, as the bilateral modifications are considered an ill-posed problem. This paper presents a new metaheuristic-based fusion model for COVID-19 diagnosis using chest X-ray images. The proposed model comprises preprocessing, feature extraction, and classification processes. Initially, the Wiener filtering (WF) technique is used for the preprocessing of images. Then, the fusion-based feature extraction process takes place by the incorporation of the gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRM), and local binary patterns (LBP). Afterward, the salp swarm algorithm (SSA) selects the optimal feature subset. Finally, an artificial neural network (ANN) is applied as a classification process to classify infected and healthy patients. The proposed model's performance has been assessed using the Chest X-ray image dataset, and the results are examined under diverse aspects. The obtained results confirmed the presented model's superior performance over the state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
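Of the three texture descriptors fused in the abstract above, local binary patterns (LBP) are the simplest to sketch. The following is a minimal illustration (not the paper's implementation): each interior pixel is encoded by thresholding its 8 neighbours against it, and a coarse histogram of those codes serves as a texture feature.

```python
def lbp_code(img, r, c):
    """8-neighbour local binary pattern code for pixel (r, c)."""
    center = img[r][c]
    # Neighbours in clockwise order starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code  # value in 0..255

def lbp_histogram(img, bins=8):
    """Coarse histogram of LBP codes over all interior pixels."""
    hist = [0] * bins
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c) * bins // 256] += 1
    return hist

# A 4x4 grayscale patch: bright square on a dark background.
patch = [[10, 10, 10, 10],
         [10, 200, 200, 10],
         [10, 200, 200, 10],
         [10, 10, 10, 10]]
print(sum(lbp_histogram(patch)))  # → 4: one code per interior pixel
```

Production systems typically use the rotation-invariant "uniform" LBP variant (e.g. scikit-image's `local_binary_pattern`) rather than this raw-code histogram.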
26. A framework for automated bone age assessment from digital hand radiographs.
- Author
-
Simu, Shreyas and Lal, Shyam
- Subjects
BONE aging ,ARTIFICIAL neural networks ,RADIOGRAPHS ,FEATURE extraction ,GROWTH disorders - Abstract
Bone age assessment (BAA) is a technique that helps predict the age of a person whose age is unavailable, and can also be used to detect growth disorders, if any. The automated bone age assessment system (ABAA) depends heavily on the efficiency of the feature extraction stage and the accuracy of the successive classification stage of the system. This paper presents the implementation and analysis of feature extraction methods such as Bag of Features (BoF), Histogram of Oriented Gradients (HOG), and Texture Feature Analysis (TFA) on segmented phalangeal region of interest (PROI) images and segmented radius-ulna region of interest (RUROI) images. Artificial Neural Networks (ANN) and Random Forest classifiers are used for classification. The experimental results obtained by the BoF method for feature extraction along with Random Forest for classification outperformed preceding techniques available in the literature. The mean error (ME) achieved is 0.58 years with an RMSE of 0.77 years for PROI images, and a mean error of 0.53 years with an RMSE of 0.72 years for RUROI images. Additionally, the results also showed that prior knowledge of the gender of the person gives better results. The dataset contains radiographs of the left hand for an age range of 0-18 years. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
27. Classification of Faults in Multicore Cable via Time–Frequency Domain Reflectometry.
- Author
-
Bang, Su Sik and Shin, Yong-June
- Subjects
ARTIFICIAL neural networks ,REFLECTOMETRY - Abstract
Owing to the increasing complexity of electrical systems, diagnostic techniques for the cables used to connect electrical elements are essential for system maintenance, in order to prevent a failure that could significantly impact the overall electrical system. Multicore structures are typically used as control and instrumentation cables in nuclear power plants, and the failure of the control and instrumentation cables can result in a disaster such as a radiation leak. In this paper, a method for the diagnosis of multicore cables is proposed based on reflectometry. The diagnosis relates to the classification of defective cores in joints, which are among the weakest parts of cable systems. The reflected signals obtained through reflectometry are converted into images by an advanced image processing algorithm, and the images are classified using artificial neural networks. The proposed method is demonstrated by experimental data using a real-world multicore cable. In the experiment, the faults are emulated similar to real-world defects using a potentiometer. It is expected that the proposed technique will enhance the stability and reliability of multicore cable systems. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
28. An enhanced swarm optimization-based deep neural network for diabetic retinopathy classification in fundus images.
- Author
-
Dayana, A. Mary and Emmanuel, W. R. Sam
- Subjects
ARTIFICIAL neural networks ,DIABETIC retinopathy ,FUNDUS oculi ,RETINAL diseases ,FEATURE extraction ,GABOR filters ,LOW vision - Abstract
Diabetic Retinopathy (DR) is one of the long-lasting diabetic retinal disorders that lead to vision impairment and eventually blindness in most of the working-age population. Classifying the severity level of DR has been a great challenge, as the lesion features are hard to analyze. The screening process requires an effective detection method to classify the subtle pathologies of the retina. Deep neural architectures play a vital role in diagnosing eye disease and help ophthalmologists to provide timely treatment. This paper proposes an efficient, optimized deep neural network with a Chronological Tunicate Swarm Algorithm (CTSA) for classifying the severity of DR. Initially, the retinal images captured through low-quality fundus photography are preprocessed and then subjected to the segmentation process. First, the optic disc and the blood vasculature are segmented using a U-Net and a sparse Fuzzy C-means-based hybrid entropy model. The lesion area is then detected using Gabor filter banks, and the features are extracted. The final classification takes place using a deep Stacked Autoencoder (SAE) jointly optimized with a bio-inspired Tunicate Swarm Algorithm based on the chronological concept. The presented model achieved average accuracy, sensitivity, specificity and F1-score values of 95.9%, 88.07%, 96.80% and 85.26% for the DIARETDB0 database and 95.48%, 93.29%, 91.89% and 90.53% for the DIARETDB1 database. The experimental outcome demonstrates the effectiveness and the robustness of the proposed method in the DR classification task. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
29. Fusion of a Static and Dynamic Convolutional Neural Network for Multiview 3D Point Cloud Classification.
- Author
-
Wang, Wenju, Zhou, Haoran, Chen, Gang, and Wang, Xiaolin
- Subjects
DEEP learning ,CONVOLUTIONAL neural networks ,POINT cloud ,ARTIFICIAL neural networks ,FEATURE extraction ,CLASSIFICATION - Abstract
Three-dimensional (3D) point cloud classification methods based on deep learning have good classification performance; however, they adapt poorly to diverse datasets and their classification accuracy must be improved. Therefore, FSDCNet, a neural network model based on the fusion of static and dynamic convolution, is proposed and applied for multiview 3D point cloud classification in this paper. FSDCNet devises a view selection method with fixed and random viewpoints, which effectively avoids the overfitting caused by the traditional fixed viewpoint. A local feature extraction operator of dynamic and static convolution adaptive weight fusion was designed to improve the model's adaptability to different types of datasets. To address the problems of large parameters and high computational complexity associated with the current methods of dynamic convolution, a lightweight and adaptive dynamic convolution operator was developed. In addition, FSDCNet builds a global attention pooling, integrating the most crucial information on different view features to the greatest extent. Due to these characteristics, FSDCNet is more adaptable, can extract more fine-grained detailed information, and can improve the classification accuracy of point cloud data. The proposed method was applied to the ModelNet40 and Sydney Urban Objects datasets. In these experiments, FSDCNet outperformed its counterparts, achieving state-of-the-art point cloud classification accuracy. For the ModelNet40 dataset, the overall accuracy (OA) and average accuracy (AA) of FSDCNet in a single view reached 93.8% and 91.2%, respectively, which were superior to those values for many other methods using 6 and 12 views. FSDCNet obtained the best results for 6 and 12 views, achieving 94.6%, 93.3%, 95.3%, and 93.6% in OA and AA metrics, respectively. 
For the Sydney Urban Objects dataset, FSDCNet achieved an OA and F1 score of 81.2% and 80.1% in a single view, respectively, which were higher than most of the compared methods. In 6 and 12 views, FSDCNet reached an OA of 85.3% and 83.6% and an F1 score of 85.5% and 83.7%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
30. ECG FEATURE EXTRACTION AND PARAMETER EVALUATION FOR DETECTION OF HEART ARRHYTHMIAS.
- Author
-
GANDHAM SREEDEVI and BHUMA ANURADHA
- Subjects
ELECTROCARDIOGRAPHY ,ARRHYTHMIA diagnosis ,HEART beat ,HEART diseases ,ARTIFICIAL neural networks - Abstract
ECG analysis continues to play a vital role in the primary diagnosis and prognosis of cardiac ailments. This paper presents a new approach to classification of ECG signals based on feature extraction and an Artificial Neural Network (ANN) using the Discrete Wavelet Transform (DWT). Nineteen ECG signals from the MIT-BIH database were used to test the performance of the proposed method. A sensitivity of 97.12% and a positive predictivity of 94.37% were reported in this test for QRS complex detection. The arrhythmias detected were bradycardia, tachycardia, premature ventricular contraction, supraventricular tachycardia, and myocardial infarction. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
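The DWT step underlying the ECG approach above can be sketched with a single level of the Haar wavelet (a minimal illustration, not the paper's wavelet choice or full pipeline): the detail band concentrates the sharp transitions from which QRS features are typically taken.

```python
import math

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approximation, detail) coefficient lists; sharp events
    in the input appear as large-magnitude detail coefficients.
    """
    assert len(signal) % 2 == 0
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

# A toy beat: flat baseline with one sharp spike (an idealized QRS).
beat = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 0.0, 0.0]
approx, detail = haar_dwt(beat)
peak = max(range(len(detail)), key=lambda i: abs(detail[i]))
print(peak)  # → 2: the spike lands in the third detail coefficient
```

Real ECG pipelines usually apply several decomposition levels with a smoother wavelet (e.g. Daubechies via PyWavelets) before thresholding for QRS detection.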
31. Real Time Feature Extraction Deep-CNN for Mask Detection.
- Author
-
Hosni Mahmoud, Hanan A., Alghamdi, Norah S., and Alharbi, Amal H.
- Subjects
FEATURE extraction ,ARTIFICIAL neural networks ,OBJECT recognition (Computer vision) ,COVID-19 ,CONVOLUTIONAL neural networks ,PUBLIC spaces ,SIGNAL convolution - Abstract
The COVID-19 pandemic outbreak became one of the most serious threats to humans. As there is no cure yet for this virus, we have to control the spread of Coronavirus through precautions. One of the effective precautions announced by the World Health Organization is mask wearing. Surveillance systems in crowded places can detect people wearing masks, so computerized mask detection methods that can operate in real time are highly urgent. For now, most countries demand mask-wearing in public places to avoid the spread of this virus. In this paper, we present an object detection technique using a single camera, which performs real-time mask detection in closed places. Our contributions are as follows: 1) presenting a real-time feature extraction module to improve the detection computational time; 2) enhancing the extracted features learned from the deep convolutional neural network models to improve small-object detection. The proposed model is a lightweight backbone CNN which ensures real-time mask detection. The accuracy is also enhanced by utilizing the feature enhancement module after some of the convolution layers in the CNN. We performed extensive experiments comparing our model to the single-shot detector (SSD) and YoloV3 neural network models, which are the state-of-the-art models in the literature. The comparison shows that our proposed model achieves 95.9% accuracy, which is 21% higher than SSD and 17.7% higher than YoloV3. We also conducted experiments testing the mask detection speed. It was found that our model achieves an average detection time of 0.85 s for images of size 1024 × 1024 pixels, which is better than the speed achieved by SSD but slightly less than the speed of YoloV3. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
32. Automatic segmentation and melanoma detection based on color and texture features in dermoscopic images.
- Author
-
Oukil, S., Kasmi, R., Mokrani, K., and García‐Zapirain, B.
- Subjects
FEATURE extraction ,ARTIFICIAL neural networks ,DERMOSCOPY ,COMPUTER-aided diagnosis ,MELANOMA ,K-nearest neighbor classification ,SUPPORT vector machines ,IMAGE segmentation - Abstract
Purpose: Melanoma is known as the most aggressive form of skin cancer and one of the fastest growing malignant tumors worldwide. Several computer‐aided diagnosis systems for melanoma have been proposed; still, the algorithms encounter difficulties in the early stage of lesions. This paper aims to discriminate melanoma from benign skin lesions in dermoscopic images. Methods: The proposed algorithm is based on the color and texture of skin lesions, introducing a novel feature extraction technique. The algorithm uses an automatic segmentation based on k‐means, generating a fairly accurate mask for each lesion. The feature extraction consists of existing and novel color and texture attributes measuring how color and texture vary inside the lesion. To find the optimal results, all the attributes are extracted from lesions in five different color spaces (RGB, HSV, Lab, XYZ, and YCbCr) and used as the inputs for three classifiers (K-nearest neighbors, support vector machine, and artificial neural network). Results: The PH2 set is used to assess the performance of the proposed algorithm. The results of our algorithm are compared to the results of published articles that used the same dataset, showing that the proposed method outperforms the state of the art by attaining a sensitivity of 99.25%, specificity of 99.58%, and accuracy of 99.51%. Conclusion: The final results show that color combined with texture provides powerful and relevant attributes for melanoma detection, with improvement over the state of the art. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
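The multi-color-space feature extraction described above can be sketched with Python's standard library (an illustrative subset only: RGB and HSV per-channel statistics; the Lab, XYZ, and YCbCr spaces and texture attributes are omitted).

```python
import colorsys
import statistics

def color_features(pixels):
    """Mean and population stdev per channel in RGB and HSV.

    `pixels` is a list of (r, g, b) tuples with channels in [0, 1];
    a stand-in for the pixels inside a segmented lesion mask.
    """
    feats = []
    for space in ("rgb", "hsv"):
        if space == "rgb":
            chans = list(zip(*pixels))
        else:
            chans = list(zip(*(colorsys.rgb_to_hsv(*p) for p in pixels)))
        for ch in chans:
            feats.append(statistics.mean(ch))
            feats.append(statistics.pstdev(ch))
    return feats

# Four lesion pixels (brownish tones), channels normalized to [0, 1].
lesion = [(0.4, 0.2, 0.1), (0.5, 0.25, 0.1), (0.45, 0.2, 0.15), (0.5, 0.3, 0.1)]
f = color_features(lesion)
print(len(f))  # → 12: 2 spaces x 3 channels x 2 statistics
```

The feature vector would then feed a classifier such as KNN or an SVM, as in the paper's pipeline; the specific statistics here are placeholder assumptions.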
33. Full-Waveform Airborne LiDAR Data Classification Using Convolutional Neural Networks.
- Author
-
Zorzi, Stefano, Maset, Eleonora, Fusiello, Andrea, and Crosilla, Fabio
- Subjects
ARTIFICIAL neural networks ,LIDAR ,OPTICAL radar ,OPTICAL scanners ,SURFACE emitting lasers ,ELECTRONIC data processing ,CLASSIFICATION - Abstract
Point cloud classification is one of the most important and time-consuming stages of airborne LiDAR (Light Detection and Ranging) data processing, playing a key role in the generation of cartographic products. This paper describes an innovative algorithm to perform LiDAR point-cloud classification, which relies on Convolutional Neural Networks (CNNs) and takes advantage of full-waveform data registered by modern laser scanners. The proposed method consists of two steps. First, a simple CNN is used to preprocess each waveform, providing a compact representation of the data. By exploiting the coordinates of the points associated with the waveforms, output vectors generated by the first CNN are then mapped into an image that is subsequently segmented by a Fully Convolutional Network (FCN): a label is assigned to each pixel and, consequently, to the point falling in the pixel. In this way, spatial positions and geometrical relationships between neighboring data are taken into account. These architectures allow accurate identification of even challenging classes such as power lines and transmission towers. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
34. Convolutional sparse coding‐based deep random vector functional link network for distress classification of road structures.
- Author
-
Maeda, Keisuke, Takahashi, Sho, Ogawa, Takahiro, and Haseyama, Miki
- Subjects
FEATURE extraction ,VISUAL training ,CLASSIFICATION ,ARTIFICIAL neural networks - Abstract
This paper presents a convolutional sparse coding (CSC)‐based deep random vector functional link network (CSDRN) for distress classification of road structures. The main contribution of this paper is the introduction of CSC into a feature extraction scheme in the distress classification. CSC can extract visual features representing characteristics of target images because it can successfully estimate optimal convolutional dictionary filters and sparse features as visual features by training from a small number of distress images. The optimal dictionaries trained from distress images have basic components of visual characteristics such as edge and line information of distress images. Furthermore, sparse feature maps estimated on the basis of the dictionaries represent both strength of the basic components and location information of regions having their components, and these maps can represent distress images. That is, sparse feature maps can extract key components from distress images that have diverse visual characteristics. Therefore, CSC‐based feature extraction is effective for training from a limited number of distress images that have diverse visual characteristics. The construction of a novel neural network, CSDRN, by the use of a combination of CSC‐based feature extraction and the DRN classifier, which can also be trained from a small dataset, is shown in this paper. Accurate distress classification is realized via the CSDRN. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
35. India: Intruder Node Detection and Isolation Action in Mobile Ad Hoc Networks Using Feature Optimization and Classification Approach.
- Author
-
Kavitha, T., Geetha, K., and Muthaiah, R.
- Subjects
WIRELESS communications equipment ,ALGORITHMS ,ARTIFICIAL intelligence ,CLUSTER analysis (Statistics) ,COMPUTER networks ,COMPUTERS ,INFORMATION storage & retrieval systems ,INFORMATION technology ,ARTIFICIAL neural networks ,CELL phones ,DATA security - Abstract
Due to the lack of a central administrator in mobile ad hoc networks, the security of the network becomes a serious issue. During malicious attacks, the severity of the threat varies according to the intruder's motivation; it may lead to loss of data, energy or throughput. This paper proposes a lightweight Intruder Node Detection and Isolation Action mechanism (INDIA) using feature extraction, feature optimization and classification techniques. The indirect and direct trust features are extracted from each node, and the total trust feature is computed by combining them. The trust features are extracted from each node of the MANET and optimized using the Particle Swarm Optimization (PSO) algorithm as the feature optimization technique. These optimized feature sets are then classified using a Neural Network (NN) classifier, which identifies the intruder node. The performance of the proposed methodology is studied in terms of various parameters such as success rate in packet delivery, delay in communication, and the amount of energy consumed for identifying and isolating the intruder. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
36. Scene Classification Using Hierarchical Wasserstein CNN.
- Author
-
Liu, Yishu, Suen, Ching Y., Liu, Yingbin, and Ding, Liwang
- Subjects
ARTIFICIAL neural networks ,ANALYTICAL solutions ,COST functions ,CLASSIFICATION ,REMOTE sensing - Abstract
In multiclass classification, convolutional neural network (CNN) is generally coupled with the cross-entropy (CE) loss, which only penalizes the predicted probability corresponding to a ground truth class and ignores the interclass relationship. We argue that CNN can be improved by using a better loss function. On the other hand, the Wasserstein distance (WD) is a well-known metric used to measure the distance between two distributions. Directly solving the WD problem requires a prohibitively large amount of computation time, whereas the cheaper iterative algorithms have a variety of shortcomings such as computational instability and difficulty in selecting parameters. In this paper, we address these issues by giving an analytical solution to the WD problem—for the first time, we find that for two distributions in hierarchically organized data space, WD has a closed-form solution, which we call “hierarchical WD (HWD).” We use this theory to construct novel loss functions that overcome the shortcomings of CE loss. To this end, multi-CNN information fusion that provides the basis for building category hierarchies is carried out first. Then, the semantic relationship among classes is modeled as a binary tree. Then, CNN coupled with an HWD-based loss, i.e., hierarchical Wasserstein CNN (HW-CNN), is trained to learn deep features. In this way, prior knowledge about the interclass relationship is embedded into HW-CNN, and information from several CNNs provides guidance in the process of training individual HW-CNNs. We conducted extensive experiments over two publicly available remote sensing data sets and achieved a state-of-the-art performance in scene classification tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
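The closed-form result the abstract describes can be illustrated with the standard tree formulation of the 1-Wasserstein distance: when ground distances come from a tree (such as a binary category hierarchy), the optimal transport cost reduces to a single pass over the edges. The sketch below is a generic illustration of this idea under an assumed node-numbering convention, not the authors' actual HWD loss.

```python
import numpy as np

def tree_wasserstein(parent, edge_len, p, q):
    """1-Wasserstein distance between two distributions on a tree.

    parent[i] is the parent of node i (parent[0] == -1 for the root;
    children are assumed to have larger indices than their parents).
    p and q assign probability mass to nodes (typically only leaves).
    On a tree the optimal transport cost has a closed form: the sum over
    edges of edge length times |net mass in the subtree below the edge|.
    """
    diff = np.asarray(p, dtype=float) - np.asarray(q, dtype=float)
    cost = 0.0
    for i in range(len(parent) - 1, 0, -1):  # children before parents
        cost += edge_len[i] * abs(diff[i])   # mass that must cross this edge
        diff[parent[i]] += diff[i]           # fold subtree mass into parent
    return cost
```

For a root with two unit-length child edges, moving all mass from one leaf to the other costs 2, the path length between them.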
37. On fusing the latent deep CNN feature for image classification.
- Author
-
Liu, Xueliang, Zhang, Rongjie, Meng, Zhijun, Hong, Richang, and Liu, Guangcan
- Subjects
CONVOLUTIONAL neural networks ,ARTIFICIAL neural networks ,EXTRACTION techniques ,SUPERVISED learning ,CLASSIFICATION ,FUSIFORM gyrus ,CLASSIFICATION algorithms ,FEATURE extraction - Abstract
Image classification, which aims at assigning a semantic category to images, has been extensively studied during the past few years. More recently, convolutional neural networks have arisen and achieved very promising results. Compared with traditional feature extraction techniques (e.g., SIFT, HOG, GIST), a convolutional neural network can extract features from images automatically and does not need hand-designed features. However, how to further improve classification algorithms remains a challenge in academic research. The latest research on CNNs shows that the features extracted from middle layers are representative, which suggests a possible way to improve classification accuracy. Based on this observation, in this paper we propose a method that fuses the latent features extracted from the middle layers of a CNN to train a more robust classifier. First, we utilize pretrained CNN models to extract visual features from the middle layers. Then, we use supervised learning to train a classifier for each feature respectively. Finally, we use a late fusion strategy to combine the predictions of these classifiers. We evaluate the proposal with different classification methods on several image benchmarks, and the results demonstrate that the proposed method improves performance effectively. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
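The late-fusion step described above—combining the predictions of per-layer classifiers—can be sketched as a simple average of class-probability outputs. The paper may use a different combination rule, so the averaging here is an assumption.

```python
import numpy as np

def late_fusion(prob_list):
    """Average the (n_samples, n_classes) probability outputs of several
    classifiers, one per middle-layer feature, and predict the argmax."""
    fused = np.mean(np.stack(prob_list, axis=0), axis=0)
    return fused.argmax(axis=1)
```

With two classifiers that disagree, the fused probabilities decide each sample by the stronger overall evidence.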
38. Multiscaled Fusion of Deep Convolutional Neural Networks for Screening Atrial Fibrillation From Single Lead Short ECG Recordings.
- Author
-
Fan, Xiaomao, Yao, Qihang, Cai, Yunpeng, Miao, Fen, Sun, Fangmin, and Li, Ye
- Subjects
ATRIAL fibrillation ,ARRHYTHMIA ,OLDER people ,ARTIFICIAL neural networks ,ELECTROCARDIOGRAPHY ,STROKE patients - Abstract
Atrial fibrillation (AF) is one of the most common sustained chronic cardiac arrhythmias in the elderly population, associated with high mortality and morbidity from stroke, heart failure, coronary artery disease, systemic thromboembolism, etc. Early detection of AF is necessary to avert the possibility of disability or mortality. However, AF detection remains problematic due to its episodic pattern. In this paper, a multiscaled fusion of deep convolutional neural networks (MS-CNN) is proposed to screen AF recordings out of single-lead short electrocardiogram (ECG) recordings. The MS-CNN employs a two-stream convolutional network architecture with different filter sizes to capture features of different scales. The experimental results show that the proposed MS-CNN achieves 96.99% classification accuracy on ECG recordings cropped/padded to 5 s. Notably, the best classification accuracy, 98.13%, is obtained on ECG recordings of 20 s. Compared with an artificial neural network, a shallow single-stream CNN, and the Visual Geometry Group (VGG) network, the MS-CNN achieves better classification performance. Meanwhile, visualization of the learned features from the MS-CNN demonstrates its superiority in extracting linearly separable ECG features without hand-crafted feature engineering. The excellent AF screening performance of the MS-CNN makes it suitable for daily monitoring of the elderly with wearable devices. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
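The two-stream idea—parallel convolution branches with different filter sizes whose pooled outputs are concatenated—can be sketched in one dimension. The kernels below are placeholders, not the trained MS-CNN filters.

```python
import numpy as np

def two_stream_features(signal, kernel_small, kernel_large):
    """Run one ECG segment through two 1-D conv 'streams' with different
    filter sizes, apply ReLU and global max pooling to each, and
    concatenate the results into a multi-scale feature vector."""
    feats = []
    for kernel in (kernel_small, kernel_large):
        fmap = np.convolve(signal, kernel, mode="valid")  # one conv stream
        feats.append(np.maximum(fmap, 0.0).max())         # ReLU + max pool
    return np.array(feats)
```

The small-kernel stream reacts to short, sharp deflections; the large-kernel stream integrates over longer rhythm-scale structure.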
39. Using a Convolutional Neural Network for Machine Written Character Recognition.
- Author
-
Karrach, Ladislav and Pivarčiová, Elena
- Subjects
- *
CONVOLUTIONAL neural networks , *ARTIFICIAL neural networks , *COMPUTER vision , *IMAGE recognition (Computer vision) , *OBJECT recognition (Computer vision) , *PATTERN recognition systems - Abstract
Convolutional neural networks are special types of artificial neural networks that can solve various tasks in computer vision, such as image classification, object detection, and general recognition. The paper presents the basic building blocks of convolutional neural networks and their architecture, and compares their recognition accuracy with other character recognition techniques using the example of character recognition from vehicle registration plates. The purpose of the experiments was to determine the optimal configuration of the convolutional neural network and the influence of the size and design method of the training set on the recognition rate. The study shows that although convolutional neural networks have recently gained attention, traditional recognition methods are still relevant, and the choice of the right classifier and its configuration depends on the type of recognition task. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
40. Ensemble classification for intrusion detection via feature extraction based on deep Learning.
- Author
-
Yousefnezhad, Maryam, Hamidzadeh, Javad, and Aliannejadi, Mohammad
- Subjects
DEEP learning ,FEATURE extraction ,ARTIFICIAL neural networks ,SUPPORT vector machines ,ALGORITHMS ,DECISION trees - Abstract
An intrusion detection system is a security system that aims to detect sabotage of and intrusions into networks in order to inform experts of attacks and abuse of the network. Different classification methods have been used in intrusion detection systems, such as fuzzy methods, genetic algorithms, decision trees, artificial neural networks, and support vector machines. Moreover, ensemble classifiers have shown more robust and effective performance for various tasks in the field. In this paper, we adopt ensemble models in order to improve the performance of intrusion detection and, at the same time, decrease the false alarm rate. We use kNN for multi-class classification, as well as SVM to approach the classification problem in normal-based detection. In order to combine multiple outputs, we use the Dempster–Shafer method, which allows uncertainty to be retrieved explicitly. Moreover, we utilize deep learning to extract features for training from samples selected by an ensemble-margin-based sample selection algorithm. We compare our results with state-of-the-art methods on benchmark datasets such as UNSW-NB15, CICIDS2017, and NSL-KDD. Our proposed method shows superiority in terms of the prominent metrics accuracy, precision, recall, and F-measure. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
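Dempster's rule of combination, which the abstract uses to merge classifier outputs while keeping uncertainty explicit, can be sketched for two mass functions. The hypothesis names below are illustrative, not taken from the paper.

```python
def dempster_combine(m1, m2):
    """Combine two Dempster-Shafer mass functions.

    m1, m2 map frozenset hypotheses to masses summing to 1. Products of
    masses on intersecting sets are accumulated; mass on conflicting
    (disjoint) pairs is discarded and the remainder renormalized.
    """
    combined, conflict = {}, 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    return {h: v / (1.0 - conflict) for h, v in combined.items()}
```

Mass left on the full frame (e.g. {normal, attack}) after combination is the explicitly retrievable uncertainty the abstract refers to.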
41. Interpretable CNNs for Object Classification.
- Author
-
Zhang, Quanshi, Wang, Xin, Wu, Ying Nian, Zhou, Huilin, and Zhu, Song-Chun
- Subjects
KNOWLEDGE representation (Information theory) ,CONVOLUTIONAL neural networks ,DEEP learning ,CLASSIFICATION ,ARTIFICIAL neural networks - Abstract
This paper proposes a generic method to learn interpretable convolutional filters in a deep convolutional neural network (CNN) for object classification, where each interpretable filter encodes features of a specific object part. Our method does not require additional annotations of object parts or textures for supervision. Instead, we use the same training data as traditional CNNs. Our method automatically assigns each interpretable filter in a high conv-layer with an object part of a certain category during the learning process. Such explicit knowledge representations in conv-layers of the CNN help people clarify the logic encoded in the CNN, i.e., answering what patterns the CNN extracts from an input image and uses for prediction. We have tested our method using different benchmark CNNs with various architectures to demonstrate the broad applicability of our method. Experiments have shown that our interpretable filters are much more semantically meaningful than traditional filters. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
42. Classification of Mixed-Type Defect Patterns in Wafer Bin Maps Using Convolutional Neural Networks.
- Author
-
Kyeong, Kiryong and Kim, Heeyoung
- Subjects
SEMICONDUCTOR wafers testing ,ARTIFICIAL neural networks ,POINT defects ,COMPUTER simulation ,RANDOM noise theory - Abstract
In semiconductor manufacturing, a wafer bin map (WBM) represents the results of wafer testing for dies using a binary pass or fail value. For WBMs, defective dies are often clustered into groups of local systematic defects. Determining their specific patterns is important, because different patterns are related to different root causes of failure. Recently, because wafer sizes have increased and the process technology has become more complicated, the probability of observing mixed-type defect patterns, i.e., two or more defect patterns in a single wafer, has increased. In this paper, we propose the use of convolutional neural networks (CNNs) to classify mixed-type defect patterns in WBMs in the framework of an individual classification model for each defect pattern. Through simulated and real data examples, we show that the CNN is robust to random noise and performs effectively, even if there are many random defects in WBMs. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
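The "individual classification model for each defect pattern" framing above amounts to multi-label classification: each pattern's CNN scores the wafer bin map independently, and every pattern whose score clears a threshold is reported. A minimal sketch, with made-up pattern names and scores standing in for the trained CNN outputs:

```python
def detect_patterns(wafer_scores, threshold=0.5):
    """wafer_scores maps a defect-pattern name to that pattern's own
    classifier output; a wafer can carry any subset of patterns, which
    is what makes mixed-type defects representable."""
    return sorted(p for p, s in wafer_scores.items() if s >= threshold)
```

A wafer scoring {'ring': 0.9, 'scratch': 0.7, 'center': 0.1} would be labeled with the mixed pattern ['ring', 'scratch'].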
43. Supervised Deep Feature Extraction for Hyperspectral Image Classification.
- Author
-
Liu, Bing, Yu, Xuchu, Zhang, Pengqiang, Yu, Anzhu, Fu, Qiongying, and Wei, Xiangpo
- Subjects
HYPERSPECTRAL imaging systems ,SIGNAL convolution ,ARTIFICIAL neural networks ,SUPPORT vector machines ,CLASSIFICATION - Abstract
Hyperspectral image classification has become a research focus in recent literature. However, the design of effective features remains an open issue that impacts classifier performance. In this paper, a novel supervised deep feature extraction method based on a siamese convolutional neural network (S-CNN) is proposed to improve the performance of hyperspectral image classification. First, a CNN with five layers is designed to extract deep features directly from the hyperspectral cube, where the CNN can be regarded as a nonlinear transformation function. Then, a siamese network composed of two CNNs is trained to learn features that show low intraclass and high interclass variability. An important characteristic of the presented approach is that the S-CNN is supervised with a margin ranking loss function, which can extract more discriminative features for classification tasks. To demonstrate the effectiveness of the proposed feature extraction method, the features extracted from three widely used hyperspectral data sets are fed into a linear support vector machine (SVM) classifier. The experimental results demonstrate that the proposed feature extraction method in conjunction with a linear SVM classifier obtains better classification performance than conventional methods. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
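A margin ranking loss of the kind the abstract mentions pushes features of a matching pair closer together than those of a non-matching pair by at least a margin. The exact formulation in the paper may differ, so this triplet-style version is an assumption.

```python
import numpy as np

def margin_ranking_loss(anchor, positive, negative, margin=1.0):
    """Hinge on the gap between the matched-pair distance and the
    mismatched-pair distance in feature space."""
    d_pos = np.linalg.norm(anchor - positive)  # same-class distance
    d_neg = np.linalg.norm(anchor - negative)  # different-class distance
    return max(0.0, margin + d_pos - d_neg)    # zero once d_neg > d_pos + margin
```

Minimizing this loss is what produces the low intraclass and high interclass variability the abstract describes.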
44. Object Detection Based on Fast/Faster RCNN Employing Fully Convolutional Architectures.
- Author
-
Ren, Yun, Zhu, Changren, and Xiao, Shunping
- Subjects
FEATURE extraction ,PHYSICS experiments ,CLASSIFICATION ,DATA protection ,ARTIFICIAL neural networks - Abstract
Modern object detectors, like traditional ones, consist of two major parts: a feature extractor and a feature classifier. Deeper and wider convolutional architectures are currently adopted as the feature extractor. However, many notable object detection systems such as Fast/Faster RCNN use only simple fully connected layers as the feature classifier. In this paper, we argue that elaborately designing deep convolutional networks (ConvNets) of various depths for feature classification is beneficial for detection performance, especially when using fully convolutional architectures. In addition, this paper demonstrates how to employ fully convolutional architectures in Fast/Faster RCNN. Experimental results show that a classifier based on convolutional layers is more effective for object detection than one based on fully connected layers, and that better detection performance can be achieved by employing deeper ConvNets as the feature classifier. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
45. A survey of neural network based automated systems for human chromosome classification.
- Author
-
Abid, Faroudja and Hamami, Latifa
- Subjects
ARTIFICIAL neural networks ,HUMAN chromosome abnormalities ,HUMAN cytogenetics ,FEATURE extraction ,FEATURE selection - Abstract
Chromosome classification and karyotype establishment are important procedures for diagnosing genetic diseases. Various computer-aided systems have been developed to automate this tedious and time-consuming task, which is performed manually in most cytogenetic laboratories. This paper provides a comprehensive review of past and recent research in the area of automatic chromosome classification systems. We start by reviewing methods for feature extraction, followed by a survey of neural network based chromosome classifiers. We summarize the various techniques and methods in this area of research and discuss important issues and outcomes within each study for both chromosome feature extraction and classification. Although ANN-based chromosome classifiers are the main topic of this survey, a number of classifiers based on other algorithms are also presented to give an overall idea of additional techniques employed in chromosome classification. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
46. Learning Subspace-Based RBFNN Using Coevolutionary Algorithm for Complex Classification Tasks.
- Author
-
Tian, Jin, Li, Minqiang, Chen, Fuzan, and Feng, Nan
- Subjects
MACHINE learning ,SUBSPACES (Mathematics) ,ARTIFICIAL neural networks ,EVOLUTIONARY algorithms ,FEATURE extraction - Abstract
Many real-world classification problems are characterized by samples of a complex distribution in the input space. The classification accuracy is determined by intrinsic properties of all samples in subspaces of features. This paper proposes a novel algorithm for the construction of radial basis function neural network (RBFNN) classifier based on subspace learning. In this paper, feature subspaces are obtained for every hidden node of the RBFNN during the learning process. The connection weights between the input layer and the hidden layer are adjusted to produce various subspaces with dominative features for different hidden nodes. The network structure and dominative features are encoded in two subpopulations that are cooperatively coevolved using the coevolutionary algorithm to achieve a better global optimality for the estimated RBFNN. Experimental results illustrate that the proposed algorithm is able to obtain RBFNN models with both better classification accuracy and simpler network structure when compared with other learning algorithms. Thus, the proposed model provides a more flexible and efficient approach to complex classification tasks by employing the local characteristics of samples in subspaces. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
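The subspace idea above—input-to-hidden weights that emphasize a different dominative feature subset for each hidden node—can be sketched as a weighted Gaussian RBF forward pass. The weights here are illustrative stand-ins; the paper learns them with a coevolutionary algorithm.

```python
import numpy as np

def rbfnn_forward(x, centers, widths, feat_w, out_w):
    """RBFNN forward pass where feat_w[j] rescales the input features for
    hidden node j, so each node responds within its own feature subspace
    (a zero weight drops a feature from that node's subspace entirely)."""
    acts = np.array([
        np.exp(-np.sum((w * (x - c)) ** 2) / (2.0 * s ** 2))
        for c, s, w in zip(centers, widths, feat_w)
    ])
    return out_w @ acts  # linear output layer
```

A node whose feature weight for some dimension is zero is completely insensitive to that dimension, which is how per-node subspaces arise.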
47. Eye Detection-Based Deep Belief Neural Networks and Speeded-Up Robust Feature Algorithm.
- Author
-
Tarek, Zahraa, Shohieb, Samaa M., Elhady, Abdelghafar M., El-kenawy, El-Sayed M., and Shams, Mahmoud Y.
- Subjects
RETINAL imaging ,FACIAL expression ,BIOMETRIC eye scanning systems ,BIOMETRIC identification ,ARTIFICIAL neural networks - Abstract
The ability to detect and localize the human eye is critical for security applications and human identification and verification systems, because eye recognition algorithms face multiple challenges, such as multi-pose variations, ocular parts, and illumination. Moreover, modern security applications fail to detect facial expressions from eye images. In this paper, the Speeded-Up Robust Features (SURF) algorithm was utilized to localize the face images of the enrolled subjects. We focused on detecting the eye and pupil parts based on SURF, the Hough Circle Transform (HCT), and Local Binary Patterns (LBP). Afterward, Deep Belief Neural Networks (DBNN) were used to classify the features resulting from the SURF algorithm. We further determined the correctly and wrongly classified subjects using a confusion matrix with two class labels, classifying people whose eye images were correctly detected. We applied the Stochastic Gradient Descent (SGD) optimizer to address the overfitting problem, and the hyper-parameters were fine-tuned for the applied DBNN. The proposed system, based on SURF, LBP, and the DBNN classifier, achieved an accuracy of 95.54% on the ORL dataset, 94.07% on BioID, and 96.20% on the CASIA-V5 dataset. The proposed approach is more reliable and more advanced than state-of-the-art algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. Fall Detection With UWB Radars and CNN-LSTM Architecture.
- Author
-
Maitre, Julien, Bouchard, Kevin, and Gaboury, Sebastien
- Subjects
ULTRA-wideband radar ,ARTIFICIAL neural networks ,HIP fractures ,CONVOLUTIONAL neural networks ,NECK injuries ,FEMUR neck ,BIOLOGICAL dressings - Abstract
Fall detection is a major challenge for researchers. Indeed, a fall can cause injuries such as femoral neck fracture, brain hemorrhage, or skin burns, leading to significant pain. In some cases, trauma caused by an undetected fall can worsen with time and lead to a painful end of life or even death. One solution is to detect falls efficiently so that somebody (e.g., nurses) can be alerted as quickly as possible. To respond to this need, we propose to detect falls in a real 40-square-meter apartment by exploiting three ultra-wideband radars and a deep neural network model. The deep neural network is composed of a convolutional neural network stacked with a long short-term memory network and a fully connected neural network to identify falls. In other words, the problem addressed in this paper is a binary classification task attempting to differentiate fall and non-fall events. As in real cases, the falls can take different forms; hence, the data to train and test the classification model were generated from falls (four types) simulated by 10 participants in three locations in the apartment. Finally, the training and testing stages were carried out according to three strategies, including the leave-one-subject-out method. The latter method yields the performance of the proposed system in a generalization context. The results are very promising, since we reach almost 90% accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
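The stacked architecture—a CNN front-end per radar frame feeding an LSTM, then a fully connected output—can be sketched end to end in numpy. All shapes, the tiny conv front-end, and the single-unit output are illustrative assumptions, not the authors' trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv_features(frame, kernel):
    """CNN stand-in: 1-D valid convolution + ReLU + global max pooling."""
    fmap = np.convolve(frame, kernel, mode="valid")
    return np.array([np.maximum(fmap, 0.0).max()])

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates are stacked as [input, forget, output, cand]."""
    n = h.size
    z = W @ x + U @ h + b
    i, f, o = sigmoid(z[:n]), sigmoid(z[n:2*n]), sigmoid(z[2*n:3*n])
    g = np.tanh(z[3*n:])
    c = f * c + i * g
    return o * np.tanh(c), c

def fall_score(frames, kernel, W, U, b, w_out):
    """CNN per frame -> LSTM over time -> sigmoid fall probability."""
    n = W.shape[0] // 4
    h, c = np.zeros(n), np.zeros(n)
    for frame in frames:
        x = conv_features(frame, kernel)    # spatial features per frame
        h, c = lstm_step(x, h, c, W, U, b)  # temporal aggregation
    return sigmoid(w_out @ h)               # fully connected output
```

The CNN summarizes each radar frame and the LSTM accumulates those summaries over time, which is what lets the model separate the temporal signature of a fall from everyday motion.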
49. A Survey on Artificial Intelligence in Chinese Sign Language Recognition.
- Author
-
Jiang, Xianwei, Satapathy, Suresh Chandra, Yang, Longxiang, Wang, Shui-Hua, and Zhang, Yu-Dong
- Subjects
SIGN language ,ARTIFICIAL intelligence ,CONVOLUTIONAL neural networks ,CHINESE language ,ARTIFICIAL neural networks ,FEATURE extraction - Abstract
Chinese Sign Language (CSL) offers the main means of communication for the hearing impaired in China. Sign Language Recognition (SLR) can shorten the distance between hearing-impaired and healthy people and help them integrate into society. Therefore, SLR has become the focus of sign language application research. Over the years, the continuous development of new technologies has provided a source and motivation for SLR. This paper aims to cover the most recent approaches in Chinese Sign Language Recognition (CSLR). Through a thorough review of superior methods from 2000 to 2019 in CSLR research, various techniques and algorithms—such as the scale-invariant feature transform, histogram of oriented gradients, wavelet entropy, Hu moment invariants, Fourier descriptors, the gray-level co-occurrence matrix, dynamic time warping, principal component analysis, autoencoders, hidden Markov models (HMM), support vector machines (SVM), random forests, skin color modeling, k-NN, artificial neural networks, convolutional neural networks (CNN), and transfer learning—are discussed in detail, organized around several major stages: data acquisition, preprocessing, feature extraction, and classification. CSLR is summarized along the following aspects: classification and feature extraction methods, accuracy/performance evaluation, and sample size/datasets. The advantages and limitations of different CSLR approaches are compared. It was found that data acquisition is mainly through Kinect and cameras, and that feature extraction focuses on hand shape and spatiotemporal factors while ignoring facial expressions. HMM and SVM are used most for classification. CNNs are becoming more and more popular, and deep neural network based recognition approaches will be the future trend. However, due to the complexity of the contemporary Chinese language, CSLR generally has lower accuracy than other SLR.
It is necessary to establish an appropriate dataset to conduct comparable experiments, and the issue of accuracy decreasing as the dataset grows needs to be resolved. Overall, we hope our study gives a comprehensive presentation for those who are interested in CSLR and SLR and further contributes to future research. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
50. Data augmentation for handwritten digit recognition using generative adversarial networks.
- Author
-
Jha, Ganesh and Cecotti, Hubert
- Subjects
HANDWRITING recognition (Computer science) ,ARTIFICIAL neural networks ,COMPUTER vision ,SUPERVISED learning ,DEEP learning ,FEATURE extraction - Abstract
Supervised learning techniques require labeled examples that can be time-consuming to obtain. In particular, deep learning approaches, where all the feature extraction stages are learned within the artificial neural network, require a large number of labeled examples to train the model. Various data augmentation techniques can be used to overcome this issue by taking advantage of known variations that have no impact on the label of an example. Typical solutions in computer vision and document analysis and recognition are based on geometric transformations (e.g. shift and rotation) and random elastic deformations of the original training examples. In this paper, we consider Generative Adversarial Networks (GAN), a technique that does not require prior knowledge of the possible variabilities that exist across examples to create novel artificial examples. In the case of a training dataset with a low number of labeled examples, described in a high-dimensional space, the classifier may generalize poorly. Therefore, we aim to enrich databases of images or signals to improve classifier performance by designing a GAN for creating artificial images. While adding more images through a GAN can help, the extent to which it will help is unknown, and it may degrade the performance if too many artificial images are added. The approach is tested on four handwritten digit datasets (Latin, Bangla, Devanagari, and Oriya). The accuracy on each dataset shows that adding GAN-generated images to the training dataset improves accuracy. However, the results suggest that adding too many GAN-generated images deteriorates the performance. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
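The augmentation protocol the abstract implies—extend the real training set with generated images, but not by too much—can be sketched as a capped mix. The cap ratio and the generator interface are assumptions for illustration.

```python
import numpy as np

def augment_with_gan(real_x, real_y, gen_x, gen_y, max_ratio=1.0):
    """Append at most max_ratio * len(real_x) GAN-generated examples to the
    real training set; the abstract reports that accuracy improves up to a
    point and degrades when too many artificial images are added."""
    k = min(len(gen_x), int(max_ratio * len(real_x)))
    return (np.concatenate([real_x, gen_x[:k]]),
            np.concatenate([real_y, gen_y[:k]]))
```

In practice, max_ratio would be tuned on a validation set, since the paper finds the helpful range is dataset-dependent.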