756 results
Search Results
2. Enhancing Fine-Grained Image Recognition with Multi-Channel Self-Attention Mechanisms: A Focus on Fruit Fly Species Classification.
- Author
-
Lu, Yu, Yi, Ke, and Xu, Yilu
- Subjects
FRUIT flies, IMAGE recognition (Computer vision), DEEP learning, FEATURE extraction, SHORT-term memory, LONG-term memory, CLASSIFICATION - Abstract
Fruit fly species classification is a fine-grained task, as the differences between species are small. To effectively identify fruit flies and improve their recognition, a fine-grained image-recognition method based on a multi-channel self-attention mechanism was studied, and a network framework for fine-grained image recognition based on deep learning was designed in this paper. In this framework, long short-term memory (LSTM) networks are used to extract the underlying features in fruit fly fine-grained images. By inputting the underlying features into the multi-channel self-attention module, the global and local attention feature maps can be obtained. The weighted attention feature maps can also be obtained by multiplying the weight of each channel by its attention feature map. The fine-grained image features of fruit flies were obtained by summing the weighted attention feature maps. A softmax classifier was used to process the features and complete the recognition of the fruit fly fine-grained images. Two fine-grained image datasets of fruit flies were used as experimental objects: Dataset 1 and Dataset 2 contain 11,778 images and 20,580 images from 46 different categories of fruit flies, respectively. The Kappa coefficient was used as the evaluation index when identifying fruit fly images with different targets using the method proposed herein. The experimental results showed that, as the number of attention channels increased, the Kappa coefficient gradually increased, indicating an improvement in the accuracy of fine-grained image recognition. The fine-grained image features extracted by introducing a multi-channel self-attention mechanism exhibited more distinct boundaries with only a small amount of overlap, demonstrating strong feature extraction capability. When dealing with fine-grained images with either simple or complex backgrounds, the proposed method has good performance and generalization ability. Even if the target is small and varied in shape, it can still achieve highly accurate recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
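The Kappa coefficient used as the evaluation index in the abstract above can be computed as follows. This is a minimal pure-Python sketch, not code from the paper; the function name is illustrative:

```python
def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement between predictions and labels,
    corrected for the agreement expected by chance."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    # observed agreement: fraction of exact matches
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # chance agreement: product of marginal label frequencies per class
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (p_o - p_e) / (1 - p_e)
```

Kappa is 1 for perfect agreement and 0 when agreement is no better than chance, which is why a rising Kappa tracks improving fine-grained recognition accuracy.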
3. The Expansion Methods of Inception and Its Application.
- Author
-
Shi, Cuiping, Liu, Zhenquan, Qu, Jiageng, and Deng, Yuxin
- Subjects
DEEP learning, CONVOLUTIONAL neural networks, FEATURE extraction - Abstract
In recent years, with the rapid development of deep learning technology, a large number of excellent convolutional neural networks (CNNs) have been proposed, many of which are based on improvements to classical methods. Within the Inception family of methods, depthwise separable convolution was applied in Xception to achieve lightweighting, and Inception-ResNet introduced residual connections to accelerate model convergence. However, existing improvements to the Inception module often neglect further enhancement of its receptive field, while increasing the receptive field of CNNs has been widely studied and proven effective in improving classification performance. Motivated by this fact, three effective expansion modules are proposed in this paper. The first, the Inception expand (Inception-e) module, improves classification accuracy by concatenating more and deeper convolutional branches. To reduce the number of parameters of Inception-e, this paper proposes a second expansion module, the Equivalent Inception-e (Eception) module, which is equivalent to Inception-e in terms of feature extraction capability but suppresses the parameter growth brought by the expansion by effectively reducing redundant convolutional layers. On the basis of Eception, this paper proposes a third expansion module, the Lightweight Eception (Lception) module, which interleaves depthwise convolution with ordinary convolution to further reduce the number of parameters. The three proposed modules have been validated on the Cifar10 dataset. The experimental results show that all these extensions are effective in improving the classification accuracy of the models; the most significant effect comes from the Lception module, where Lception (rank = 4) improves accuracy on Cifar10 by 1.5% compared to the baseline model (Inception module A) while using only 0.15 M more parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
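The parameter savings from the depthwise separable convolution mentioned above (Xception-style lightweighting) can be checked with simple arithmetic. A hedged sketch with illustrative function names, not code from the paper:

```python
def conv_params(k, c_in, c_out, bias=True):
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)

def depthwise_separable_params(k, c_in, c_out, bias=True):
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1 x 1 pointwise conv mixing the channels."""
    depthwise = k * k * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise
```

For a 3x3 convolution from 64 to 128 channels the standard form needs 73,728 weights while the depthwise-separable form needs 8,768, which is where the lightweighting comes from.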
4. An appearance quality classification method for Auricularia auricula based on deep learning.
- Author
-
Li, Yang, Hu, Jiajun, Wu, Haiyun, Wei, Yong, Shan, Huiyong, Song, Xin, Hua, Xiuping, Xu, Wei, and Jiang, Yongcheng
- Subjects
DEEP learning, PRIMROSES, CONVOLUTIONAL neural networks, DATA mining, FEATURE extraction, CLASSIFICATION - Abstract
The intelligent appearance quality classification of Auricularia auricula is of great significance for promoting this industry. This paper proposes an appearance quality classification method for Auricularia auricula based on an improved Faster Region-based Convolutional Neural Network (improved Faster RCNN) framework. The original Faster RCNN is improved by establishing a multiscale feature fusion detection model to improve the accuracy and real-time performance of the model. The multiscale feature fusion detection model makes full use of shallow feature information to complete target detection, fusing shallow features, which are rich in detailed information, with deep features, which are rich in semantic information. Since the fusion algorithm directly uses the existing information of the feature extraction network, there is no additional computation, and the fused features contain more of the original detailed feature information. Therefore, the improved Faster RCNN can improve the final detection rate without sacrificing speed. Compared with the original Faster RCNN model, the mean average precision (mAP) of the improved Faster RCNN is increased by 2.13%. The average precision (AP) of first-level Auricularia auricula is almost unchanged at a high level, the AP of second-level Auricularia auricula is increased by nearly 5%, and the AP of third-level Auricularia auricula is increased by 1%. The improved Faster RCNN also raises the frame rate from 6.81 frames per second for the original Faster RCNN to 13.5. Meanwhile, the influence of complex environments and image resolution on Auricularia auricula detection is explored. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
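The shallow/deep feature fusion described in entry 4 can be sketched in miniature: upsample the low-resolution deep map to the shallow map's resolution and combine element-wise. A toy pure-Python illustration under that assumption, not the paper's implementation:

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of lists)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out

def fuse(shallow, deep):
    """Fuse a shallow (high-resolution) map with a deep (low-resolution)
    map by upsampling the deep map and adding element-wise."""
    up = upsample2x(deep)
    return [[s + d for s, d in zip(srow, drow)]
            for srow, drow in zip(shallow, up)]
```

Because the fusion only reuses feature maps the network already computes, it adds detail without extra convolutions, matching the abstract's "no additional computation" claim.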
5. MpoxNet: dual-branch deep residual squeeze and excitation monkeypox classification network with attention mechanism.
- Author
-
Jingbo Sun, Baoxi Yuan, Zhaocheng Sun, Jiajun Zhu, Yuxin Deng, Yi Gong, and Yuhe Chen
- Subjects
MONKEYPOX, COVID-19 pandemic, NOSOLOGY, CLASSIFICATION, FEATURE extraction - Abstract
While the world struggles to recover from the devastation wrought by the global spread of COVID-19, the monkeypox virus has emerged as a new pandemic threat. In this paper, a high-precision and lightweight classification network, MpoxNet, based on ConvNext is proposed to meet the need for fast and safe monkeypox classification. In this method, a dual-branch depth-separable convolution residual Squeeze-and-Excitation module is designed. This design aims to extract more feature information with two branches and greatly reduces the number of parameters in the model by using depth-separable convolution. In addition, our method introduces a convolutional attention module to enhance the extraction of key features within the receptive field. The experimental results show that MpoxNet achieves remarkable results in monkeypox disease classification: the accuracy is 95.28%, the precision is 96.40%, the recall is 93.00%, and the F1-score is 95.80%. This is significantly better than current mainstream classification models. It is worth noting that the FLOPS and the number of parameters of MpoxNet are only 30.68% and 31.87% of those of ConvNext-Tiny, indicating that the model has a small computational burden and low model complexity while performing efficiently. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
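The precision, recall, and F1-score reported above follow the standard definitions from true/false positive and negative counts; a minimal sketch (illustrative names, not code from the paper):

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from prediction counts."""
    precision = tp / (tp + fp)   # of predicted positives, how many are right
    recall = tp / (tp + fn)      # of actual positives, how many are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1
```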
6. IFF-Net: Irregular Feature Fusion Network for Multimodal Remote Sensing Image Classification.
- Author
-
Wang, Huiqing, Wang, Huajun, and Wu, Linfeng
- Subjects
IMAGE recognition (Computer vision), REMOTE sensing, DEEP learning, MULTIMODAL user interfaces, SURFACE of the earth, FEATURE extraction, JUDGMENT (Psychology) - Abstract
In recent years, the classification and identification of Earth's surface materials has been a challenging research topic in the fields of earth science and remote sensing (RS). Although deep learning techniques have achieved some results in remote sensing image classification, challenges remain for multimodal remote sensing data classification, such as information redundancy between multimodal remote sensing images. In this paper, we propose a multimodal remote sensing data classification method based on irregular feature fusion, called IFF-Net. The IFF-Net architecture utilizes weight-shared residual blocks for feature extraction while maintaining independent batch normalization (BN) layers. During the training phase, the redundancy of the current channel is determined by evaluating the judgment factor of the BN layer. If this judgment factor falls below a predefined threshold, the current channel information is considered redundant and is substituted with another channel. Sparse constraints are imposed on some of the judgment factors in order to remove extra channels and enhance generalization. Furthermore, a module for feature normalization and calibration has been devised that leverages the spatial interdependence of multimodal features to achieve improved discrimination. Two standard datasets are used in the experiments to validate the effectiveness of the proposed method. The experimental results show that the IFF-Net method proposed in this paper exhibits significantly superior performance compared to state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
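The BN judgment-factor test described in entry 6 amounts to thresholding the batch-normalization scale factors: a channel whose factor is near zero contributes almost nothing and is flagged as redundant. A hedged one-function sketch (the threshold value and names are illustrative):

```python
def redundant_channels(gammas, threshold):
    """Indices of channels whose BN scale factor (judgment factor)
    falls below the threshold, i.e. channels whose output is nearly
    constant and can be substituted."""
    return [i for i, g in enumerate(gammas) if abs(g) < threshold]
```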
7. TEXT CLASSIFICATION AND CLUSTER ANALYSIS BASED ON DEEP LEARNING AND NATURAL LANGUAGE PROCESSING.
- Author
-
HUA HUANG
- Subjects
DEEP learning, CLUSTER analysis (Statistics), NATURAL language processing, CLASSIFICATION algorithms, FEATURE extraction, CLASSIFICATION - Abstract
At present, the commonly used Bag of Words (BOW) representation ignores the semantic information of text and suffers from the high dimensionality and high sparsity of the extracted features. This paper presents a multi-class text representation and classification algorithm. The approach is based on the vector representation of keywords and takes the multi-category classification problem as its research object. A hybrid deep belief network (HDBN) is then constructed by combining a Deep Belief Network (DBN) with a Deep Boltzmann Machine (DBM). Extensive tests on the algorithm demonstrate its effectiveness. In addition, a 2D visualization experiment is carried out with the HDBN, and the resulting high-level text representation exhibits strong cohesion and weak coupling. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
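The high dimensionality and sparsity of Bag-of-Words features noted above is easy to see in code: a document's vector has one slot per vocabulary term, and almost all slots are zero. A minimal illustrative sketch:

```python
def bow_vector(doc_tokens, vocabulary):
    """Bag-of-words count vector over a fixed vocabulary.
    Word order (and hence most semantics) is discarded."""
    counts = {}
    for tok in doc_tokens:
        counts[tok] = counts.get(tok, 0) + 1
    return [counts.get(term, 0) for term in vocabulary]
```

With a realistic vocabulary of tens of thousands of terms, a short document fills only a handful of slots, which is the sparsity problem the HDBN representation aims to avoid.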
8. Deep learning models for human age prediction to prevent, treat and extend life expectancy: DCPV taxonomy.
- Author
-
Alsadoon, Abeer, Al-Naymat, Ghazi, and Islam, Md Rafiqul
- Abstract
The implementation of Deep Learning (DL) Prediction techniques for Human Age Prediction (HAP) has been widely researched and studied to prevent, treat, and extend life expectancy. While most algorithms rely on facial images, MRI scans, and DNA methylation for training and testing, they are seldom implemented due to a lack of significant validation and evaluation in real-world scenarios, low performance, and technical challenges. To address these issues, this paper proposes the Data, Classification Technique, Prediction, and View (DCPV) taxonomy, which outlines the primary components required to implement and validate a deep learning model for predicting human age. By providing a common baseline for end-users and researchers, this taxonomy offers a clearer view of the constituents of deep learning prediction approaches, enabling the development of similar systems in the health domain. In contrast to existing machine learning methods, the proposed taxonomy emphasizes the value of deep learning practices based on performance, accuracy, and efficiency in predicting human age. To validate the DCPV taxonomy, the study examines 31 state-of-the-art research journal articles within the HAP system domain, assessing the taxonomy's performance, accuracy, robustness, and model comparisons. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Sarcasm Detection: A Review, Synthesis and Future Research Agenda.
- Author
-
Sahu, Geeta Abakash and Hudnurkar, Manoj
- Subjects
DEEP learning, SARCASM, LITERATURE reviews, FEATURE extraction, MACHINE learning, RESEARCH personnel - Abstract
A literature review on sarcasm detection has been undertaken in this research work. For a meaningful study of existing work on sarcasm detection, a total of 65 research papers have been analyzed across diverse aspects such as the datasets utilized, language, pre-processing technique, type of features, feature extraction technique, and machine learning/deep learning-based sarcasm classification. These papers come from diverse international and national journals. Moreover, the performance of each work in terms of accuracy, F-score and recall is also presented. To show the superiority of the works, a comparative evaluation has been undertaken in terms of the analyzed performance of each work, and the works that achieve superior or improved values are highlighted. In addition, the current challenges faced by sarcasm detection systems are portrayed, which will be a milestone for future researchers. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
10. A review of research on micro-expression recognition algorithms based on deep learning
- Author
-
Zhang, Fan and Chai, Lin
- Published
- 2024
- Full Text
- View/download PDF
11. Few-Shot Scene Classification with Attention Mechanism in Remote Sensing.
- Author
-
ZHANG Duona, ZHAO Hongjia, LU Yuanyao, CUI Jian, and ZHANG Baochang
- Subjects
DEEP learning, COMPUTER vision, FEATURE extraction, VISUAL fields, CLASSIFICATION, REMOTE sensing - Abstract
Remote sensing scene classification is a hot research topic in the field of computer vision, and it is of great significance for the semantic understanding of remote sensing images. At present, remote sensing scene classification methods based on deep learning occupy a dominant position in this field. However, these methods suffer from a lack of samples and poor model generalization in actual application scenarios. Therefore, this paper proposes a few-shot remote sensing scene classification method based on an attention mechanism and designs a dual-branch similarity measurement structure. The method uses a meta-learning training strategy to divide the dataset into tasks. Meanwhile, the input images are divided into blocks in order to preserve the feature distribution of the remote sensing image. A lightweight attention module is then introduced into the feature extraction network to reduce the risk of overfitting and ensure the acquisition of discriminative features. Finally, based on the earth mover's distance (EMD), a dual-branch similarity measurement module is added to improve the discriminative ability of the classifier. The results show that, compared with classic few-shot learning methods, the method proposed in this paper significantly improves classification performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
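The earth mover's distance (EMD) underlying the dual-branch similarity module in entry 11 has a closed form in one dimension: for histograms of equal total mass, it equals the L1 distance between their cumulative sums. A hedged 1-D sketch (the paper's module operates on richer feature representations):

```python
def emd_1d(p, q):
    """Earth mover's distance between two 1-D histograms of equal
    total mass: the L1 distance between their cumulative sums."""
    assert abs(sum(p) - sum(q)) < 1e-9, "histograms must have equal mass"
    cum, dist = 0.0, 0.0
    for a, b in zip(p, q):
        cum += a - b        # running surplus of mass to move right
        dist += abs(cum)    # cost of moving that surplus one bin
    return dist
```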
12. Power quality recognition in noisy environment employing deep feature extraction from cross stockwell spectrum time–frequency images.
- Author
-
Chakraborty, Ananya, Chatterjee, Soumya, and Mandal, Ratan
- Subjects
POWER distribution networks, DEEP learning, MACHINE learning, CONVOLUTIONAL neural networks, FALSE discovery rate, ONE-way analysis of variance, FEATURE extraction - Abstract
Automated and accurate detection of power quality (PQ) events is important both for safety and for maintaining the reliability of the power transmission and distribution network. However, detecting multiple PQ events in a noisy environment is a challenging task. Another important issue is the choice of meaningful features, which directly influences the accuracy of PQ detection. Considering these two facts, this paper presents a novel framework for automated classification of PQ signals in a noisy environment employing the cross Stockwell Transform (XST). The XST proposed in this paper has better noise suppression capability than the conventional Stockwell Transform. Here, XST was used to convert 1D PQ signals to 2D time–frequency (T–F) images. To improve the accuracy of PQ detection, an automated feature extraction method employing deep learning is implemented in this work. The noise-free T–F images obtained using XST were fed as inputs to several pre-trained convolutional neural networks (CNNs) for deep feature extraction. A transfer learning technique was implemented to reduce the computational cost. The extracted deep features then underwent selection using a one-way analysis of variance test followed by false discovery rate correction. The statistically significant deep features were subsequently fed to three benchmark machine learning classifiers for classification of PQ signals. In addition, tests were carried out on real-life PQ signals to verify the practicability of the proposed framework. Investigations revealed that the proposed method returned mean accuracies of 99.72% and 96.45% for classification of simulated and real-life PQ signals, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
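The false discovery rate correction applied after the ANOVA test in entry 12 is commonly the Benjamini-Hochberg step-up procedure; a sketch under that assumption (the abstract does not name the exact correction used):

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Indices of features surviving Benjamini-Hochberg FDR correction.
    Sort p-values, find the largest rank r with p_(r) <= alpha*r/m,
    and keep everything up to that rank."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = -1
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff = rank            # step-up: remember the largest passing rank
    return sorted(order[:cutoff]) if cutoff > 0 else []
```

Note the step-up behavior: a borderline p-value can survive because a later, larger rank passes its threshold.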
13. AI-based diagnosis of COVID-19 patients using X-ray scans with stochastic ensemble of CNNs
- Author
-
Balasubramanian Raman, Vinodh J Sahayasheela, Himanshu Buckchash, Vipul Bansal, Narayanan Narayanan, Rahul Kumar, Ridhi Arora, and Ganesh N. Pandian
- Subjects
Coronavirus disease 2019 (COVID-19), Computer science, Gaussian, Feature vector, Feature extraction, Biomedical Engineering, Biophysics, Image processing, World health, X-ray, Machine learning, Humans, Radiology, Nuclear Medicine and Imaging, Instrumentation, Radiological and Ultrasound Technology, SARS-CoV-2, X-Rays, Deep learning, COVID-19, Pattern recognition, Classification, Neural Networks (Computer), Artificial intelligence, Algorithms, Latent vector, Biotechnology - Abstract
According to the World Health Organization (WHO), the novel coronavirus disease (COVID-19) is infectious and has a significant social and economic impact. The main challenge in fighting this disease is its scale: due to the outbreak, medical facilities are under pressure from case numbers. A quick diagnosis system is required to address these challenges. To this end, a stochastic deep learning model is proposed. The main idea is to constrain the deep representations over a Gaussian prior to reinforce discriminability in the feature space. The model can work on chest X-ray or CT-scan images. It provides a fast diagnosis of COVID-19 and can scale seamlessly. The work presents a comprehensive evaluation of previously proposed approaches for X-ray-based disease diagnosis. The approach works by learning a latent space over the X-ray image distribution from an ensemble of state-of-the-art convolutional nets, and then linearly regressing the predictions from an ensemble of classifiers which take the latent vector as input. We experimented with publicly available datasets having three classes: COVID-19, normal and pneumonia, yielding an overall accuracy and AUC of 0.91 and 0.97, respectively. Moreover, for robust evaluation, experiments were performed on a large chest X-ray dataset to classify among the Atelectasis, Effusion, Infiltration, Nodule, and Pneumonia classes. The results demonstrate that the proposed model has a better understanding of the X-ray images, which makes the network generic enough to be used later in other domains of medical image analysis.
- Published
- 2021
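Entry 13 combines an ensemble of classifiers by linearly regressing their predictions; a simpler soft-voting variant, averaging class-probability vectors and taking the argmax, illustrates the ensembling idea. This is a hedged sketch, not the paper's regression-based combiner:

```python
def soft_vote(prob_sets):
    """Average class-probability vectors from an ensemble of classifiers
    and return the index of the winning class."""
    n = len(prob_sets)          # number of classifiers
    k = len(prob_sets[0])       # number of classes
    avg = [sum(p[j] for p in prob_sets) / n for j in range(k)]
    return max(range(k), key=lambda j: avg[j])
```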
14. An efficient deep learning model for classification of thermal face images.
- Author
-
Abd El-Rahiem, Basma, Sedik, Ahmed, El Banby, Ghada M., Ibrahem, Hani M., Amin, Mohamed, Song, Oh-Young, Khalaf, Ashraf A. M., and Abd El-Samie, Fathi E.
- Subjects
DEEP learning, THERMOGRAPHY, CONVOLUTIONAL neural networks, FEATURE extraction, DATABASES - Abstract
Purpose: The objective of this paper is to perform infrared (IR) face recognition efficiently with convolutional neural networks (CNNs). The proposed model in this paper has several advantages such as the automatic feature extraction using convolutional and pooling layers and the ability to distinguish between faces without visual details. Design/methodology/approach: A model which comprises five convolutional layers in addition to five max-pooling layers is introduced for the recognition of IR faces. Findings: The experimental results and analysis reveal high recognition rates of IR faces with the proposed model. Originality/value: A designed CNN model is presented for IR face recognition. Both the feature extraction and classification tasks are incorporated into this model. The problems of low contrast and absence of details in IR images are overcome with the proposed model. The recognition accuracy reaches 100% in experiments on the Terravic Facial IR Database (TFIRDB). [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
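Each max-pooling layer in the five-conv/five-pool architecture of entry 14 keeps the largest activation in every window, halving the spatial resolution; a minimal 2x2, stride-2 sketch in pure Python (illustrative, not the paper's code):

```python
def maxpool2x2(fmap):
    """2x2 max pooling with stride 2 on a 2-D feature map (list of lists)."""
    out = []
    for r in range(0, len(fmap) - 1, 2):
        row = []
        for c in range(0, len(fmap[r]) - 1, 2):
            # keep the strongest activation in each 2x2 window
            row.append(max(fmap[r][c], fmap[r][c + 1],
                           fmap[r + 1][c], fmap[r + 1][c + 1]))
        out.append(row)
    return out
```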
15. Deep learning for multi-grade brain tumor detection and classification: a prospective survey
- Author
-
Bhagyalaxmi, K., Dwarakanath, B., and Reddy, P. Vijaya Pal
- Published
- 2024
- Full Text
- View/download PDF
16. Feature-Based Deep Learning Classification for Pipeline Component Extraction from 3D Point Clouds.
- Author
-
Xu, Zhao, Kang, Rui, and Li, Heng
- Subjects
DEEP learning, POINT cloud, FEATURE extraction, CLASSIFICATION - Abstract
This paper proposes a novel method for construction component classification by designing a feature-based deep learning network to tackle the automation problem in construction digitization. Although scholars have proposed a variety of ways to classify point clouds with deep learning, there are few practical engineering applications in the construction industry, and in the process of building digitization the high level of manual participation significantly reduces efficiency and restricts application. To address this problem, we propose a robust classification method using deep learning networks combined with traditional shape features for point clouds of construction components. The proposed method starts with local and global feature extraction, where the global features processed by the neural network and the traditional shape features are handled separately. Then, we generate a feature map and perform deep convolution to achieve feature fusion. Finally, experiments are designed to prove the efficiency of the proposed method based on the construction dataset we establish. This paper addresses the lack of deep learning applications for point clouds in construction component classification. Additionally, it provides a feasible solution for improving construction digitization efficiency and an available dataset for future work. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
17. Quantifying Soybean Defects: A Computational Approach to Seed Classification Using Deep Learning Techniques.
- Author
-
Sable, Amar, Singh, Parminder, Kaur, Avinash, Driss, Maha, and Boulila, Wadii
- Subjects
DEEP learning, CONVOLUTIONAL neural networks, SOYBEAN, SOYBEAN industry, SEEDS, AGRICULTURE - Abstract
This paper presents a computational approach for quantifying soybean defects through seed classification using deep learning techniques. To differentiate between good and defective soybean seeds quickly and accurately, we introduce a lightweight soybean seed defect identification network (SSDINet). Initially, a labeled soybean seed dataset is developed and processed through the proposed seed contour detection (SCD) algorithm, which enhances the quality of soybean seed images and performs segmentation, followed by SSDINet. The classification network, SSDINet, consists of a convolutional neural network with depthwise convolution blocks and squeeze-and-excitation blocks, making the network lightweight, faster, and more accurate than other state-of-the-art approaches. Experimental results demonstrate that SSDINet achieved the highest accuracy, 98.64%, with 1.15 M parameters in 4.70 ms, surpassing existing state-of-the-art models. This research contributes to advancing deep learning techniques in agricultural applications and offers insights into the practical implementation of seed classification systems for quality control in the soybean industry. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
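The squeeze-and-excitation blocks in SSDINet (entry 17) rescale each channel by a learned gate computed from the channel's global average. The real block uses two fully connected layers; this sketch collapses them to one illustrative scalar weight per channel, so it is a hedged caricature of the mechanism rather than the paper's layer:

```python
import math

def squeeze_excite(channels, weights):
    """Squeeze-and-excitation reweighting: squeeze each channel to its
    mean, gate it through a sigmoid, and rescale the channel.
    `weights` stands in for the block's learned excitation parameters."""
    out = []
    for ch, w in zip(channels, weights):
        s = sum(ch) / len(ch)                  # squeeze: global average pool
        gate = 1.0 / (1.0 + math.exp(-w * s))  # excite: sigmoid gating
        out.append([v * gate for v in ch])     # rescale the channel
    return out
```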
18. WBC-CNN: Efficient CNN-Based Models to Classify White Blood Cells Subtypes.
- Author
-
Alofi, Najla, Alonezi, Wafa, and Alawad, Wedad
- Subjects
LEUCOCYTES, SUPPORT vector machines, FEATURE extraction, BLOOD cell count, BLOOD cells, DEEP learning - Abstract
Blood is essential to life, and blood cell counts play a significant role in observing an individual's health status: a lower or higher number of blood cells than normal may be a sign of various diseases. It is therefore important to precisely classify and count blood cells to diagnose different health conditions. In this paper, we focus on classifying white blood cell (WBC) subtypes, the basic components of the immune system. Classification of WBC subtypes is very useful for diagnosing diseases, infections, and disorders. Deep learning technologies have the potential to enhance both the process and the results of WBC classification. This study presents two fine-tuned CNN models and four hybrid CNN-based models to classify WBCs. VGG-16 and MobileNet are the CNN architectures used for both feature extraction and classification in the fine-tuned models. The same CNN architectures are used for feature extraction in the hybrid models; however, Support Vector Machines (SVM) and Quadratic Discriminant Analysis (QDA) are used as the classifiers. Among all models, the fine-tuned VGG-16 performs best, with a classification accuracy of 99.81%. Our hybrid models are also effective at detecting WBCs: the VGG-16+SVM model achieves 98.44% accuracy, and MobileNet+SVM achieves 98.19%. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
19. Brain tumor segmentation and classification with hybrid clustering, probabilistic neural networks.
- Author
-
Javeed, M.D., Nagaraju, Regonda, Chandrasekaran, Raja, Rajulu, Govinda, Tumuluru, Praveen, Ramesh, M., Suman, Sanjay Kumar, and Shrivastava, Rajeev
- Subjects
ARTIFICIAL neural networks, DEEP learning, BRAIN tumors, TUMOR classification, ARTIFICIAL intelligence, MAGNETIC resonance imaging - Abstract
Segmentation is the process of partitioning an image into different objects. In many major fields, such as face tracking, satellite imagery, object identification, and remote sensing, and especially in the medical field, segmentation is very important for finding the different objects in an image. Magnetic resonance imaging (MRI) is used in radiology to investigate the functions and processes of the human body, and the MRI technique is widely used in hospitals for diagnosis and for determining the stage of a particular disease. In this paper, we propose a new method for detecting tumors with enhanced performance over traditional techniques such as K-means clustering and fuzzy c-means (FCM). Various methods have been proposed by researchers to detect tumors in the brain. To classify normal and abnormal brain images, this paper discusses a screening system developed within an artificial intelligence framework using deep learning probabilistic neural networks, focusing on hybrid clustering for segmentation of the brain image and crystal contrast enhancement. Feature extraction and classification are included in the development process. Simulation of the proposed design has shown superior performance compared to traditional methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
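The K-means clustering that entry 19 uses as a baseline for its hybrid segmentation can be sketched in one dimension (e.g. over pixel intensities): assign each value to its nearest center, recompute centers as cluster means, repeat. Illustrative, not the paper's implementation:

```python
def kmeans_1d(values, centers, iters=20):
    """Plain k-means on scalar values (e.g. pixel intensities)."""
    centers = list(centers)
    for _ in range(iters):
        # assignment step: nearest center wins
        clusters = [[] for _ in centers]
        for v in values:
            j = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[j].append(v)
        # update step: move each center to its cluster mean
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers
```

Fuzzy c-means replaces the hard assignment step with graded memberships, which is the main difference the abstract's hybrid approach builds on.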
20. Point Cloud Deep Learning Network Based on Local Domain Multi-Level Feature.
- Author
-
Han, Xianquan, Chen, Xijiang, Deng, Hui, Wan, Peng, and Li, Jianzhou
- Subjects
POINT cloud, DEEP learning, FEATURE extraction - Abstract
Point cloud deep learning networks have been widely applied in point cloud classification, part segmentation and semantic segmentation. However, current point cloud deep learning networks are insufficient at extracting local features of the point cloud, which affects the accuracy of point cloud classification and segmentation. To address this issue, this paper proposes a local-domain multi-level feature fusion point cloud deep learning network. First, a dynamic graph convolutional operation is utilized to obtain the local neighborhood features of the point cloud. Then, relation-shape convolution is used to extract deeper-level edge features of the point cloud, and max pooling is adopted to aggregate the edge features. Finally, point cloud classification and segmentation are realized based on global and local features. We use the ModelNet40 and ShapeNet datasets for the comparison experiments: respectively, a large-scale 3D CAD model dataset and a richly annotated, large-scale dataset of 3D shapes. For ModelNet40, the overall accuracy (OA) of the proposed method is similar to DGCNN, RS-CNN, PointConv and GAPNet, all exceeding 92%. Compared to PointNet, PointNet++, SO-Net and MSHANet, the OA of the proposed method is improved by 5%, 2%, 3% and 2.6%, respectively. For the ShapeNet dataset, the mean Intersection over Union (mIoU) of part segmentation achieved by the proposed method is 86.3%, which is 2.9%, 1.4%, 1.7%, 1.7%, 1.2%, 0.1% and 1.0% higher than PointNet, RS-Net, SCN, SPLATNet, DGCNN, RS-CNN and LRC-NET, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
21. Rotating machinery fault diagnosis based on feature extraction via an unsupervised graph neural network.
- Author
-
Feng, Jing, Bao, Shouyang, Xu, Xiaobin, Zhang, Zhenjie, Hou, Pingzhi, Steyskal, Felix, and Dustdar, Schahram
- Subjects
FAULT diagnosis, DEEP learning, FEATURE extraction, ROTATING machinery, K-nearest neighbor classification, INTELLIGENCE levels, ROLLER bearings, SIGNAL sampling - Abstract
Fault diagnosis is an essential process for the health maintenance of rotating machinery. With the development of AI technology, many deep learning-based methods have been applied to fault diagnosis to enhance the intelligence level of equipment maintenance. Such methods normally need a large amount of labeled data for model training. However, label acquisition is a difficult task that requires extensive human labor. To address these issues, a fault diagnosis method based on feature extraction via an unsupervised graph neural network is proposed in this paper. In the proposed method, the K-nearest neighbor approach is adopted to construct a fault graph from the collected signals, thereby providing extra relationship information for fine feature mining. Then, the GraphSAGE model is trained on the constructed graph in an unsupervised way (that is, without labeled data) to extract features of each signal sample. Based on the extracted features, some traditional classifiers are adopted to identify the fault types. The proposed model is evaluated on a rolling bearing dataset provided by the University of Paderborn and a motor rotor dataset collected from a purpose-built motor rotor system. Compared with some traditional deep learning-based fault diagnosis methods, the proposed model can achieve more accurate diagnoses even when there are only a few labeled samples. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
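A sketch of the K-nearest-neighbor fault-graph construction step the abstract describes, assuming plain Euclidean distance over per-signal feature vectors; the GraphSAGE training that follows would require a GNN library and is not shown:

```python
import numpy as np

def build_knn_graph(features, k=3):
    """Adjacency matrix of a k-nearest-neighbor graph over sample features,
    giving the GNN extra relationship information between collected signals."""
    n = len(features)
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # no self-loops
    adj = np.zeros((n, n), dtype=int)
    for i, nbrs in enumerate(np.argsort(d2, axis=1)[:, :k]):
        adj[i, nbrs] = 1
    return np.maximum(adj, adj.T)         # symmetrize: edge if either end selects it

rng = np.random.default_rng(1)
sig_feats = rng.standard_normal((20, 16))   # e.g. spectral features per signal sample
A = build_knn_graph(sig_feats, k=3)
```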
22. Environmental Sound Classification Based on CAR-Transformer Neural Network Model.
- Author
-
Li, Huaicheng, Chen, Aibin, Yi, Jizheng, Chen, Wenjie, Yang, Daowu, Zhou, Guoxiong, and Peng, Weixiong
- Subjects
FEATURE extraction ,DEEP learning ,CLASSIFICATION ,SOUNDS ,WHITE noise - Abstract
Environmental Sound Classification (ESC) has been a challenging task in the audio field due to the wide variety of ambient sounds involved. In this paper, we propose a method for ESC tasks based on the CAR-Transformer neural network model, comprising stages of sound sample pre-processing, deep learning-based feature extraction, and classification. We convert the one-dimensional audio signal into two-dimensional Mel Frequency Cepstral Coefficients (MFCC) and use them as the feature map of the audio. The CAR-Transformer model is used for feature extraction, and after dimensionality reduction of the extracted feature map, a fully connected layer serves as the classifier to obtain the final results. The method achieves a classification accuracy of 96.91% on the UrbanSound8K dataset, while the number of parameters in the model is only 0.16 M. The results are compared with other state-of-the-art research. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
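A toy NumPy sketch of how a 1-D audio signal might be converted into a 2-D MFCC feature map, as the abstract describes; practical systems typically call a library such as librosa, and the frame/filterbank parameters here are illustrative, not the paper's:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    """Toy MFCC pipeline: frame -> power spectrum -> mel filterbank -> log -> DCT-II."""
    frames = np.lib.stride_tricks.sliding_window_view(signal, n_fft)[::hop]
    frames = frames * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2           # (T, n_fft//2+1)
    # Triangular mel filterbank
    mel_pts = mel_to_hz(np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    logmel = np.log(power @ fbank.T + 1e-10)                   # (T, n_mels)
    # DCT-II decorrelates the log-mel energies; keep the first n_ceps coefficients
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    return logmel @ dct.T                                      # (T, n_ceps) feature map

t = np.linspace(0, 1, 16000, endpoint=False)
coeffs = mfcc(np.sin(2 * np.pi * 440 * t))   # 2-D map: time frames x cepstral coeffs
```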
23. Arabic spam tweets classification using deep learning.
- Author
-
Kaddoura, Sanaa, Alex, Suja A., Itani, Maher, Henno, Safaa, AlNashash, Asma, and Hemanth, D. Jude
- Subjects
MACHINE learning ,DEEP learning ,SPAM email ,ONLINE social networks ,SUPPORT vector machines ,FEATURE extraction - Abstract
With the increased use of social network sites such as Twitter, attackers exploit these platforms to spread counterfeit content, such as fake advertisements or illegal material. Classifying such content is a challenging task, especially in Arabic: the language's complex structure makes classification more difficult. This paper presents an approach to classifying Arabic tweets using classical (non-deep) machine learning and deep learning techniques. A tweet corpus was collected through the Twitter API and labelled manually to obtain a reliable dataset. For an efficient classifier, feature extraction is applied to the corpus using N-gram models (uni-gram, bi-gram, and char-gram), and two learning techniques are used for each feature extraction technique. The applied classical machine learning algorithms are support vector machines, neural networks, logistic regression, and naïve Bayes. Global vector (GloVe) and fastText models are utilised for the deep learning approaches. Precision, Recall, and F1-score are the performance measures calculated in this paper. Afterwards, the dataset is balanced using the synthetic minority oversampling technique (SMOTE). The experimental results show that, among the classical machine learning models, the neural network algorithm outperforms the others, and GloVe outperforms fastText for the deep learning approach. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
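A minimal sketch of the word- and character-level n-gram extraction the abstract mentions (uni-gram, bi-gram, char-gram); real pipelines would usually rely on a vectorizer such as scikit-learn's `CountVectorizer`, and the sample tweets here are purely illustrative:

```python
from collections import Counter

def char_ngrams(text, n):
    """All character n-grams of a string (the 'char-gram' features)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def word_ngrams(tokens, n):
    """Word-level n-grams (uni-gram n=1, bi-gram n=2, ...)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_vocab(texts, n):
    """Count-based vocabulary over character n-grams of a corpus."""
    vocab = Counter()
    for t in texts:
        vocab.update(char_ngrams(t, n))
    return vocab

tweets = ["free offer now", "offer free now", "hello friend"]  # illustrative samples
bigrams = word_ngrams(tweets[0].split(), 2)
vocab = ngram_vocab(tweets, 3)
```

Count vectors built from such a vocabulary would then feed the classical classifiers, with SMOTE applied afterwards to balance the classes.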
24. Applications of Deep Learning in News Text Classification.
- Author
-
Zhang, Menghan
- Subjects
DEEP learning ,PROBLEM solving ,FEATURE extraction ,CLASSIFICATION ,ALGORITHMS - Abstract
Technology is advancing at an accelerating pace across the globe. With this expansion, a vast volume of text data is generated every day, in the form of social media platforms, websites, company data, healthcare data, and news. It is a difficult task to extract interesting patterns, such as opinions, summaries, and facts, from text data of varying length. Because of the problems posed by the length of news text and the difficulty of feature extraction, this paper proposes a news text classification method based on a combination of deep learning (DL) algorithms. Earlier approaches use a single word vector to express text information, considering only the relationships between words while ignoring the relationship between words and categories, which is an important factor for news text classification. This paper combines DL algorithms, namely CNN, LSTM, and MLP, into a customized DCLSTM-MLP model for the classification of news text data. The proposed model processes word vectors and word dispersion in parallel: the relationships among words are represented by word vectors as the input of the CNN module, and the relationship between words and categories is represented by a discrete vector as the input of the MLP module, realizing comprehensive learning of spatial features, time-series features, and word-category relationships in news text. To check the stability and performance of the proposed method, multiple experiments were performed.
The experimental results showed that the proposed method solves the problems of text length, difficulty of feature extraction in the news text, and classification of news text in an effective way and attained better accuracy, recall rate, and comprehensive value as compared to the other models. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
25. A comprehensive survey on leaf disease identification & classification.
- Author
-
Bhagat, Monu and Kumar, Dilip
- Subjects
NOSOLOGY ,FEATURE selection ,FEATURE extraction ,PLANT diseases ,IMAGE processing ,COMPUTER vision - Abstract
This paper presents a survey of techniques used to classify plants and their diseases. Classification assigns each sample to one of several classes; here it means separating healthy and diseased leaves by morphological features such as texture, color, shape, and pattern. Due to the resemblance in visual properties among plants, sorting and classification are complicated to carry out, especially over large areas. There are various methods based on image processing and computer vision, and choosing a suitable classification technique is difficult because results vary with the input data. Classification of leaf diseases has wide applications in fields such as agriculture and biological research. This paper provides an overview of existing methods, their pros and cons, and the state of the art of techniques used by several authors in leaf disease identification and classification, covering preprocessing techniques, feature extraction and selection techniques, datasets used, classifiers, and performance metrics. In addition, some challenges and research gaps are identified and their probable solutions are pointed out. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
26. Arabic Handwritten Recognition Using Deep Learning: A Survey.
- Author
-
Alrobah, Naseem and Albahli, Saleh
- Subjects
HANDWRITING recognition (Computer science) ,DEEP learning ,TEXT recognition ,FEATURE extraction ,ENGLISH language ,ARABIC language - Abstract
In recent times, many research projects and experiments have targeted machines that automatically recognize handwritten characters, but most of this work addresses Latin script. Recognizing handwritten Arabic characters is a complicated process compared to English and other languages because of the nature of Arabic script. In the past few years, deep learning approaches have been increasingly used in the field of Arabic recognition. This paper aims to categorize, analyze, and present a comprehensive survey of the Arabic handwritten recognition literature, focusing on state-of-the-art deep learning methods for feature extraction. The paper concentrates on offline text recognition, with a detailed, systematic analysis of the literature. Additionally, the paper critically analyzes the current literature and identifies the problem areas and challenges faced by previous studies. After investigating the studies, a new classification of the literature is proposed. An analysis is then performed based on the findings, and several issues and challenges related to the recognition of Arabic scripts are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
27. ASER: An Exhaustive Survey for Speech Recognition based on Methods, Datasets, Challenges, Future Scope.
- Author
-
Patel, Dharil, Amipara, Soham, Sanaria, Malay, Pareek, Preksha, Jayaswal, Ruchi, and Patil, Shruti
- Subjects
SPEECH perception ,DEEP learning ,MACHINE learning ,EMOTIONS ,TRANSFORMER models ,HUMAN-artificial intelligence interaction - Abstract
AI has been used to process data for decision-making, problem-solving, and interaction with humans, and to understand humans' feelings, emotions, and behavior. In today's world, much communication between humans takes place digitally, so human emotions play a very important role in communication as well as in their detection and analysis. Although many surveys on emotion recognition from speech already exist, selecting appropriate datasets and methods remains challenging. This survey concentrates on efficient techniques, including machine learning, deep learning, and transformer-based approaches, while also briefly describing existing challenges and outlining future prospects. Additionally, this paper provides a comparative analysis of various datasets and techniques employed by researchers. After conducting the survey, we found that deep learning and transformer-based techniques are more effective and yield superior performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Classification of Breast Cancer Histopathological Images Using Transfer Learning with DenseNet121.
- Author
-
Potsangbam, Jacinta and Shuleenda Devi, Salam
- Subjects
BREAST cancer ,TUMOR classification ,DATA augmentation ,HISTOPATHOLOGY ,DEEP learning ,FEATURE extraction - Abstract
Breast cancer (BC) continues to be a prominent issue in global public health, emphasizing the need for precise and timely detection. This paper introduces a deep learning (DL) methodology for automated binary classification of breast cancer histopathology images. The proposed framework is validated on a standard, publicly accessible database, the Breast Cancer Histopathological Database (BreakHis). Data augmentation techniques are employed in the pre-processing stage. The paper uses the pre-trained DenseNet121 model for feature extraction and fully connected layers (FCL) to fine-tune the model further. In this experiment, the highest accuracy of 96.09% is observed at the 100X magnification factor. The experimental results show an improvement in accuracy over existing works for all magnification factors. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
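A small sketch of the label-preserving augmentations commonly used in the pre-processing stage on histopathology patches (the specific transforms are assumptions, not the paper's list); the DenseNet121 feature extractor itself would come from a DL framework and is omitted here:

```python
import numpy as np

def augment(image, rng):
    """Random label-preserving augmentations for a histopathology patch:
    horizontal/vertical flips plus a random 90-degree rotation."""
    if rng.random() < 0.5:
        image = image[:, ::-1]          # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1, :]          # vertical flip
    k = int(rng.integers(0, 4))
    return np.rot90(image, k)           # rotate by k * 90 degrees

rng = np.random.default_rng(42)
patch = rng.random((224, 224, 3))       # stand-in for a resized BreakHis RGB patch
out = augment(patch, rng)
```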
29. Cross-Modality Interaction-Based Traffic Accident Classification.
- Author
-
Oh, Changhyeon and Ban, Yuseok
- Subjects
TRAFFIC accidents ,DEEP learning ,TRAFFIC monitoring ,FEATURE extraction ,RAINFALL ,CLASSIFICATION - Abstract
Traffic accidents on the road lead to serious personal and material damage. Furthermore, preventing secondary accidents caused by traffic accidents is crucial. As various technologies for detecting traffic accidents in videos using deep learning are being researched, this paper proposes a method to classify accident videos based on a video highlight detection network. To utilize video highlight detection for traffic accident classification, we generate information using the existing traffic accident videos. Moreover, we introduce the Car Crash Highlights Dataset (CCHD). This dataset contains a variety of weather conditions, such as snow, rain, and clear skies, as well as multiple types of traffic accidents. We compare and analyze the performance of various video highlight detection networks in traffic accident detection, thereby presenting an efficient video feature extraction method according to the accident and the optimal video highlight detection network. For the first time, we have applied video highlight detection networks to the task of traffic accident classification. In the task, the most superior video highlight detection network achieves a classification performance of up to 79.26% when using video, audio, and text as inputs, compared to using video and text alone. Moreover, we elaborated the analysis of our approach in the aspects of cross-modality interaction, self-attention and cross-attention, feature extraction, and negative loss. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Hybrid model for brain tumor detection using convolution neural networks.
- Author
-
Kuntiyellannagari, Bhagyalaxmi, Dwarakanath, Bhoopalan, and Reddy, Panuganti VijayaPal
- Subjects
CONVOLUTIONAL neural networks ,BRAIN tumors ,MACHINE learning ,FEATURE extraction ,MAGNETIC resonance imaging ,DEEP learning - Abstract
The development of abnormal cells in the brain, some of which may turn out to be cancerous, is known as a brain tumor. Magnetic resonance imaging (MRI) is the most common technique for detecting brain tumors, as the abnormal tissue growth is visible in MRI scans. In most research papers, machine learning (ML) and deep learning (DL) algorithms are applied to detect brain tumors, allowing the radiologist to make speedy decisions. The proposed work creates a hybrid model combining a convolutional neural network (CNN) and logistic regression (LR). The pre-trained Visual Geometry Group 16 (VGG16) model is used for feature extraction; to reduce complexity, we eliminated its last eight layers. From this transformed model, the features are extracted as a vector array and fed into different ML classifiers, namely support vector machine (SVM), naïve Bayes (NB), LR, extreme gradient boosting (XGBoost), AdaBoost, and random forest, for training and testing. Comparing the performance of the different classifiers, the CNN-LR hybrid combination outperformed the rest. The recall, precision, F1-score, and accuracy of the proposed CNN-LR model are 94%, 94%, 94%, and 91%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
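A sketch of the classifier half of such a hybrid: a logistic-regression head trained by gradient descent on feature vectors (random vectors here stand in for the flattened VGG16 features, which would need a DL framework to extract):

```python
import numpy as np

def train_logreg(X, y, lr=0.1, epochs=500):
    """Logistic regression trained by batch gradient descent on feature vectors."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        z = np.clip(X @ w + b, -30, 30)          # clip to avoid exp overflow
        p = 1.0 / (1.0 + np.exp(-z))             # sigmoid probabilities
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

def predict(X, w, b):
    z = np.clip(X @ w + b, -30, 30)
    return (1.0 / (1.0 + np.exp(-z)) > 0.5).astype(int)

# Toy 'extracted features': two well-separated classes in 8 dimensions
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 8)), rng.normal(2, 1, (50, 8))])
y = np.r_[np.zeros(50), np.ones(50)]
w, b = train_logreg(X, y)
acc = (predict(X, w, b) == y).mean()
```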
31. A knowledge distillation-based multi-scale relation-prototypical network for cross-domain few-shot defect classification.
- Author
-
Zhao, Jiaqi, Qian, Xiaolong, Zhang, Yunzhou, Shan, Dexing, Liu, Xiaozheng, Coleman, Sonya, and Kerr, Dermot
- Subjects
SURFACE defects ,CLASSIFICATION ,DEEP learning ,FEATURE extraction - Abstract
Surface defect classification plays a very important role in industrial production and mechanical manufacturing. However, several challenges hinder its use. First, the similarity of different defect samples makes classification a difficult task. Second, the lack of defect samples leads to poor accuracy when using deep learning methods. In this paper, we first design a novel backbone network, ResMSNet, which draws on the idea of multi-scale feature extraction for small discriminative regions in defect samples. Then, we introduce few-shot learning for defect classification and propose a Relation-Prototypical network (RPNet), which combines the characteristics of ProtoNet and RelationNet and classifies by linking prototype distances with nonlinear relation scores. Next, we consider a more realistic scenario in which the base dataset for training the model and the target defect dataset for applying it come from domains with large differences, called cross-domain few-shot learning. Hence, we further improve RPNet into KD-RPNet, inspired by knowledge distillation methods. Through extensive comparative and ablation experiments, we demonstrate that both ResMSNet and RPNet prove effective, and KD-RPNet outperforms other state-of-the-art approaches for few-shot defect classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
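A minimal sketch of the ProtoNet side of RPNet: class prototypes as mean support embeddings, with nearest-prototype classification by Euclidean distance. The learned relation-score branch would require a trained network and is omitted; the toy embeddings are assumptions:

```python
import numpy as np

def prototypes(support, labels):
    """Class prototypes: the mean embedding of each class's support samples."""
    classes = np.unique(labels)
    return classes, np.stack([support[labels == c].mean(axis=0) for c in classes])

def classify(queries, classes, protos):
    """Nearest-prototype labels via squared Euclidean distance."""
    d2 = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return classes[np.argmin(d2, axis=1)]

# Two toy defect classes, 5 support embeddings each (4-D embedding space)
rng = np.random.default_rng(3)
sup = np.vstack([rng.normal(0, 0.1, (5, 4)), rng.normal(3, 0.1, (5, 4))])
lab = np.array([0] * 5 + [1] * 5)
cls, protos = prototypes(sup, lab)
pred = classify(np.array([[0.0, 0, 0, 0], [3.0, 3, 3, 3]]), cls, protos)
```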
32. Learning features from irrelevant domains through deep neural network.
- Author
-
Wen, Pengcheng, Zhang, Yuhan, and Wen, Guihua
- Subjects
FEATURE selection ,DEEP learning ,FEATURE extraction ,MACHINE learning - Abstract
Features of data are critical to classification. However, when only small data are available, suitable features cannot easily be obtained, often leading to poor classification performance. This paper proposes a novel approach that automatically learns features for a given classification task from an irrelevant domain with highly discriminative features. It first computes the central vectors of each class in the irrelevant domain as learning objectives, and then uses machine learning to automatically learn features for each sample in the target domain from these objectives. The merit of our method is that, unlike transfer learning, it does not require similarity between the two domains; it can learn features from highly discriminative domains. Its learned features are not limited to the original ones, unlike feature selection and feature extraction methods, so classification performance with the learned features can be better. Finally, our method is general, simple, and efficient. Extensive experimental results validate the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. DeepSVDNet: A Deep Learning-Based Approach for Detecting and Classifying Vision-Threatening Diabetic Retinopathy in Retinal Fundus Images.
- Author
-
Bilal, Anas, Imran, Azhar, Baig, Talha Imtiaz, Xiaowen Liu, Haixia Long, Alzahrani, Abdulkareem, and Shafiq, Muhammad
- Subjects
DEEP learning ,DIABETIC retinopathy ,RETINAL disease diagnosis ,ARTIFICIAL intelligence ,FEATURE extraction - Abstract
Artificial Intelligence (AI) is increasingly used for diagnosing Vision-Threatening Diabetic Retinopathy (VTDR), a leading cause of visual impairment and blindness worldwide. However, previous automated VTDR detection methods have mainly relied on manual feature extraction and classification, leading to errors. This paper proposes a novel VTDR detection and classification model that combines different models through majority voting. Our methodology involves preprocessing, data augmentation, feature extraction, and classification stages. We use a hybrid convolutional neural network-singular value decomposition (CNN-SVD) model for feature extraction and selection, and an improved SVM-RBF with a Decision Tree (DT) and K-Nearest Neighbor (KNN) for classification. We tested our model on the IDRiD dataset and achieved an accuracy of 98.06%, a sensitivity of 83.67%, and a specificity of 100% in DR detection and evaluation tests. Our proposed approach outperforms baseline techniques and provides a more robust and accurate method for VTDR detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
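A small sketch of the majority-voting combination stage the abstract describes; the classifier names mirror the abstract's SVM-RBF / DT / KNN ensemble, but the per-sample predictions below are illustrative stand-ins:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier predictions (one list per model, aligned by
    sample index) into one label per sample by majority vote."""
    n_samples = len(predictions[0])
    fused = []
    for j in range(n_samples):
        votes = Counter(model[j] for model in predictions)
        fused.append(votes.most_common(1)[0][0])
    return fused

# Illustrative per-sample outputs from three base classifiers
svm_rbf = ["DR", "DR", "noDR", "DR"]
dtree   = ["DR", "noDR", "noDR", "DR"]
knn     = ["noDR", "DR", "noDR", "DR"]
final = majority_vote([svm_rbf, dtree, knn])
```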
34. Hyperspectral image classification via active learning and broad learning system.
- Author
-
Huang, Huifang, Liu, Zhi, Chen, C. L. Philip, and Zhang, Yun
- Subjects
IMAGE recognition (Computer vision) ,ACTIVE learning ,INSTRUCTIONAL systems ,FEATURE extraction ,MACHINE learning ,MULTISPECTRAL imaging ,DEEP learning - Abstract
Hyperspectral image (HSI) classification has continued to be a hot research topic in recent years, and the broad learning system (BLS) has been considered by scholars for the classification of HSIs due to its superior internal structure. Different from the traditional HSI classification mechanism, this paper proposes an active broad learning system approach for HSI classification. The spectral and spatial features of the image are extracted using principal component analysis and local binary patterns, respectively. Then, the vector fusion of the above two features is utilized as the input of the BLS and trained to obtain pre-labels of the samples. The next training samples are selected among the pre-labels by active learning. Unlike other classification algorithms, the method proposed in this paper utilizes active learning (AL) to select high-quality samples for training, thereby reducing the number of samples used and the cost of sample labeling. In addition, the use of incremental learning in broad learning significantly reduces the training time and improves the classification accuracy. The algorithm proposed in this paper is more effective compared to other state-of-the-art algorithms on three HSI datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
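A NumPy sketch of the spectral/spatial feature-fusion step this abstract describes: PCA over spectral bands plus a basic local binary pattern (LBP) histogram, concatenated into one input vector. Data shapes and the plain 3x3 LBP variant are assumptions for illustration:

```python
import numpy as np

def pca(X, n_components):
    """Project rows of X onto the top principal components (spectral features)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def lbp(image):
    """Basic 3x3 local binary pattern code per interior pixel (spatial features)."""
    c = image[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=int)
    for bit, (dy, dx) in enumerate(shifts):
        nb = image[1 + dy:image.shape[0] - 1 + dy, 1 + dx:image.shape[1] - 1 + dx]
        code += (nb >= c).astype(int) * (1 << bit)   # 8-bit neighborhood code
    return code

rng = np.random.default_rng(7)
cube = rng.random((64, 100))                 # 64 pixels x 100 spectral bands
spectral = pca(cube, n_components=10)        # (64, 10) spectral features
band_img = rng.random((10, 10))              # one band as a spatial patch
hist = np.bincount(lbp(band_img).ravel(), minlength=256) / 64.0  # LBP histogram
fused = np.concatenate([spectral[0], hist])  # fused vector fed to the BLS
```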
35. 6‐4: Deep Learning for Classification of Repairable Defects in Display Panels Using Multi‐Modal Data.
- Author
-
Balakrishnan, Kaushik, Cheng, Qisen, Lee, Janghwan, Jeong, Deokyeong, Kim, Eunwoo, and Kim, Jaewon
- Subjects
DEEP learning ,MANUFACTURING defects ,FEATURE extraction ,TRANSFORMER models ,CLASSIFICATION - Abstract
This paper uses deep learning to classify whether a display panel with defects in the manufacturing line can be repaired. Tabular data and images are fused to make predictions, with separate feature extraction for each modality. The model's predictions achieve high Average Precision as well as robust Precision values in the high-Recall regions, which makes it practical for deployment. We also demonstrate superior results with multi-modal data compared to tabular data alone. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. A Few Shot Classification Methods Based on Multiscale Relational Networks.
- Author
-
Zheng, Wenfeng, Tian, Xia, Yang, Bo, Liu, Shan, Ding, Yueming, Tian, Jiawei, and Yin, Lirong
- Subjects
DEEP learning ,FEATURE extraction ,FEATURE selection ,CLASSIFICATION ,SAMPLE size (Statistics) ,RELATIONAL databases - Abstract
Learning from a single sample or a few samples is called few-shot learning. This learning method reduces deep learning's dependence on large samples. Deep learning achieves few-shot learning through meta-learning: "how to learn by using previous experience". Therefore, this paper considers how deep learning methods can use meta-learning to learn and generalize from small sample sizes in image classification. Practicing learning across a wide range of tasks enables deep learning methods to use previous empirical knowledge. However, this approach is sensitive to the quality of feature extraction and to the choice of measurement method between the support set and the target set. Therefore, this paper designs a multi-scale relational network (MSRN) to address these problems. The experimental results show that the simple design of the MSRN can achieve higher performance: it improves accuracy on the datasets with fewer samples and alleviates overfitting. However, to ensure that a uniform metric applies to all tasks, few-shot classification based on metric learning must ensure that the task sets are homologously distributed. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
37. Underwater Object Classification Method Based on Depthwise Separable Convolution Feature Fusion in Sonar Images.
- Author
-
Gong, Wenjing, Tian, Jie, and Liu, Jiyuan
- Subjects
SONAR imaging ,IMAGE fusion ,FEATURE extraction ,DEEP learning ,CLASSIFICATION - Abstract
To improve the accuracy of underwater object classification, a classification method based on depthwise separable convolution feature fusion is proposed according to the characteristics of sonar images. Firstly, Markov segmentation is used to segment the highlight and shadow regions of the object separately, avoiding the information loss caused by simultaneous segmentation. Secondly, depthwise separable convolution is used to learn deep image information for feature extraction, requiring less network computation. Thirdly, the features of the highlight and shadow regions are fused by a parallel network structure, and pyramid pooling is added to extract multi-scale information. Finally, fully connected layers achieve object classification through the Softmax function. Experiments on simulated and real data show that the proposed method achieves superior performance compared with other models, and it also has a certain flexibility. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
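A sketch of why depthwise separable convolution "produces less network computation", as the abstract claims: it factorizes a standard convolution into a per-channel depthwise step plus a 1x1 pointwise step, sharply cutting the weight count. The channel/kernel sizes below are illustrative:

```python
import numpy as np

def conv2d_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (no bias)."""
    return c_in * c_out * k * k

def separable_params(c_in, c_out, k):
    """Depthwise (one k x k kernel per input channel) + pointwise (1 x 1) weights."""
    return c_in * k * k + c_in * c_out

def depthwise_conv(x, kernels):
    """Valid-mode depthwise convolution: one k x k kernel per channel."""
    C, H, W = x.shape
    k = kernels.shape[-1]
    out = np.zeros((C, H - k + 1, W - k + 1))
    for c in range(C):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[c, i, j] = (x[c, i:i + k, j:j + k] * kernels[c]).sum()
    return out

standard = conv2d_params(64, 128, 3)       # 64 * 128 * 9  = 73728 weights
separable = separable_params(64, 128, 3)   # 64 * 9 + 64 * 128 = 8768 weights
rng = np.random.default_rng(0)
y = depthwise_conv(rng.random((2, 8, 8)), rng.random((2, 3, 3)))
```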
38. Classification and Segmentation of Diabetic Retinopathy: A Systemic Review.
- Author
-
Shaukat, Natasha, Amin, Javeria, Sharif, Muhammad Imran, Sharif, Muhammad Irfan, Kadry, Seifedine, and Sevcik, Lukas
- Subjects
DIABETIC retinopathy ,DEEP learning ,ARTIFICIAL intelligence ,FEATURE extraction ,FUNDUS oculi ,COMPUTER-assisted image analysis (Medicine) ,CLASSIFICATION - Abstract
Diabetic retinopathy (DR) is a major cause of blindness around the world. Ophthalmologists manually analyze morphological alterations in retinal veins and lesions in fundus images, a time-consuming, costly, and challenging procedure. It can be made easier with the assistance of computer-aided diagnostic systems (CADs) that are utilized for the diagnosis of DR lesions. Artificial intelligence (AI)-based machine/deep learning methods play a vital role in increasing the performance of the detection process, especially in the context of analyzing medical fundus images. In this paper, several current approaches to preprocessing, segmentation, feature extraction/selection, and classification are discussed for the detection of DR lesions. This survey also includes a detailed description of the DR datasets that are accessible to researchers for the identification of DR lesions. The limitations and challenges of existing methods are also addressed, which will assist novice researchers in starting work in this domain. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
39. Harnessing LSTM Classifier to Suggest Nutrition Diet for Cancer Patients.
- Author
-
Raguvaran, S., Anandamurugan, S., and Zubair Rahman, A. M. J. Md.
- Subjects
CANCER patients ,DIET ,NUTRITION ,DEEP learning ,FEATURE extraction ,IMAGE analysis - Abstract
A customized nutrition-rich diet plan is of utmost importance for cancer patients, helping them take in healthy and nutritious foods so that they stay strong enough to maintain their body weight and tissues. Consuming nutrition-rich foods also helps minimize the side effects experienced before and after treatment. This work proposes an effective diet assessment plan using a deep learning-based automated medical diet system. An Enhanced Long Short-Term Memory (E-LSTM) network is proposed in this paper, especially for cancer patients. The proposed method helps predict the foods patients can consume, based on nutrition analysis of food images. Classification is performed in the E-LSTM by analyzing two datasets, one with food images and another with cancer patients' details. Following an in-depth analysis of the major research papers on deep learning strategies for identifying foods and their nutritional composition, this method was identified as one of the finest deep learning approaches for classification. To our knowledge, this is the first work to introduce a new layer for feature extraction and provide nutrition suggestions for cancer patients using the LSTM technique. The dedicated feature extraction layer in the E-LSTM improves prediction and classification accuracy. The proposed method outperforms all other existing techniques in terms of F1-score, precision, recall, classification accuracy, training loss, and validation loss. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
40. LSTM Neural Network for Beat Classification in ECG Identity Recognition.
- Author
-
Xin Liu, Yujuan Si, and Di Wang
- Subjects
DEEP learning ,ELECTROCARDIOGRAPHY ,LYAPUNOV exponents ,CLASSIFICATION ,FEATURE extraction ,ARTIFICIAL neural networks - Abstract
As a biological signal existing in the living human body, the electrocardiogram (ECG) contains abundant personal information and fulfils the basic requirements of identity recognition; it has been widely used in individual identification research in recent years. The common identity recognition process includes three steps: ECG signal preprocessing, feature extraction and processing, and beat classification. However, existing ECG classification models are sensitive to the database type and the dimension of the extracted features, which makes classification accuracy difficult to improve and cannot meet the needs of practical applications. To tackle this problem, this paper builds an ECG individual recognition model based on a deep Long Short-Term Memory (LSTM) neural network. The LSTM model has a memory cell and is therefore well suited to handling long ECG time series. With deeper learning, the nonlinear expressive ability of the ECG beat classification model is gradually enhanced. The paper adopts two stacked LSTM models as hidden layers in the neural network, with a Softmax layer as the classification layer to identify an individual. Then, low-level morphological features and deep-level chaotic features (Lyapunov exponents) are extracted to verify the feasibility of the deep LSTM network for classification. The model is applied to a database of healthy humans and to a database of humans with heart disease. Experimental results show that both simple low-level features and chaotic features achieve good classification performance, verifying the robustness of the LSTM classification model. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
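A minimal NumPy sketch of the memory-cell recurrence that lets an LSTM handle long ECG sequences, run over a toy beat waveform; an actual model would stack two such layers plus a Softmax head in a DL framework, and the random weights here are purely illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. W: (4H, D), U: (4H, H), b: (4H,).
    Gate order: input, forget, cell candidate, output."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[0:H])
    f = sigmoid(z[H:2 * H])
    g = np.tanh(z[2 * H:3 * H])
    o = sigmoid(z[3 * H:4 * H])
    c_new = f * c + i * g            # memory cell carries long-range context
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
D, H = 1, 8                          # 1-D ECG amplitude input, hidden size 8
W = rng.normal(0, 0.1, (4 * H, D))
U = rng.normal(0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
beat = np.sin(np.linspace(0, 2 * np.pi, 50))   # toy single-beat waveform
for x_t in beat:
    h, c = lstm_step(np.array([x_t]), h, c, W, U, b)
```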
41. Using deep learning to assess the function of gastroesophageal flap valve according to the Hill classification system.
- Author
-
Ge, Zhenyang, Fang, Youjiang, Chang, Jiuyang, Yu, Zequn, Qiao, Yu, Zhang, Jing, Yang, Xin, and Duan, Zhijun
- Subjects
DEEP learning ,ESOPHAGOGASTRIC junction ,FEATURE extraction ,VALVES ,CLASSIFICATION - Abstract
The endoscopic Hill classification of the gastroesophageal flap valve (GEFV) is of great importance for understanding the functional status of the esophagogastric junction (EGJ). Deep learning (DL) methods have been extensively employed in the area of digestive endoscopy. To improve the efficiency and accuracy of the endoscopist's Hill classification and help incorporate it into routine endoscopy reports and GERD assessment examinations, this study employed DL to establish a four-category model based on the Hill classification. A dataset consisting of 3256 GEFV endoscopic images was constructed for training and evaluation, and a new attention mechanism module was proposed and integrated into the DL model to improve its performance. Numerous experiments were conducted on the GEFV endoscopic image dataset with the attention mechanism module: 12 mainstream DL backbone networks were trained, tested, and evaluated, and four outstanding feature extraction backbones (ResNet-50, VGG-16, VGG-19, and Xception) were selected for further development. The classification accuracy of the DL model was also compared with that of endoscopists of different experience levels. ResNet-50 showed the best Hill classification performance: its area under the curve (AUC) reached 0.989, and its classification accuracy (93.39%) was significantly higher than that of junior (74.83%) and senior (78.00%) endoscopists. To our knowledge, this is the first study to establish a four-category DL model based on the Hill grading. The DL model demonstrated outstanding classification performance and has great potential for improving the accuracy of the Hill classification by endoscopists. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
42. Effective framework for human action recognition in thermal images using capsnet technique.
- Author
-
Srihari, Pasala, Harikiran, Jonnadula, Sai Chandana, B., and Surendra Reddy, Vinta
- Subjects
HUMAN activity recognition ,CONVOLUTIONAL neural networks ,THERMOGRAPHY ,IMAGE recognition (Computer vision) ,RECURRENT neural networks ,DEEP learning - Abstract
Human activity recognition is the process of using sensors and algorithms to identify and classify human actions from collected data. Recognition in visible-light images can be challenging because lighting conditions affect image quality and, consequently, recognition accuracy; low lighting, for example, can make it difficult to distinguish between different activities. Earlier investigations have used thermal cameras to address this issue. Building on that approach, we propose a novel deep learning (DL) technique for predicting and classifying human actions. In this paper, noise is first removed from the input thermal images with a mean filter, and the images are then normalized with the min-max normalization method. A Deep Recurrent Convolutional Neural Network (DRCNN) is then used to segment the human from the thermal images and to retrieve features from the segmented image: a fully connected layer of the DRCNN serves as the segmentation layer, and the multi-scale convolutional neural network layer of the DRCNN extracts features from the segmented images to detect human actions. The DenseNet-169 approach is utilized to recognize human actions in thermal pictures. Finally, the CapsNet technique, combined with the Elephant Herding Optimization (EHO) algorithm, classifies the human action types. In our experiments, we evaluate two thermal datasets, the LTIR dataset and the IITR-IAR dataset, using accuracy, precision, recall, and f1-score. The proposed approach outperforms state-of-the-art methods for action detection on thermal images and categorizes the items. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
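The preprocessing steps the abstract above describes (a mean filter for noise removal, then min-max normalization) can be sketched in a few lines. This is an illustrative pure-NumPy version, not the authors' implementation; the 3x3 kernel size and the toy 4x4 "thermal" frame are assumptions:

```python
import numpy as np

def mean_filter(img: np.ndarray, k: int = 3) -> np.ndarray:
    """Denoise a 2-D image by replacing each pixel with the mean of its k x k neighborhood."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")  # edge padding avoids dark borders
    out = np.empty_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

def min_max_normalize(img: np.ndarray) -> np.ndarray:
    """Linearly rescale pixel intensities to the [0, 1] range."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img, dtype=float)

# Toy 4x4 "thermal" frame with one hot outlier pixel
frame = np.array([[10, 10, 10, 200],
                  [10, 50, 10, 10],
                  [10, 10, 10, 10],
                  [10, 10, 10, 10]], dtype=float)
clean = min_max_normalize(mean_filter(frame))
```

The filter smears the 200-valued outlier across its neighborhood before normalization, which is exactly why mean filtering precedes min-max scaling in pipelines like this one.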
43. Colon histology slide classification with deep-learning framework using individual and fused features.
- Author
-
Rajinikanth, Venkatesan, Kadry, Seifedine, Mohan, Ramya, Rama, Arunmozhi, Khan, Muhammad Attique, and Kim, Jungeun
- Subjects
COLON cancer ,EARLY detection of cancer ,DEEP learning ,FEATURE extraction ,K-nearest neighbor classification - Abstract
Cancer occurrence rates are gradually rising in the population, imposing a heavy diagnostic burden globally. The rate of colorectal (bowel) cancer (CC) is gradually rising, and it is currently listed as the third most common cancer globally. Therefore, early screening and treatment under a recommended clinical protocol are necessary to treat cancer. The aim of this research is to develop a Deep-Learning Framework (DLF) to classify colon histology slides into normal/cancer classes using deep-learning-based features. The stages of the framework are as follows: (ⅰ) image collection, resizing, and pre-processing; (ⅱ) Deep-Feature (DF) extraction with a chosen scheme; (ⅲ) binary classification with 5-fold cross-validation; and (ⅳ) verification of the clinical significance. This work classifies the considered image database using (ⅰ) individual DF, (ⅱ) fused DF, and (ⅲ) ensemble DF. The achieved results are separately verified using binary classifiers. The proposed work considered 4000 histology slides (2000 normal and 2000 cancer) for the examination. The results confirm that the fused DF helps to achieve a detection accuracy of 99% with the K-Nearest Neighbor (KNN) classifier, while the individual and ensemble DF provide classification accuracies of 93.25% and 97.25%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
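The pipeline this abstract outlines (fusing deep features from multiple backbones, then KNN with 5-fold cross-validation) can be illustrated with a minimal NumPy sketch. The random vectors below stand in for real backbone features, and the dimensions, class shift, and k=3 are all assumptions for the toy example, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for deep features from two pretrained backbones (hypothetical dims).
feats_a = rng.normal(size=(40, 8))
feats_b = rng.normal(size=(40, 6))
labels = np.repeat([0, 1], 20)               # 0 = normal, 1 = cancer
feats_a[labels == 1] += 2.0                  # make the toy classes separable
fused = np.concatenate([feats_a, feats_b], axis=1)  # serial feature fusion

def knn_predict(train_x, train_y, test_x, k=3):
    """Classify each test vector by majority vote among its k nearest training vectors."""
    preds = []
    for x in test_x:
        d = np.linalg.norm(train_x - x, axis=1)
        nearest = train_y[np.argsort(d)[:k]]
        preds.append(np.bincount(nearest).argmax())
    return np.array(preds)

# 5-fold cross-validation accuracy on the fused features.
idx = rng.permutation(len(fused))
accs = []
for fold in np.array_split(idx, 5):
    mask = np.ones(len(fused), dtype=bool)
    mask[fold] = False                       # True = training rows for this fold
    preds = knn_predict(fused[mask], labels[mask], fused[fold])
    accs.append((preds == labels[fold]).mean())
acc = float(np.mean(accs))
```

Serial fusion by concatenation is the simplest fusion scheme; the paper's 99% figure comes from real histology features, whereas this sketch only shows the mechanics.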
44. Hybrid multi-modal emotion recognition framework based on InceptionV3DenseNet.
- Author
-
Alamgir, Fakir Mashuque and Alam, Md. Shafiul
- Abstract
Emotion recognition is one of the most complex research areas, as individuals express emotional cues through several modalities such as audio, facial expressions, and language. Recognizing emotion from a single modality is not always feasible, since single modalities are disturbed by several factors, and existing models cannot attain maximum accuracy in exactly identifying individuals' expressions. In this paper, a novel hybrid multi-modal emotion recognition framework, InceptionV3DenseNet, is proposed to improve recognition accuracy. Initially, contextual features are extracted from the different modalities: video, audio, and text. From the video modality, features such as shot length, lighting key, motion, and color are extracted; zero-crossing rate, Mel frequency cepstral coefficient (MFCC), energy, and pitch are extracted from the audio modality; and unigram, bigram, and TF-IDF features are extracted from the textual modality. In feature extraction, high-level features are extracted with better generalization capability. The extracted features are fused using multi-set integrated canonical correlation analysis (MICCA) and provided as input to the proposed hybrid network model; the fusion captures the correlation between multimodal features, providing better performance with a single learning phase. The proposed hybrid deep learning model is then utilized to classify emotional states with regard to accuracy and reliability. Simulations are conducted on the MATLAB platform and evaluated using the MELD and RAVDESS datasets. The outcomes show that the proposed model is more efficient and accurate than the compared models, attaining an overall accuracy of 74.87% on MELD and 95.25% on RAVDESS. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
45. AnimeNet: A Deep Learning Approach for Detecting Violence and Eroticism in Animated Content.
- Author
-
Yixin Tang
- Subjects
CONVOLUTIONAL neural networks ,DEEP learning ,SEXUAL excitement ,TRAFFIC accidents ,FEATURE extraction ,SOCIAL impact - Abstract
Cartoons serve as significant sources of entertainment for children and adolescents. However, numerous animated videos contain unsuitable content, such as violence, eroticism, abuse, and vehicular accidents. Current content detection methods rely on manual inspection, which is resource-intensive, time-consuming, and not always reliable. Therefore, more efficient detection methods are necessary to safeguard young viewers. This paper addresses this problem by proposing a novel deep learning-based system, AnimeNet, designed to detect varying degrees of violent and erotic content in videos. AnimeNet utilizes a novel Convolutional Neural Network (CNN) model to extract image features effectively, classifying violent and erotic scenes in videos and images. The novelty of the work lies in the introduction of a channel-spatial attention module that enhances the feature extraction performance of the CNN model, an advancement over previous efforts in the literature. To validate the approach, I compared AnimeNet with state-of-the-art classification methods, including ResNet, RegNet, ConvNext, ViT, and MobileNet, used to identify violent and erotic scenes within specific video frames. The results showed that AnimeNet outperformed these models, proving it well-suited for real-time applications on videos or images. This work presents a significant leap forward in automatic content detection in animation, offering a high-accuracy solution that is less resource-intensive and more reliable than current methods. The proposed approach makes it possible to better protect young audiences from exposure to unsuitable content, underlining its importance and potential for broad social impact. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
46. Deep Convolutional Neural Network–Based Computer-Aided Detection System for COVID-19 Using Multiple Lung Scans: Design and Implementation Study
- Author
-
Mehrad Aria, Hassan Abolghasemi, Ramezan Jafari, Farkhondeh Asadi, Mustafa Ghaderzadeh, and Davood Bashash
- Subjects
Machine vision ,Computer science ,Medical informatics ,Feature extraction ,Coronavirus ,Datasets as Topic ,Convolutional neural network ,Health informatics ,Computer-aided detection (CAD) ,Machine learning ,Humans ,Diagnosis, Computer-Assisted ,Computed tomography scan ,Lung ,Pandemics ,Hyperparameter ,SARS-CoV-2 ,Deep learning ,COVID-19 ,Artificial intelligence ,Early diagnosis ,Classification ,Tomography, X-Ray Computed ,Test data - Abstract
Background Owing to the COVID-19 pandemic and the imminent collapse of health care systems following the exhaustion of financial, hospital, and medicinal resources, the World Health Organization changed the alert level of the COVID-19 pandemic from high to very high. Meanwhile, more cost-effective and precise COVID-19 detection methods are being preferred worldwide. Objective Machine vision–based COVID-19 detection methods, especially deep learning as a diagnostic method in the early stages of the pandemic, have been assigned great importance during the pandemic. This study aimed to design a highly efficient computer-aided detection (CAD) system for COVID-19 by using a neural search architecture network (NASNet)–based algorithm. Methods NASNet, a state-of-the-art pretrained convolutional neural network for image feature extraction, was adopted to identify patients with COVID-19 in their early stages of the disease. A local data set, comprising 10,153 computed tomography scans of 190 patients with and 59 without COVID-19 was used. Results After fitting on the training data set, hyperparameter tuning, and topological alterations of the classifier block, the proposed NASNet-based model was evaluated on the test data set and yielded remarkable results. The proposed model's performance achieved a detection sensitivity, specificity, and accuracy of 0.999, 0.986, and 0.996, respectively. Conclusions The proposed model achieved acceptable results in the categorization of 2 data classes. Therefore, a CAD system was designed on the basis of this model for COVID-19 detection using multiple lung computed tomography scans. The system differentiated all COVID-19 cases from non–COVID-19 ones without any error in the application phase. Overall, the proposed deep learning–based CAD system can greatly help radiologists detect COVID-19 in its early stages. 
During the COVID-19 pandemic, the use of a CAD system as a screening tool would accelerate disease detection and prevent the loss of health care resources.
- Published
- 2021
47. Deep Learning-Based Classification of Raw Hydroacoustic Signal: A Review.
- Author
-
Lin, Xu, Dong, Ruichun, and Lv, Zhichao
- Subjects
DEEP learning ,CLASSIFICATION algorithms ,FEATURE extraction ,SIGNAL processing ,CLASSIFICATION - Abstract
Underwater target recognition is a crucial research component for realizing crewless underwater detection missions and has significant prospects in both civil and military applications. This paper provides a comprehensive description of the current state of deep-learning methods for raw hydroacoustic data classification, focusing mainly on the variety and recognition of vessels and environmental noise from raw hydroacoustic data. This work not only describes the latest research progress in this field but also summarizes three main elements of the current stage of development: feature extraction in the time and frequency domains, data enhancement by neural networks, and feature classification based on deep learning. In this paper, we analyze and discuss the process of hydroacoustic signal processing; demonstrate that feature fusion can be used in the pre-processing stage of classification and recognition algorithms based on raw hydroacoustic data, which can significantly improve target recognition accuracy; show that data enhancement algorithms can improve recognition efficiency in complex environments in terms of deep-learning network structure; and discuss the field's future development directions. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. A Deep Learning Approach for Detecting Stroke from Brain CT Images Using OzNet.
- Author
-
Ozaltin, Oznur, Coskun, Orhan, Yeniay, Ozgur, and Subasi, Abdulhamit
- Subjects
DEEP learning ,COMPUTED tomography ,FISHER discriminant analysis ,MACHINE learning ,BRAIN imaging ,CONVOLUTIONAL neural networks - Abstract
A brain stroke is a life-threatening medical disorder caused by inadequate blood supply to the brain. After a stroke, the damaged area of the brain does not operate normally, so early detection is crucial for more effective therapy. Computed tomography (CT) images supply a rapid diagnosis of brain stroke. However, analyzing each brain CT image costs doctors valuable time, which may result in delayed treatment and errors. Therefore, we targeted the use of an efficient artificial intelligence algorithm for stroke detection. In this paper, we designed hybrid algorithms that combine a new convolutional neural network (CNN) architecture called OzNet with various machine learning algorithms for binary classification of real brain stroke CT images. Classifying the dataset with OzNet alone already achieved successful performance; to improve on this, we combined OzNet with the minimum Redundancy Maximum Relevance (mRMR) method and with Decision Tree (DT), k-Nearest Neighbors (kNN), Linear Discriminant Analysis (LDA), Naïve Bayes (NB), and Support Vector Machines (SVM). Specifically, 4096 significant features were obtained from the fully connected layer of OzNet, and their dimension was reduced from 4096 to 250 using the mRMR method. Finally, these machine learning algorithms classified the important features. As a result, OzNet-mRMR-NB was an excellent hybrid algorithm, achieving an accuracy of 98.42% and an AUC of 0.99 for detecting stroke from brain CT images. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
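The reduce-then-classify step in the abstract above (mRMR feature selection followed by Naive Bayes) can be sketched with synthetic data. This is a simplified illustration, not the paper's implementation: it uses absolute Pearson correlation as a stand-in for the mutual information that mRMR proper computes, and the feature dimensions and class shifts below are invented for the toy example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for deep features (e.g. a fully connected layer's output), 10 dims.
n = 200
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 10))
X[:, 0] += 3.0 * y      # strongly informative feature
X[:, 1] += 2.0 * y      # informative but partly redundant with feature 0

def mrmr_select(X, y, k):
    """Greedy minimum-Redundancy Maximum-Relevance selection, using absolute
    Pearson correlation as a simple stand-in for mutual information."""
    p = X.shape[1]
    rel = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(p)])
    selected = [int(rel.argmax())]           # start with the most relevant feature
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(p):
            if j in selected:
                continue
            # penalize features correlated with those already chosen
            red = np.mean([abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) for s in selected])
            if rel[j] - red > best_score:
                best, best_score = j, rel[j] - red
        selected.append(best)
    return selected

def gaussian_nb_predict(Xtr, ytr, Xte):
    """Minimal Gaussian Naive Bayes: per-class feature means/variances, log-posterior vote."""
    scores = []
    for c in (0, 1):
        Xc = Xtr[ytr == c]
        mu, var = Xc.mean(axis=0), Xc.var(axis=0) + 1e-9
        ll = -0.5 * (np.log(2 * np.pi * var) + (Xte - mu) ** 2 / var).sum(axis=1)
        scores.append(ll + np.log(len(Xc) / len(Xtr)))
    return (scores[1] > scores[0]).astype(int)

cols = mrmr_select(X, y, k=3)                # reduce 10 features to 3
split = 150
pred = gaussian_nb_predict(X[:split][:, cols], y[:split], X[split:][:, cols])
acc = float((pred == y[split:]).mean())
```

The relevance-minus-redundancy score is what distinguishes mRMR from plain top-k relevance ranking: a feature that duplicates an already-selected one gets penalized even if it correlates well with the label.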
49. SelfMatch: Robust semisupervised time‐series classification with self‐distillation.
- Author
-
Xing, Huanlai, Xiao, Zhiwen, Zhan, Dawei, Luo, Shouxi, Dai, Penglin, and Li, Ke
- Subjects
SUPERVISED learning ,DEEP learning ,FEATURE extraction ,CLASSIFICATION ,DATA extraction ,DATA mining - Abstract
Over the years, a number of semisupervised deep-learning algorithms have been proposed for time-series classification (TSC). In semisupervised deep learning, from the point of view of the representation hierarchy, semantic information extracted at lower levels is the basis of that extracted at higher levels. The authors ask whether high-level semantic information is, in turn, also helpful for capturing low-level semantic information. This paper studies this problem and proposes SelfMatch, a robust semisupervised model with self-distillation (SD) that simplifies existing semisupervised learning (SSL) techniques for TSC. SelfMatch hybridizes supervised learning, unsupervised learning, and SD. In unsupervised learning, SelfMatch applies pseudolabeling to feature extraction on labeled data: a weakly augmented sequence is used as a target to guide the prediction of a Timecut-augmented version of the same sequence. SD promotes knowledge flow from higher to lower levels, guiding the extraction of low-level semantic information. This paper also designs a feature extractor for TSC, called ResNet-LSTMaN, responsible for feature and relation extraction. The experimental results show that SelfMatch achieves excellent SSL performance on 35 widely adopted UCR2018 data sets compared with a number of state-of-the-art semisupervised and supervised algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
50. Rapid tri-net: breast cancer classification from histology images using rapid tri-attention network.
- Author
-
Salunkhe, Pallavi Bhanudas and Patil, Pravin Sahebrao
- Subjects
RGB color model ,CAPSULE neural networks ,HEALTH facilities ,CELL anatomy ,FEATURE extraction ,DEEP learning ,FEATURE selection - Abstract
Nowadays, people all over the world face problems related to the deadly disease of breast cancer. Research on breast cancer detection with existing techniques shows poorer detection results due to the shortage of medical facilities and manual detection procedures, and diagnosis becomes very difficult once the disease reaches a critical or chronic stage. Early detection plays an essential role in accurate detection and effective treatment of breast cancer, which further minimizes the death rate. Therefore, automated breast cancer detection based on deep learning using histopathological images is proposed in this paper. The proposed approach involves five steps: image filtering, dual-stage segmentation, feature extraction, feature selection, and classification. Initially, image filtering performs image resizing, noise elimination, and contrast enhancement: the Weighted Guided Image Filtering (Weighted GIF) approach is used for noise removal, and Transformed Optimal Gamma Correction (TOGC) is used for contrast enhancement. To obtain cellular structures from histology/histopathology images, dual-stage segmentation using Superpixel Mixed Clustering (SMC) is applied. Feature extraction is then performed with the Gray Level Co-occurrence Matrix in three-dimensional space (GLCM-3D) and the RGB color model to extract texture and color features, and the most significant features are selected using Stochastic Diffusion Dynamic Optimization (SDDO). Finally, breast histology images are classified using the Rapid Tri-Attention Residual Dense Capsule Network with Aquila Optimization (Rapid Tri-Net), which categorizes the histology images into various classes. The proposed approach is simulated on the Python platform using the BreakHis and BACH datasets, and performance is evaluated in terms of f-measure, recall, accuracy, precision, and specificity. 
Rapid Tri-Net's performance is compared with recent prevailing frameworks to attain a fair comparison, and the simulation results clearly show that the proposed Rapid Tri-Net performs better than the existing approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
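The contrast-enhancement step mentioned in the abstract above builds on gamma correction. The paper's Transformed Optimal Gamma Correction (TOGC) is not described in enough detail here to reproduce, so this sketch shows only plain gamma correction, the base operation; the fixed gamma of 0.5 and the toy 2x2 image are assumptions:

```python
import numpy as np

def gamma_correct(img: np.ndarray, gamma: float) -> np.ndarray:
    """Plain gamma correction on an 8-bit image: scale to [0, 1], raise to the
    power gamma, and scale back. gamma < 1 brightens dark regions; gamma > 1
    darkens bright regions."""
    img01 = img.astype(float) / 255.0
    return (img01 ** gamma * 255.0).astype(np.uint8)

# Toy underexposed 2x2 patch
dark = np.array([[10, 40],
                 [80, 200]], dtype=np.uint8)
brightened = gamma_correct(dark, gamma=0.5)
```

Optimized variants like TOGC differ mainly in how gamma is chosen per image rather than fixed by hand, which is why contrast enhancement precedes segmentation in pipelines like this one.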