309 results for "features fusion"
Search Results
2. A Novel Method for Autism Identification Based on Multi-atlas Features Fusion and Graph Neural Network
- Author:
Tuerxun, Palidan, Gu, Jian, Chen, Jiaying, Li, Xinhui, Hu, Yue, Liu, Jin, Qian, Yurong, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Lin, Zhouchen, editor, Cheng, Ming-Ming, editor, He, Ran, editor, Ubul, Kurban, editor, Silamu, Wushouer, editor, Zha, Hongbin, editor, Zhou, Jie, editor, and Liu, Cheng-Lin, editor
- Published: 2025
3. Advanced Deep Learning Fusion Model for Early Multi-Classification of Lung and Colon Cancer Using Histopathological Images.
- Author:
Abd El-Aziz, A. A., Mahmood, Mahmood A., and Abd El-Ghany, Sameh
- Subjects: LUNG cancer, COLON cancer, DIGITAL image processing, EARLY detection of cancer, DEEP learning
- Abstract:
Background: In recent years, the healthcare field has experienced significant advancements. New diagnostic techniques, treatments, and insights into the causes of various diseases have emerged. Despite these progressions, cancer remains a major concern. It is a widespread illness affecting individuals of all ages and leads to one out of every six deaths. Lung and colon cancer alone account for nearly two million fatalities. Though it is rare for lung and colon cancers to co-occur, the spread of cancer cells between these two areas—known as metastasis—is notably high. Early detection of cancer greatly increases survival rates. Currently, histopathological image (HI) diagnosis and appropriate treatment are key methods for reducing cancer mortality and enhancing survival rates. Digital image processing (DIP) and deep learning (DL) algorithms can be employed to analyze the HIs of five different types of lung and colon tissues. Methods: Therefore, this paper proposes a refined DL model that integrates feature fusion for the multi-classification of lung and colon cancers. The proposed model incorporates three DL architectures: ResNet-101V2, NASNetMobile, and EfficientNet-B0. Each model has limitations concerning variations in the shape and texture of input images. To address this, the proposed model utilizes a concatenate layer to merge the pre-trained individual feature vectors from ResNet-101V2, NASNetMobile, and EfficientNet-B0 into a single feature vector, which is then fine-tuned. As a result, the proposed DL model achieves high success in multi-classification by leveraging the strengths of all three models to enhance overall accuracy. This model aims to assist pathologists in the early detection of lung and colon cancer with reduced effort, time, and cost. The proposed DL model was evaluated using the LC25000 dataset, which contains colon and lung HIs. The dataset was pre-processed using resizing and normalization techniques. Results: The model was tested and compared with recent DL models, achieving impressive results: 99.8% for precision, 99.8% for recall, 99.8% for F1-score, 99.96% for specificity, and 99.94% for accuracy. Conclusions: Thus, the proposed DL model demonstrates exceptional performance across all classification categories. [ABSTRACT FROM AUTHOR]
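As a hedged illustration of the fusion step this abstract describes, the minimal Keras sketch below concatenates three pre-extracted, globally pooled backbone feature vectors (dimensions per the standard ResNet-101V2, NASNetMobile, and EfficientNet-B0 outputs) into one vector for fine-tuning; the dropout, head size, and optimizer are illustrative assumptions, not the authors' exact configuration.

```python
from tensorflow.keras import layers, Model

# Inputs stand for pre-extracted, globally pooled backbone features.
f_resnet = layers.Input(shape=(2048,))   # ResNet-101V2
f_nasnet = layers.Input(shape=(1056,))   # NASNetMobile
f_effnet = layers.Input(shape=(1280,))   # EfficientNet-B0

# A Concatenate layer merges the three vectors into a single fused
# feature vector, which is then fine-tuned through a small head.
fused = layers.Concatenate()([f_resnet, f_nasnet, f_effnet])
x = layers.Dropout(0.3)(fused)
out = layers.Dense(5, activation="softmax")(x)  # 5 LC25000 tissue classes

model = Model([f_resnet, f_nasnet, f_effnet], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```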
- Published: 2024
4. The real-time dynamic liquid level calculation method of the sucker rod well based on multi-view features fusion.
- Author:
Cheng-Zhe Yin, Kai Zhang, Jia-Yuan Liu, Xin-Yan Wang, Min Li, Li-Ming Zhang, and Wen-Sheng Zhou
- Subjects: MISSING data (Statistics), OIL fields, DYNAMOMETER, LIQUIDS, GENERALIZATION
- Abstract:
In the production of the sucker rod well, the dynamic liquid level is important for production efficiency and safety in the lifting process. It is influenced by multi-source data, which must be combined for real-time calculation of the dynamic liquid level. In this paper, the multi-source data are regarded as different views, including the load of the sucker rod and liquid in the wellbore, the image of the dynamometer card, and production dynamics parameters. These views can be fused by a multi-branch neural network with a special fusion layer. With this method, the features of the different views can be extracted while accounting for the differences in modality and physical meaning between them. The extraction results, selected by multinomial sampling, then serve as the input of the fusion layer. During fusion, the availability of each view determines whether it is fused in the fusion layer or not. In this way, the correlation between the views is considered and missing data are processed automatically. The results show that the load and production features fusion (the method proposed in this paper) performs best, with the lowest mean absolute error (MAE) of 39.63 m, followed by features concatenation with an MAE of 42.47 m. Both perform better than any single view, and the lower MAE of the features fusion indicates that its generalization ability is stronger. In contrast, the image feature, which has the highest MAE as a single view, contributes little to the accuracy improvement after being fused with the other views. When data are missing in some views, the multi-view features fusion, unlike features concatenation, does not render a large number of samples unusable. When the missing rate is 10%, 30%, 50% and 80%, the method proposed in this paper reduces MAE by 5.8, 7, 9.3 and 20.3 m, respectively. In general, the multi-view features fusion method proposed in this paper improves accuracy markedly and processes missing data effectively, which helps provide technical support for real-time monitoring of the dynamic liquid level in oil fields. [ABSTRACT FROM AUTHOR]
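A minimal sketch of the availability idea described above, assuming (our assumption, not the paper's architecture) that each view has already been embedded to a common width: a 0/1 availability mask decides which views enter the fusion, so a missing view is skipped rather than invalidating the sample.

```python
import tensorflow as tf
from tensorflow.keras import layers

class AvailabilityFusion(layers.Layer):
    """Toy fusion layer: per-view embeddings are averaged under a 0/1
    availability mask, so a missing view is simply excluded from the
    fusion instead of invalidating the whole sample."""

    def call(self, embeddings, avail):
        # embeddings: (batch, n_views, dim); avail: (batch, n_views)
        avail = tf.expand_dims(avail, -1)                       # (batch, n_views, 1)
        summed = tf.reduce_sum(embeddings * avail, axis=1)      # sum of present views
        count = tf.maximum(tf.reduce_sum(avail, axis=1), 1.0)   # avoid divide-by-zero
        return summed / count                                   # masked mean

emb = tf.random.normal((8, 3, 64))        # 3 views, 64-d embeddings (toy)
mask = tf.constant([[1., 0., 1.]] * 8)    # e.g., the image view is missing
fused = AvailabilityFusion()(emb, mask)   # shape (8, 64)
```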
- Published: 2024
5. Containment Control-Guided Boundary Information for Semantic Segmentation.
- Author:
Liu, Wenbo, Zhang, Junfeng, Zhao, Chunyu, Huang, Yi, Deng, Tao, and Yan, Fei
- Subjects: COMPUTER vision, ACCURACY of information, SPEED, DESIGN
- Abstract:
Real-time semantic segmentation is a challenging task in computer vision, especially in complex scenes. In this study, a novel three-branch semantic segmentation model is designed, aiming to effectively use boundary information to improve the accuracy of semantic segmentation. The proposed model introduces the concept of containment control in a pioneering way, which treats image interior elements as well as image boundary elements as followers and leaders in containment control, respectively. Based on this, we utilize two learnable feature fusion matrices in the high-level semantic information stage of the model to quantify the fusion process of internal and boundary features. Further, we design a dedicated loss function to update the parameters of the feature fusion matrices based on the criterion of containment control, which enables fine-grained communication between target features. In addition, our model incorporates a Feature Enhancement Unit (FEU) to tackle the challenge of maximizing the utility of multi-scale features essential for semantic segmentation tasks through the meticulous reconstruction of these features. The proposed model proves effective on the publicly available Cityscapes and CamVid datasets, achieving a trade-off between effectiveness and speed. [ABSTRACT FROM AUTHOR]
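A rough sketch of the learnable-fusion idea, under our simplifying assumption of per-channel weights standing in for the paper's fusion matrices; the containment-control loss that updates these weights is omitted.

```python
from tensorflow.keras import layers

class LearnableFusion(layers.Layer):
    """Fuses interior and boundary feature maps through two learnable
    per-channel weight vectors (stand-ins for the paper's two feature
    fusion matrices)."""

    def build(self, input_shape):
        channels = input_shape[0][-1]
        self.w_interior = self.add_weight(shape=(channels,),
                                          initializer="ones", name="w_interior")
        self.w_boundary = self.add_weight(shape=(channels,),
                                          initializer="ones", name="w_boundary")

    def call(self, inputs):
        interior, boundary = inputs   # both (batch, H, W, C)
        return self.w_interior * interior + self.w_boundary * boundary
```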
- Published: 2024
6. Image Captioning with Masked Diffusion Model
- Author:
Tian, Weidong, Xu, Wenzheng, Zhao, Junxiang, Zhao, Zhongqiu, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Chen, Wei, editor, and Pan, Yijie, editor
- Published: 2024
7. Fault Diagnosis of Vessel Motor Bearing Based on Multi-feature Fusion in Time Domain and Frequency Domain
- Author:
Xue, Zhengyu, Wang, Xin, Ma, Yi-fang, Wang, Zi-qi, Qiu, Chidong, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Yang, Qingxin, editor, Li, Zewen, editor, and Luo, An, editor
- Published: 2024
8. A dual path hybrid neural network framework for remaining useful life prediction of aero‐engine.
- Author:
Lu, Xinhua, Pan, Haobo, Zhang, Lingxiao, Ma, Li, and Wan, Hui
- Subjects: REMAINING useful life, TURBOFAN engines, COMPUTATIONAL complexity, FEATURE extraction, FORECASTING, TIME series analysis
- Abstract:
Predicting the remaining useful life (RUL) of an engine is one of the key tasks of prognostics and health management (PHM). Modern mechanical equipment typically operates under complex operating conditions and fault modes, leading to dispersed distributions of sensor data and challenges for feature extraction. To improve the accuracy of RUL prediction in such complex scenarios, this paper proposes a multi-scale convolutional network and bidirectional gated recurrent unit (MSC-BiGRU) model under a dual-path framework with temporal attention. Specifically, the multi-scale CNN in the first path learns complex features, and the Swish activation function is used to improve the prediction ability of the network; the bidirectional gated recurrent unit (BiGRU) in the second path can handle both forward and backward time series, and adaptively captures the importance of outputs at different times using temporal attention, enhancing the model's feature extraction ability in the temporal dimension. A feature fusion mechanism is developed to connect the two paths in parallel, overcoming the overfitting and high computational complexity of deep, complex models. We verify the effectiveness of the proposed method on a simulated turbofan engine dataset; in particular, on the FD002 and FD004 subsets with complex operating conditions and fault modes, the RMSE values were reduced by 17.37% and 9.97%, respectively, compared to BiGRU-TSAM. [ABSTRACT FROM AUTHOR]
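A minimal Keras sketch of a dual-path model in the spirit of this abstract, not the authors' exact MSC-BiGRU: a multi-scale Conv1D path with Swish activations and a BiGRU path with a simple temporal-attention pooling, fused in parallel; the window length, sensor count, and layer widths are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(30, 14))   # 30 time steps x 14 sensors (C-MAPSS-like)

# Path 1: multi-scale convolutions with Swish activations
convs = [layers.Conv1D(32, k, padding="same", activation="swish")(inp)
         for k in (3, 5, 7)]
p1 = layers.GlobalAveragePooling1D()(layers.Concatenate()(convs))

# Path 2: BiGRU with a simple temporal-attention pooling
h = layers.Bidirectional(layers.GRU(32, return_sequences=True))(inp)
alpha = layers.Softmax(axis=1)(layers.Dense(1)(h))    # attention over time steps
p2 = layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, alpha])

# Parallel feature fusion of the two paths, then RUL regression
rul = layers.Dense(1)(layers.Concatenate()([p1, p2]))
model = Model(inp, rul)
model.compile(optimizer="adam", loss="mse")
```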
- Published: 2024
9. Attention guided approach for food type and state recognition.
- Author:
Alahmari, Saeed S., Gardner, Michael R., and Salem, Tawfiq
- Subjects: CONVOLUTIONAL neural networks, COMPUTER vision, DEEP learning, IMAGE segmentation, NUTRITIONAL requirements
- Abstract:
Advancements in computer vision have resulted in significant breakthroughs across various applications, and one notable area of progress is in the recognition of food ingredient types and states. The identification of food items, distinguishing between types like oranges or apples, and assessing their states, whether whole, peeled, sliced, or juiced, is a pivotal task with far-reaching implications for fields such as food safety, recipe analysis, and restaurant quality control. This paper introduces an innovative approach to food type and state recognition that capitalizes on attention mechanisms and incorporates mask fusion to improve the accuracy and robustness of the recognition process. We evaluate the proposed approach through quantitative and qualitative analyses and comparisons to previous methods. The results consistently demonstrate that our proposed approach, integrating attention mechanisms, outperforms baseline and state-of-the-art methods, achieving an accuracy of 87.11%. This achievement signifies a step forward in refining food image segmentation models and reinforces the applicability of advanced techniques in real-world scenarios. [ABSTRACT FROM AUTHOR]
- Published: 2024
10. Human Gait Recognition by using Two Stream Neural Network along with Spatial and Temporal Features.
- Author:
Mehmood, Asif, Amin, Javeria, Sharif, Muhammad, and Kadry, Seifedine
- Subjects: GAIT in humans, GENETIC algorithms, OPTICAL flow, FEATURE extraction
- Abstract:
• Design and train a 55-layer CNN model on the CIFAR-100 dataset. • Build a two-stream network to extract spatial and temporal features from the images. • Feature optimization is performed using a genetic algorithm. Human Gait Recognition (HGR) is a biometric tactic broadly used to recognize an individual by their walking pattern. Key factors such as angle variation, clothing variation, foot shadows, and carrying conditions affect human gait. In this work, a new approach is proposed for HGR that contains five major steps. In the first step, the video data is converted into image frames. In the second step, RGB-to-grayscale conversion is carried out. After that, a two-stream network is designed using a 55-layer CNN model called CNN-55, which is designed from scratch and trained on the CIFAR-100 dataset with selected hyperparameters. This pre-trained CNN-55 is used to build the two-stream network. In Stream-1, optical-flow frames are obtained by the Horn and Schunck algorithm and fed into CNN-55 to extract temporal features. In Stream-2, the grayscale frames are fed to the CNN-55 model for extraction of spatial features. After that, both vectors are serially fused. In the fourth step, the fused feature vector is fed into the genetic algorithm for optimization. Finally, the feature vector is fed into the One-Versus-All SVM classifier for recognition. The system is tested on all CASIA-B angles (0°, 18°, 36°, 54°, 72°, 90°, 108°, 126°, 144°, 162°, and 180°), providing accuracies of 97.10%, 96.80%, 94.60%, 98.0%, 98.30%, 96.80%, 97.60%, 96.90%, 99.60%, 96.80%, and 97.60%, respectively. The proposed method produces better outcomes compared to recent techniques. [ABSTRACT FROM AUTHOR]
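A hedged sketch of the serial-fusion-plus-classifier step (the genetic-algorithm optimization in between is omitted); the feature arrays are random stand-ins for the CNN-55 outputs of the two streams.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

spatial = np.random.rand(200, 512)    # Stream-2: grayscale-frame features (toy)
temporal = np.random.rand(200, 512)   # Stream-1: optical-flow features (toy)
labels = np.random.randint(0, 11, 200)

fused = np.concatenate([spatial, temporal], axis=1)   # serial fusion
clf = OneVsRestClassifier(SVC()).fit(fused, labels)   # one-versus-all SVM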
- Published: 2024
11. Forest Fire Smoke Detection Based on Multiple Color Spaces Deep Feature Fusion.
- Author:
Han, Ziqi, Tian, Ye, Zheng, Change, and Zhao, Fengjun
- Subjects: COLOR space, FOREST fires, WILDFIRE prevention, SMOKE, HUMAN ecology
- Abstract:
The drastic increase in forest fire occurrence, which in recent years has posed severe threats and damage worldwide to the natural environment and human society, necessitates early smoke detection of forest fires. First, a semantic segmentation method based on multiple-color-space feature fusion is put forward for forest fire smoke detection. Considering that smoke images in different color spaces may contain varied and distinctive smoke features that are beneficial for improving the detection ability of a model, the proposed model integrates multi-scale, multi-type self-adaptive weighted feature fusion with attention augmentation to extract enriched and complementary fused smoke features, utilizing smoke images from multiple color spaces as inputs. Second, the model is trained and evaluated on part of the FIgLib dataset, which contains high-quality smoke images from watchtowers in forests covering various smoke types and complex background conditions, with a satisfactory smoke segmentation result for forest fire detection. Finally, the optimal color-space combination and fusion strategy for the model are determined through elaborate and extensive experiments, with a superior smoke segmentation result of 86.14 IoU obtained. [ABSTRACT FROM AUTHOR]
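For intuition, the short OpenCV sketch below builds a multi-color-space input stack for one image; the particular spaces and the file name are illustrative assumptions (the paper determines the optimal combination experimentally).

```python
import cv2
import numpy as np

bgr = cv2.imread("smoke.jpg")                 # hypothetical image path
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
ycc = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)

# Stack the color spaces channel-wise into one (H, W, 12) network input.
multi_space = np.concatenate([bgr, hsv, lab, ycc], axis=-1)
```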
- Published: 2024
12. Predicting lncRNA–protein interactions through deep learning framework employing multiple features and random forest algorithm
- Author:
Ying Liang, XingRui Yin, YangSen Zhang, You Guo, and YingLong Wang
- Subjects: LncRNA–protein interactions, Multiple features, Random forest algorithm, Features fusion, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
- Abstract:
RNA-protein interaction (RPI) is crucial to the life processes of diverse organisms. Various researchers have identified RPIs through long-term, high-cost biological experiments. Although numerous machine learning and deep learning-based methods for predicting RPIs currently exist, their robustness and generalizability have significant room for improvement. This study proposes LPI-MFF, an RPI prediction model based on multi-source information fusion, to address these issues. LPI-MFF employs protein–protein interaction features, sequence features, secondary-structure features, and physical and chemical properties as the information sources, each with a corresponding coding scheme, followed by the random forest algorithm for feature screening. Finally, all information is combined, and a classification method based on convolutional neural networks is used. The experimental results of fivefold cross-validation demonstrated that the accuracy of LPI-MFF on RPI1807 and NPInter was 97.60% and 97.67%, respectively. In addition, the accuracy on the independent test set RPI1168 was 84.9%, and the accuracy on the Mus musculus dataset was 90.91%. Accordingly, LPI-MFF demonstrated greater robustness and generalization than other prevalent RPI prediction methods.
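A minimal sketch of random-forest feature screening as described: rank the fused multi-source features by importance and keep the top ones before the CNN classifier. The feature matrix, labels, and cutoff of 128 are toy assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.random.rand(500, 300)       # toy fused multi-source features
y = np.random.randint(0, 2, 500)   # interacting / non-interacting pairs

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
keep = np.argsort(rf.feature_importances_)[::-1][:128]   # top-128 features
X_screened = X[:, keep]            # passed on to the CNN classifier
```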
- Published: 2024
13. A Novel Fusion Technique for Early Detection of Alopecia Areata Using ResNet-50 and CRSHOG
- Author:
Haider Ali Khan and Syed M. Adnan
- Subjects: Alopecia areata, feature extraction, features fusion, computer vision, deep learning, corner rhombus shape HOG (CRSHOG), Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract:
Alopecia Areata is an autoimmune disorder in which the body's immune system attacks normal cells instead of intruders, leading to hair loss. If not detected early, it can progress to complete scalp baldness (Alopecia Totalis) or total body hair loss (Alopecia Universalis). Therefore, early detection of Alopecia Areata is crucial. Computer vision and deep learning techniques have been used for the last few years in the field of dermatology to detect relevant diseases. We propose a robust feature fusion technique, named AlopeciaDet, for the timely detection of Alopecia Areata using camera images instead of dermoscopic images that require specialized equipment. AlopeciaDet combines Corner Rhombus Shape HOG (CRSHOG) features with those extracted from the ResNet-50 pre-trained model to detect Alopecia Areata with high accuracy on the Dermenet dataset. The geometric properties of rhombus shapes make them useful for recognizing patterns in an image, while HOG captures local object appearance and shape by computing the distribution of intensity gradients in localized portions of the image; we combined these characteristics into CRSHOG. Alopecia Areata, in turn, is characterized by distinctive patterns and shapes of hair loss, most commonly round or oval patches that vary in size and usually have well-defined, sharp edges. Consequently, our proposed CRSHOG significantly improves the extraction of local information from images of affected areas. It achieves this by integrating sign and magnitude data, thereby enhancing discrimination capability for texture classification tasks; the magnitudes and directions of the pixel values are then calculated. We achieved an accuracy of 99.45% with an error rate of 0.55% using an artificial neural network. These results surpass the accuracy of current state-of-the-art techniques in this field.
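As a hedged illustration of fusing a handcrafted descriptor with deep features, the sketch below concatenates a standard HOG vector (standing in for the paper's CRSHOG variant) with a stand-in for pooled ResNet-50 features; image and vectors are random toys.

```python
import numpy as np
from skimage.feature import hog

img = np.random.rand(224, 224)          # toy grayscale scalp image
hog_vec = hog(img, orientations=9, pixels_per_cell=(16, 16),
              cells_per_block=(2, 2))   # handcrafted texture/shape features
deep_vec = np.random.rand(2048)         # stand-in for pooled ResNet-50 features

fused = np.concatenate([hog_vec, deep_vec])   # vector fed to the ANN classifier
```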
- Published: 2024
14. A Unified Super-Resolution Framework of Remote-Sensing Satellite Images Classification Based on Information Fusion of Novel Deep Convolutional Neural Network Architectures
- Author:
Hussain Mubarak Albarakati, Shams ur Rehman, Muhammad Attique Khan, Ameer Hamza, Junaid Aftab, Areej Alasiry, Mehrez Marzougui, Michele Nappi, and Yunyoung Nam
- Subjects: Augmentation, classification, custom deep model, features fusion, land cover, optimization, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract:
Land-use and land-cover (LULC) classification is an active research challenge in the area of remotely sensed satellite images due to critical applications such as resource management and agriculture. Deep learning has recently shown significant improvement in LULC classification using satellite images; however, complex and similar patterns in the images make the classification process more challenging. This article proposes a new information-fused framework for LULC classification from remotely sensed imaging data. The proposed framework consists of two phases: training and testing. An augmentation process is conducted in the training phase to resolve the class-imbalance issue. In the next step, two novel convolutional neural network architectures are proposed, one based on six residual blocks (ResSAN6) and one based on six inverted blocks (RS-IRSAN). The designed models are trained from scratch, with the hyperparameters initialized using the Bayesian optimization algorithm. In the testing phase, the trained models are evaluated: testing-set images are employed, and deep features are extracted from the self-attention layer. A novel mutual information-based serial fusion approach is proposed that combines both models' features, and variation in the features is removed using median normalization. The feature fusion's computational time and precision rates are thereby improved, and the result is further optimized using an arithmetic optimization (AO) algorithm. The best information features are selected and finally classified using a shallow wide neural network. The experimental process of the proposed framework was performed on three datasets, RSI-CB128, WHU-RS19, and NWPU_RESISC45, achieving accuracies of 95.7%, 97.5%, and 92.0%, respectively. Comparing the results with recent related works, the proposed framework shows improved accuracy and precision rates.
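A loose sketch of mutual-information-guided serial fusion with median normalization, following the abstract's wording rather than the authors' code; feature matrices, label count, and the top-200 cutoff are toy assumptions.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

f1 = np.random.rand(100, 256)      # toy ResSAN6 features
f2 = np.random.rand(100, 256)      # toy RS-IRSAN features
y = np.random.randint(0, 19, 100)  # e.g., WHU-RS19 class labels

fused = np.concatenate([f1, f2], axis=1)         # serial fusion
fused -= np.median(fused, axis=0)                # median normalization
mi = mutual_info_classif(fused, y, random_state=0)
selected = fused[:, np.argsort(mi)[::-1][:200]]  # keep the most informative
```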
- Published: 2024
15. Predicting lncRNA–protein interactions through deep learning framework employing multiple features and random forest algorithm
- Author:
Liang, Ying, Yin, XingRui, Zhang, YangSen, Guo, You, and Wang, YingLong
- Published: 2024
16. Prosperous Human Gait Recognition: an end-to-end system based on pre-trained CNN features selection.
- Author:
Mehmood, Asif, Khan, Muhammad Attique, Sharif, Muhammad, Khan, Sajid Ali, Shaheen, Muhammad, Saba, Tanzila, Riaz, Naveed, and Ashraf, Imran
- Abstract:
Human Gait Recognition (HGR) is a biometric approach that has been widely used for security purposes for the past few decades. In HGR, changes in an individual's walk, along with worn clothes and carried bags, are major covariate conditions that impact the performance of a system. Moreover, recognition under various view angles is another key challenge in HGR. In this work, a novel fully automated method is proposed for HGR under various view angles using deep learning. Four primary steps are involved: preprocessing of the original video frames, exploiting a pre-trained DenseNet-201 CNN model for feature extraction, reduction of redundant features from the extracted vector based on a hybrid selection method, and finally recognition using supervised learning methods. The extraction of CNN features is a key step in which our target is to extract the most active features. To achieve this goal, we fuse the features of both the second-last and third-last layers in a parallel process. At a later stage, the best features are selected by the Firefly algorithm and a skewness-based approach. These selected features are serially combined and fed to a One-Against-All Multi Support Vector Machine (OAMSVM) for final recognition. Three different angles of the CASIA B dataset, 18°, 36°, and 54°, are selected for the evaluation process, and accuracies of 94.3%, 93.8%, and 94.7% are achieved, respectively. Results show significant improvement in accuracy and recall rate compared to existing state-of-the-art techniques. [ABSTRACT FROM AUTHOR]
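A toy sketch of skewness-based feature screening in the spirit of the abstract; the feature matrix and the median threshold are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np
from scipy.stats import skew

X = np.random.rand(300, 1920)        # toy DenseNet-201 feature matrix
s = np.abs(skew(X, axis=0))          # per-feature skewness across samples
X_selected = X[:, s > np.median(s)]  # keep the more strongly skewed half
```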
- Published: 2024
17. Offline signature verification system: a novel technique of fusion of GLCM and geometric features using SVM.
- Author:
Batool, Faiza Eba, Attique, Muhammad, Sharif, Muhammad, Javed, Kashif, Nazir, Muhammad, Abbasi, Aaqif Afzaal, Iqbal, Zeshan, and Riaz, Naveed
- Abstract:
In the area of digital biometric systems, the handwritten signature plays a key role in authenticating a person based on their original samples. In offline signature verification (OSV), several problems make it challenging for a digital system to verify whether a signature is authentic or forged. Correct signature verification improves the security of people, systems, and services. It is applied to uniquely identify an individual based on pen motion (up and down), signature speed, and the shape of loops. In this work, a multi-level features fusion and optimal features selection-based automatic technique is proposed for OSV. For this purpose, twenty-two Gray Level Co-occurrence Matrix (GLCM) and eight geometric features are calculated from pre-processed signature samples. These features are fused by a new parallel approach based on a high-priority index feature (HPFI). A skewness-kurtosis-based features selection approach, named skewness-kurtosis controlled PCA (SKcPCA), is also proposed, which selects the optimal features for final classification into forged and genuine signatures. The MCYT, GPDS synthetic, and CEDAR datasets are utilized for validation of the proposed system and show improvement in terms of FAR and FRR compared to existing methods. [ABSTRACT FROM AUTHOR]
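A minimal sketch of extracting GLCM texture descriptors from a signature image with scikit-image; the image is a random stand-in, and the fusion and SKcPCA selection steps are omitted.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

sig = (np.random.rand(128, 256) * 255).astype(np.uint8)   # toy signature image
glcm = graycomatrix(sig, distances=[1], angles=[0, np.pi / 2],
                    levels=256, symmetric=True, normed=True)
texture = np.hstack([graycoprops(glcm, p).ravel()
                     for p in ("contrast", "homogeneity", "energy", "correlation")])
```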
- Published: 2024
18. Human action recognition using fusion of multiview and deep features: an application to video surveillance.
- Author:
Khan, Muhammad Attique, Javed, Kashif, Khan, Sajid Ali, Saba, Tanzila, Habib, Usman, Khan, Junaid Ali, and Abbasi, Aaqif Afzaal
- Abstract:
Human Action Recognition (HAR) has become one of the most active research areas in the domain of artificial intelligence, due to various applications such as video surveillance. The wide range of variation among human actions in daily life makes the recognition process more difficult. In this article, a new fully automated scheme is proposed for human action recognition by fusion of deep neural network (DNN) and multiview features. The DNN features are initially extracted by employing a pre-trained CNN model named VGG19. Subsequently, multiview features are computed from horizontal and vertical gradients, along with vertical directional features. Afterwards, all features are combined in order to select the best features. The best features are selected by employing three parameters, i.e., relative entropy, mutual information, and strong correlation coefficient (SCC), which are used to select the best subset of features through a higher-probability-based threshold function. The final selected features are provided to a Naive Bayes classifier for final recognition. The proposed scheme is tested on five datasets, namely HMDB51, UCF Sports, YouTube, IXMAS, and KTH, and the achieved accuracies were 93.7%, 98%, 99.4%, 95.2%, and 97%, respectively. Lastly, the proposed method is compared with existing techniques; the results show that the proposed scheme outperforms state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published: 2024
19. PEA-YOLO: a lightweight network for static gesture recognition combining multiscale and attention mechanisms.
- Author:
Zhou, Weina and Li, Xile
- Abstract:
Gesture recognition has been widely used in many human–computer interaction applications and is one of the most intuitive and natural ways for humans to communicate with computers. However, it remains a challenging problem due to interference such as varied backgrounds, hand-like objects, and lighting changes. In this article, a lightweight static gesture recognition network, named PEA-YOLO, is put forward. The network adopts the idea of an adaptive spatial feature pyramid and combines an attention mechanism with a multi-path feature fusion method to improve the localization and recognition of gesture features. First, an Efficient Channel Attention module is added after the backbone network to focus the model's attention on the gesture. Second, the Feature Pyramid Network is replaced by a Path Aggregation Network to localize the gesture better. Finally, an Adaptive Spatial Feature Fusion module is added before the YOLO head to further reduce the false detection rate in gesture recognition. Experiments conducted on the OUHANDS and NUSII datasets show that PEA-YOLO achieves favorable performance with only 8.57 M parameters in static gesture recognition. Compared with other state-of-the-art methods, the proposed lightweight network obtains the highest accuracy with much higher speed and fewer parameters. [ABSTRACT FROM AUTHOR]
- Published: 2024
20. Intelligent fusion-assisted skin lesion localization and classification for smart healthcare.
- Author:
Khan, Muhammad Attique, Muhammad, Khan, Sharif, Muhammad, Akram, Tallha, and Kadry, Seifedine
- Subjects: CONVOLUTIONAL neural networks, INFORMATION technology, DISTRIBUTION (Probability theory), IMAGE segmentation, MARGINAL distributions, MACHINE learning
- Abstract:
With the rapid development of information technology, the conception of smart healthcare has progressively come to the fore. Smart healthcare utilizes next-generation technologies, such as artificial intelligence, the Internet of Things (IoT), big data, and cloud computing, to intelligently transform the existing medical system, making it more efficient, more reliable, and personalized. In this work, skin data are collected using dedicated hardware from mobile health units working as nodes. The collected samples are uploaded to the cloud for further processing using a novel multi-modal information fusion framework, which performs skin lesion segmentation followed by classification. The proposed framework has two main functional blocks, segmentation and classification, and in each block we have a performance booster that works on the principle of information fusion. For lesion segmentation, a hybrid framework is proposed, which utilizes the complementary strengths of two convolutional neural network (CNN) architectures to generate the segmented images. The resultant binary images are later fused using the joint probability distribution and marginal distribution function. For lesion classification, a 30-layer CNN architecture is designed, which is trained on the HAM10000 dataset. A novel summation discriminant correlation analysis technique is used to fuse the extracted features from two fully connected layers. To avoid feature redundancy, a feature selection method, "Regular Falsi", is developed, which downsamples the extracted features into lower dimensions. The selected features are finally classified using an extreme learning machine classifier. Five skin benchmark datasets (ISBI2016, ISIC2017, ISBI2018, ISIC2019, and HAM10000) are used to evaluate both the segmentation and classification frameworks using average accuracy, false-negative rate, sensitivity, and computational time, with results that are impressive compared to state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published: 2024
21. Global vision object detection using an improved Gaussian Mixture model based on contour.
- Author:
Lei Sun
- Abstract:
Object detection plays an important role in the field of computer vision. The purpose of object detection is to identify the objects of interest in the image and determine their categories and positions. Object detection has many important applications in various fields. This article addresses the problems of unclear foreground contour in moving object detection and excessive noise points in the global vision, proposing an improved Gaussian mixture model for feature fusion. First, the RGB image was converted into the HSV space, and a mixed Gaussian background model was established. Next, the object area was obtained through background subtraction, residual interference in the foreground was removed using the median filtering method, and morphological processing was performed. Then, an improved Canny algorithm using an automatic threshold from the Otsu method was used to extract the overall object contour. Finally, feature fusion of edge contours and the foreground area was performed to obtain the final object contour. The experimental results show that this method improves the accuracy of the object contour and reduces noise in the object. [ABSTRACT FROM AUTHOR]
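A hedged OpenCV sketch of the pipeline this abstract outlines: a Gaussian-mixture background model applied in HSV space, median filtering and morphology on the foreground, Canny edges with thresholds derived from Otsu's method, and a simple fusion of edges with the foreground region. The video path, kernel size, and threshold ratio are illustrative assumptions.

```python
import cv2

cap = cv2.VideoCapture("scene.mp4")              # hypothetical video path
mog = cv2.createBackgroundSubtractorMOG2()
ok, frame = cap.read()

# Foreground from a Gaussian-mixture background model in HSV space,
# then median filtering and morphological closing.
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
fg = cv2.medianBlur(mog.apply(hsv), 5)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
fg = cv2.morphologyEx(fg, cv2.MORPH_CLOSE, kernel)

# Canny edges with thresholds derived automatically from Otsu's method.
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
otsu_t, _ = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
edges = cv2.Canny(gray, 0.5 * otsu_t, otsu_t)

# Fuse the edge contours with the foreground region for the final contour.
contour = cv2.bitwise_and(edges, fg)
```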
- Published: 2024
22. A double transformer residual super-resolution network for cross-resolution person re-identification
- Author:
Fuzhen Zhu, Ce Sun, Chen Wang, and Bing Zhu
- Subjects: Person re-identification, Super resolution, Transformer network, Features fusion, Geodesy, QB275-343
- Abstract:
Cross-resolution person re-identification is a challenging problem in the field of person re-identification. To solve the problem of resolution mismatch, many studies introduce super-resolution into person re-identification tasks. In this work, we propose a cross-resolution person re-identification method based on a double transformer residual super-resolution network (DTRSR), which mainly includes a super-resolution module and a person re-identification module. In the super-resolution module, we propose the double transformer network as our attention module. First, we divide the features extracted by the residual network, then calculate the similarity between each local feature and the global feature after average pooling and maximum pooling, respectively, which lets our module quickly capture the hidden weight information in the spatial domain. In the person re-identification module, we propose an effective fusion method based on key point features (KPFF). The key point extraction model can not only solve the problem that local features cannot be accurately aligned, but also remove the interference of background noise. To fully mine the relationship between the features of each key point, we calculate the two-way correlation between each key point feature and the other features, and then superimpose the two-way correlation on the feature itself to obtain a superposition feature that contains global and local information. The effectiveness of this method is proved by extensive experiments. Compared with the most advanced methods, the test results on the three datasets show that our method improves rank-1 by 1.1%, 3.5% and 1.7%; rank-5 by 1.3%, 1.7% and 0.3%; and rank-10 by 0.1%, 0.4% and 0.1%, respectively.
- Published: 2023
23. Containment Control-Guided Boundary Information for Semantic Segmentation
- Author:
Wenbo Liu, Junfeng Zhang, Chunyu Zhao, Yi Huang, Tao Deng, and Fei Yan
- Subjects: semantic segmentation, containment control, features fusion, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract:
Real-time semantic segmentation is a challenging task in computer vision, especially in complex scenes. In this study, a novel three-branch semantic segmentation model is designed, aiming to effectively use boundary information to improve the accuracy of semantic segmentation. The proposed model introduces the concept of containment control in a pioneering way, which treats image interior elements as well as image boundary elements as followers and leaders in containment control, respectively. Based on this, we utilize two learnable feature fusion matrices in the high-level semantic information stage of the model to quantify the fusion process of internal and boundary features. Further, we design a dedicated loss function to update the parameters of the feature fusion matrices based on the criterion of containment control, which enables fine-grained communication between target features. In addition, our model incorporates a Feature Enhancement Unit (FEU) to tackle the challenge of maximizing the utility of multi-scale features essential for semantic segmentation tasks through the meticulous reconstruction of these features. The proposed model proves effective on the publicly available Cityscapes and CamVid datasets, achieving a trade-off between effectiveness and speed.
- Published: 2024
24. Lightweight Tea Leaf Disease Recognition Based on the Fusion of 2D DWT and MobileNetV3.
- Author:
黄铝文, 关非凡, 谦博, 侯闳耀, 刘迎庆, and 李雯敏
- Subjects: IMAGE recognition (Computer vision), TEA
- Abstract:
Diseases pose a serious threat to the yield and quality of tea production, so accurate and rapid recognition of leaf diseases is essential for timely disease prevention in tea plantations. Compared with conventional disease diagnosis, deep learning can be expected to realize rapid and accurate identification of tea diseases in natural environments, with the advantages of low cost and high efficiency. However, previous models have had large parameter counts and high computational complexity for leaf disease diagnosis, while lightweight models cannot fully support fine-grained feature extraction. In this study, a disease recognition network (CBAM-TealeafNet) is proposed that extracts frequency features via the 2D discrete wavelet transform (2D DWT) and depth features via the bneck structure. The frequency features are decomposed to suppress the high-frequency components, and a fused-feature module is used to reduce the impact of noise on the features for feature enhancement. A CBAM (convolutional block attention module) is embedded to improve the feature extraction capability of the bneck structure, allocating weights to the feature channels and the spatial position features of diseases. The focal loss function is employed in place of the primitive cross-entropy loss to better resolve the influence of sample-class imbalance and achieve high accuracy. In total, 3,260 disease images of Shaanxi Tea No. 1 and Longjing No. 43 were captured, covering five tea disease categories: gloeosporium theae-sinensis miyake, colletotrichum camelliae massee, cercospora theae breadade haan, exobasidium vexans masse, and phyllosticta theicola petch. The real environment was also simulated in the evaluation datasets, and the images were enhanced. Experiments were carried out to validate the optimal model structure and the improvement contributed by each component, and the model's hyperparameters were tuned: the final optimal learning rate was 0.0005, derived from an initial learning-rate range of 0.00005 to 0.005. In addition, the whole recognition structure and the base MobileNetV3 structure were optimized to determine the optimal number of fusion layers and the fusion ratio of frequency and depth feature channels. The results showed that the CBAM-TealeafNet model achieved higher accuracy in tea disease recognition than previous models, with a parameter count second only to the MCA-MobileNet model; CBAM-TealeafNet increased accuracy by 2.15% while decreasing the number of parameters by 25.12% compared with MobileNetV3. Misidentified images and the confusion matrix indicated that CBAM-TealeafNet better distinguishes foreground from background, greatly reducing disease confusion. In addition, the cross-entropy and focal loss functions were compared to verify recognition accuracy under dataset imbalance, and the CBAM module performed better than SENet and ECANet in terms of performance improvement. CBAM-TealeafNet achieved an accuracy of 98.70% and an F1-score of 98.69% with 3.16×10⁶ parameters and 4.5×10⁸ FLOPs (floating-point operations). CBAM-TealeafNet can thus be expected to effectively identify diseases in complicated environments, with less parameter memory and higher inference speed; reducing its remaining misidentifications is left to future investigation. These findings can also provide a strong reference for model construction for the recognition of common tea leaf diseases. [ABSTRACT FROM AUTHOR]
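The frequency branch described above rests on a 2D discrete wavelet transform; a minimal PyWavelets sketch is shown below, where the Haar wavelet and single decomposition level are our illustrative choices, not necessarily the paper's.

```python
import numpy as np
import pywt

img = np.random.rand(224, 224)    # toy grayscale leaf image

# One-level 2D DWT: keep the low-frequency approximation and
# reconstruct with the detail (high-frequency) bands suppressed.
cA, (cH, cV, cD) = pywt.dwt2(img, "haar")
smoothed = pywt.idwt2((cA, (None, None, None)), "haar")  # details zeroed
```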
- Published: 2023
25. A double transformer residual super-resolution network for cross-resolution person re-identification.
- Author:
Zhu, Fuzhen, Sun, Ce, Wang, Chen, and Zhu, Bing
- Abstract:
Cross-resolution person re-identification is a challenging problem in the field of person re-identification. To solve the problem of resolution mismatch, many studies introduce super-resolution into person re-identification tasks. In this work, we propose a cross-resolution person re-identification method based on a double transformer residual super-resolution network (DTRSR), which mainly includes a super-resolution module and a person re-identification module. In the super-resolution module, we propose the double transformer network as our attention module. First, we divide the features extracted by the residual network, then calculate the similarity between each local feature and the global feature after average pooling and maximum pooling, respectively, which lets our module quickly capture the hidden weight information in the spatial domain. In the person re-identification module, we propose an effective fusion method based on key point features (KPFF). The key point extraction model can not only solve the problem that local features cannot be accurately aligned, but also remove the interference of background noise. To fully mine the relationship between the features of each key point, we calculate the two-way correlation between each key point feature and the other features, and then superimpose the two-way correlation on the feature itself to obtain a superposition feature that contains global and local information. The effectiveness of this method is proved by extensive experiments. Compared with the most advanced methods, the test results on the three datasets show that our method improves rank-1 by 1.1%, 3.5% and 1.7%; rank-5 by 1.3%, 1.7% and 0.3%; and rank-10 by 0.1%, 0.4% and 0.1%, respectively. [ABSTRACT FROM AUTHOR]
- Published: 2023
26. BF2SkNet: best deep learning features fusion-assisted framework for multiclass skin lesion classification.
- Author:
Ajmal, Muhammad, Khan, Muhammad Attique, Akram, Tallha, Alqahtani, Abdullah, Alhaisoni, Majed, Armghan, Ammar, Althubiti, Sara A., and Alenezi, Fayadh
- Subjects: DEEP learning, CONVOLUTIONAL neural networks, MACHINE learning, DATA augmentation, MYXOMYCETES, ENTROPY
- Abstract:
The convolutional neural network has shown considerable success in medical imaging with explainable AI for cancer detection and recognition. However, irrelevant and excessively numerous features increase computational time and decrease accuracy. This work proposes a deep learning and fuzzy entropy slime mould algorithm-based architecture for multiclass skin lesion classification. In the first step, we employed data augmentation to increase the training data, which was then used to train two fine-tuned deep learning models, Inception-ResNetV2 and NASNet Mobile. We applied transfer learning on the augmented datasets to train both models and obtained two feature vectors from the newly fine-tuned models. Later, we applied a fuzzy entropy slime mould algorithm to both vectors to get optimal features, which are finally fused using the Serial-Threshold fusion technique and classified using several machine learning classifiers. Eventually, the explainable AI technique Grad-CAM is used for visualization of the lesion region. The experimental process was conducted on two datasets, HAM10000 and ISIC 2018, achieving 97.1% and 90.2% accuracy, respectively, better than other techniques. [ABSTRACT FROM AUTHOR]
- Published: 2023
27. TCM-Net: Mixed Global–Local Learning for Salient Object Detection in Optical Remote Sensing Images.
- Author:
He, Junkang, Zhao, Lin, Hu, Wenjing, Zhang, Guoyun, Wu, Jianhui, and Li, Xinping
- Subjects: OPTICAL remote sensing, TRANSFORMER models
- Abstract:
Deep-learning methods have made significant progress for salient object detection in optical remote sensing images (ORSI-SOD). However, it is difficult for existing methods to effectively exploit both the multi-scale global context and local detail features due to the cluttered background and different scales that characterize ORSIs. To solve the problem, we propose a transformer and convolution mixed network (TCM-Net), with a U-shaped codec architecture for ORSI-SOD. By using a dual-path complementary network, we obtain both the global context and local detail information from the ORSIs of different resolution. A local and global features fusion module was developed to integrate the information at corresponding decoder layers. Furthermore, an attention gate module was designed to refine features while suppressing noise at each decoder layer. Finally, we tailored a hybrid loss function to our network structure, which incorporates three supervision strategies: global, local and output. Extensive experiments were conducted on three common datasets, and TCM-Net outperforms 17 state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published: 2023
28. Optical character recognition (OCR) using partial least square (PLS) based feature reduction: an application to artificial intelligence for biometric identification
- Author:
Akhtar, Zainab, Lee, Jong Weon, Attique Khan, Muhammad, Sharif, Muhammad, Ali Khan, Sajid, and Riaz, Naveed
- Published: 2023
29. SkinNet-INIO: Multiclass Skin Lesion Localization and Classification Using Fusion-Assisted Deep Neural Networks and Improved Nature-Inspired Optimization Algorithm.
- Author:
Hussain, Muneezah, Khan, Muhammad Attique, Damaševičius, Robertas, Alasiry, Areej, Marzougui, Mehrez, Alhaisoni, Majed, and Masood, Anum
- Subjects: ARTIFICIAL neural networks, OPTIMIZATION algorithms, MACHINE learning, ARTIFICIAL intelligence, DEEP learning
- Abstract:
Background: Using artificial intelligence (AI) with the concept of a deep learning-based automated computer-aided diagnosis (CAD) system has shown improved performance for skin lesion classification. Although deep convolutional neural networks (DCNNs) have significantly improved many image classification tasks, it is still difficult to accurately classify skin lesions because of a lack of training data, inter-class similarity, intra-class variation, and the inability to concentrate on semantically significant lesion parts. Innovations: To address these issues, we propose an automated deep learning and best-feature-selection framework for multiclass skin lesion classification in dermoscopy images. The proposed framework performs an initial preprocessing step for contrast enhancement using a new technique based on dark-channel haze and top-bottom filtering. Three pre-trained deep learning models are then fine-tuned and trained using the transfer learning concept. In the fine-tuning process, we added and removed a few layers to lessen the parameters and selected the hyperparameters using a genetic algorithm (GA) instead of manual assignment, the purpose being to improve the learning performance. After that, the deeper layer is selected for each network and deep features are extracted. The extracted deep features are fused using a novel serial correlation-based approach, which reduces the feature vector length relative to the plain serial approach, though a little redundant information remains. We propose an improved ant lion optimization algorithm for best feature selection to address this issue. The selected features are finally classified using machine learning algorithms. Main Results: The experimental process was conducted using two publicly available datasets, ISIC2018 and ISIC2019, obtaining accuracies of 96.1% and 99.9%, respectively. Comparison with state-of-the-art techniques shows that the proposed framework improves accuracy. Conclusions: The proposed framework successfully enhances the contrast of the cancer region. Moreover, selecting hyperparameters with automated techniques improved the learning process, and the proposed fusion and improved selection process maintain the best accuracy while shortening the computational time. [ABSTRACT FROM AUTHOR]
- Published: 2023
30. Coal Flow Foreign Body Classification Based on ESCBAM and Multi-Channel Feature Fusion.
- Author:
Kou, Qiqi, Ma, Haohui, Xu, Jinyang, Jiang, He, and Cheng, Deqiang
- Subjects: FOREIGN bodies, CONVEYOR belts, BELT conveyors, COAL, FEATURE extraction, COMPUTATIONAL complexity, MULTICHANNEL communication
- Abstract:
Foreign bodies often cause belt scratching and tearing, coal stacking, and plugging during the transportation of coal via belt conveyors. To overcome the problems of large parameter counts, heavy computational complexity, low classification accuracy, and poor processing speed in current classification networks, a novel network based on ESCBAM and multi-channel feature fusion is proposed in this paper. First, to improve the utilization of features and the network's ability to learn detailed information, a multi-channel feature fusion strategy is designed to fully integrate the independent feature information of each channel. Then, to reduce computation while maintaining excellent feature extraction capability, an information fusion network is constructed, which adopts depthwise separable convolution and an improved residual network structure as its basic feature extraction unit. Finally, to enhance the understanding of image context and improve the feature performance of the network, a novel ESCBAM attention mechanism with strong generalization and portability is constructed by integrating spatial and channel features. The experimental results demonstrate that the proposed method has the advantages of fewer parameters, low computational complexity, high accuracy, and fast processing speed, and can effectively classify foreign bodies on the belt conveyor. [ABSTRACT FROM AUTHOR]
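A hedged Keras sketch of a lightweight residual unit built from depthwise separable convolutions, in the spirit of the abstract's basic feature extraction unit; the exact layout (filter counts, activations, projection shortcut) is our assumption.

```python
from tensorflow.keras import layers

def ds_residual_block(x, filters):
    """Lightweight residual unit using depthwise separable convolutions,
    loosely following the abstract's extraction unit (layout illustrative)."""
    y = layers.SeparableConv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.SeparableConv2D(filters, 3, padding="same")(y)
    if x.shape[-1] != filters:
        x = layers.Conv2D(filters, 1, padding="same")(x)   # match channel count
    return layers.Activation("relu")(layers.Add()([x, y]))
```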
- Published: 2023
31. Environmental Sound Classification Based on Attention Feature Fusion and Improved Residual Network.
- Author:
Jinfang Zeng, Liu, Yuxing, Wang, Mengjiao, and Zhang, Xin
- Abstract:
The classification of environmental sound is an important research area in artificial intelligence, and its classification accuracy is greatly affected by feature extraction. However, most existing methods for feature set generation use simple feature fusion methods, which are ineffective for multi-classification purposes. To solve this problem and improve the classification performance of neural networks on environmental sound classification (ESC) tasks, we first add the Gaussian error linear unit (GELU) activation function and gated linear units (GLU) to the residual network, which improves the network's stability. Subsequently, this paper proposes a feature fusion method based on the attention mechanism and employs squeeze-and-excitation networks (SENet) to make feature fusion and training more successful, which offers obvious advantages over existing classification methods. Experimental results show that our model reaches an obvious increase in classification accuracy on the two datasets ESC-10 (98.27%) and ESC-50 (98.32%). [ABSTRACT FROM AUTHOR]
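For reference, a standard squeeze-and-excitation block is sketched below, with GELU swapped into the bottleneck in the spirit of the abstract; the reduction ratio and activation placement are our assumptions, not the paper's exact design.

```python
from tensorflow.keras import layers

def se_block(x, ratio=8):
    """Standard squeeze-and-excitation block: learns per-channel weights
    so the network can emphasize the most useful fused features."""
    channels = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)                # squeeze
    s = layers.Dense(channels // ratio, activation="gelu")(s)
    s = layers.Dense(channels, activation="sigmoid")(s)   # excitation weights
    return layers.Multiply()([x, layers.Reshape((1, 1, channels))(s)])
```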
- Published: 2023
32. Multi-Model Fusion of CNNs for Identification of Parkinson’s Disease Using Handwritten Samples
- Author:
Saeeda Naz, Iqra Kamran, Sarah Gul, Fazle Hadi, and Fahmi Khalifa
- Subjects: Parkinson's disease, CNN, ensemble learning, features fusion, SVM, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract:
When approximately seventy percent of dopamine-producing nerve cells cease to function normally, the symptoms of Parkinson's disease (PD) manifest, marking an irreversible decline in nerve cell health. In clinical settings, neurologists assess individuals by observing their performance in carrying out certain tasks, including writing, drawing, walking, and speaking, and by assessing facial expressions for any difficulties. This paper focuses on the problem of early PD identification through handwriting and drawing tasks, using three well-known PD datasets. Given the scarcity of handwriting samples and the wide spectrum of Parkinson's disease symptoms, the challenge is known to be particularly difficult. To achieve reliable PD detection, we employ diverse data augmentation techniques to expand the dataset size. We then deploy and train different deep Convolutional Neural Network (CNN) architectures, each of which extracts different salient features and aspects of the input data due to its unique layout and structure (i.e., number of layers, kernels, normalization, number of connected layers, etc.). After experimental analysis of the performance of the individual CNNs, we selected the promising feature vectors and employed different early-fusion strategies before final classification. This is a very useful technique that allows a classification model to learn and detect from the various representations of the data provided by multiple CNNs, improving overall system performance. Experimental results show that the fusion of frozen features from multiple deep CNN models achieves a significantly better accuracy of 99.35% in comparison to single-model CNNs and other state-of-the-art work.
- Published: 2023
33. Features Fusion Framework for Multimodal Irregular Time-series Events
- Author:
Tang, Peiwang, Zhang, Xianchao, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Khanna, Sankalp, editor, Cao, Jian, editor, Bai, Quan, editor, and Xu, Guandong, editor
- Published: 2022
34. Finger Trimodal Features Coding Fusion Method
- Author:
Wen, Mengna, Ye, Ziyun, Yang, Jinfeng, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Deng, Weihong, editor, Feng, Jianjiang, editor, Huang, Di, editor, Kan, Meina, editor, Sun, Zhenan, editor, Zheng, Fang, editor, Wang, Wenfeng, editor, and He, Zhaofeng, editor
- Published: 2022
35. GCMK: Detecting Spam Movie Review Based on Graph Convolutional Network Embedding Movie Background Knowledge
- Author:
Cao, Hao, Li, Hanyue, He, Yulin, Yan, Xu, Yang, Fei, Wang, Haizhou, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Pimenidis, Elias, editor, Angelov, Plamen, editor, Jayne, Chrisina, editor, Papaleonidas, Antonios, editor, and Aydin, Mehmet, editor
- Published
- 2022
- Full Text
- View/download PDF
36. A Comparative Analysis of Optimization Algorithms for Gastrointestinal Abnormalities Recognition and Classification Based on Ensemble XcepNet23 and ResNet18 Features.
- Author
-
Naz, Javeria, Sharif, Muhammad Imran, Sharif, Muhammad Irfan, Kadry, Seifedine, Rauf, Hafiz Tayyab, and Ragab, Adham E.
- Subjects
OPTIMIZATION algorithms ,FEATURE extraction ,PARTICLE swarm optimization ,DEEP learning ,GASTROINTESTINAL diseases - Abstract
Esophagitis, cancerous growths, bleeding, and ulcers are typical symptoms of gastrointestinal disorders, which account for a significant portion of human mortality. Traditional diagnostic methods can be exhausting for both patients and doctors. The major aim of this research is to propose a hybrid method that can accurately diagnose gastrointestinal tract abnormalities and promote early treatment, helping to reduce deaths. The major phases of the proposed method are: dataset augmentation, preprocessing, feature engineering (feature extraction, fusion, and optimization), and classification. Image enhancement is performed using hybrid contrast-stretching algorithms. Deep learning features are extracted through transfer learning from the ResNet18 model and the proposed XcepNet23 model. The obtained deep features are ensembled with texture features, and the ensemble feature vector is optimized using the Binary Dragonfly Algorithm (BDA), the Moth–Flame Optimization (MFO) algorithm, and the Particle Swarm Optimization (PSO) algorithm. Two datasets (a hybrid dataset and the Kvasir-V1 dataset), consisting of five and eight classes, respectively, are utilized. The accuracy achieved by the proposed method on both datasets was superior to the most recent methods: the Q_SVM reached a promising 100% on the hybrid dataset and 99.24% on Kvasir-V1. [ABSTRACT FROM AUTHOR] (A hedged sketch of swarm-based feature selection over the fused vector follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
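As one concrete instance of the mask-based feature selection this abstract applies to the ensembled vector, here is a hedged binary PSO sketch. The k-NN fitness, swarm size, and sigmoid transfer function are illustrative choices only; BDA and MFO follow the same search-over-a-feature-mask pattern.

```python
# Minimal binary-PSO feature selection over a fused feature matrix,
# assuming a k-NN cross-validation score as the fitness function.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def fitness(mask, X, y):
    # Accuracy of a classifier restricted to the selected feature columns.
    if mask.sum() == 0:
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=3)
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

def binary_pso(X, y, n_particles=10, n_iters=20, w=0.7, c1=1.5, c2=1.5):
    d = X.shape[1]
    pos = rng.integers(0, 2, (n_particles, d))
    vel = rng.normal(0, 0.1, (n_particles, d))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, X, y) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(n_iters):
        r1, r2 = rng.random((n_particles, d)), rng.random((n_particles, d))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        # Sigmoid transfer turns velocities into bit-flip probabilities.
        pos = (rng.random((n_particles, d)) < 1 / (1 + np.exp(-vel))).astype(int)
        fit = np.array([fitness(p, X, y) for p in pos])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest.astype(bool)

# fused = np.hstack([deep_features, texture_features]); keep = binary_pso(fused, y)
```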
37. A Robust Brain Tumor Detector Using BiLSTM and Mayfly Optimization and Multi-Level Thresholding.
- Author
-
Mahum, Rabbia, Sharaf, Mohamed, Hassan, Haseeb, Liang, Lixin, and Huang, Bingding
- Subjects
BRAIN tumors ,OPTIMIZATION algorithms ,DETECTORS ,CELL growth ,NEURAL development - Abstract
A brain tumor is an abnormal growth of cells in the brain that can be either benign or malignant. Oncologists typically use methods such as blood or visual tests to detect brain tumors, but these approaches can be time-consuming, require additional human effort, and may miss small tumors. This work proposes an effective approach to brain tumor detection that combines segmentation and feature fusion. Segmentation is performed using the mayfly optimization algorithm with multilevel Kapur's thresholding to locate brain tumors in MRI scans. Key features are extracted from the tumors using Histograms of Oriented Gradients (HOG) and ResNet-V2, and a bidirectional long short-term memory (BiLSTM) network classifies tumors into three categories: pituitary, glioma, and meningioma. The suggested methodology is trained and tested on two datasets, Figshare and Harvard, achieving high accuracy, precision, recall, F1 score, and area under the curve (AUC). A comparative analysis with existing DL and ML methods demonstrates that the proposed approach offers superior outcomes. This approach has the potential to improve brain tumor detection, particularly for small tumors, but further validation and testing are needed before clinical use. [ABSTRACT FROM AUTHOR] (A hedged sketch of the HOG-plus-deep fusion feeding a BiLSTM follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
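A hedged sketch of the fusion-then-BiLSTM idea follows. Reshaping the fused vector into short pseudo-timesteps so a recurrent layer can scan it is an assumption of this illustration, as are the HOG parameters and layer sizes.

```python
# Minimal sketch: HOG + deep features fused, then classified by a BiLSTM.
import numpy as np
from skimage.feature import hog
from tensorflow.keras import layers, models

def fuse_features(gray_img, deep_vec):
    # HOG descriptor of the MRI slice concatenated with a deep embedding
    # (e.g. a pooled ResNet-V2 vector for the same image).
    h = hog(gray_img, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    return np.concatenate([h, deep_vec])

def build_bilstm(n_steps, step_dim, n_classes=3):
    # Three classes per the abstract: pituitary, glioma, meningioma.
    m = models.Sequential([
        layers.Input((n_steps, step_dim)),
        layers.Bidirectional(layers.LSTM(64)),
        layers.Dense(n_classes, activation="softmax"),
    ])
    m.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
    return m

# fused: (N, D) matrix of fuse_features outputs; truncate D to a multiple of
# n_steps and reshape to (N, n_steps, D // n_steps) before model.fit.
```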
38. GaitDONet: Gait Recognition Using Deep Features Optimization and Neural Network.
- Author
-
Khan, Muhammad Attique, Khan, Awais, Alhaisoni, Majed, Alqahtani, Abdullah, Armghan, Ammar, Althubiti, Sara A., Alenezi, Fayadh, Senghour Mey, and Nam, Yunyoung
- Subjects
GAIT in humans ,FEATURE extraction ,OPTIMIZATION algorithms ,DEEP learning ,INTERVAL analysis ,MOBULIDAE - Abstract
Human gait recognition (HGR) is the process of identifying a subject based on their walking pattern. Each subject has a unique walking pattern that cannot be imitated by other subjects. However, gait recognition is not easy, and a system struggles when the subject carries an object such as a bag or coat. This article proposes an automated architecture based on deep feature optimization for HGR. To our knowledge, it is the first architecture in which features are fused using multiset canonical correlation analysis (MCCA). In the proposed method, original video frames are processed for all 11 selected angles of the CASIA B dataset and used to train two fine-tuned deep learning models, SqueezeNet and EfficientNet. Deep transfer learning trains both fine-tuned models on the selected angles, yielding two new targeted models that are later used for feature engineering. Features are extracted from a deep layer of each fine-tuned model and fused into one vector using MCCA. An improved manta ray foraging optimization algorithm is also proposed to select the best features from the fused feature matrix, which are then classified with a narrow neural network classifier. The experiments covered all 11 angles of the large multi-view gait dataset (CASIA B) and obtained higher accuracy than state-of-the-art techniques. Moreover, a detailed confidence-interval analysis shows the effectiveness of the proposed architecture for HGR. [ABSTRACT FROM AUTHOR] (A hedged sketch of correlation-based two-view fusion follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
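The MCCA fusion step can be approximated for two views with scikit-learn's two-view CCA, sketched below. The component count, the concatenation of the projections, and the MLP standing in for the paper's narrow neural network are all illustrative.

```python
# Minimal sketch of correlation-based fusion of two deep feature views,
# using two-view CCA as a stand-in for the paper's multiset CCA (MCCA).
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.neural_network import MLPClassifier

def cca_fuse(X1, X2, n_components=64):
    # Project both views onto maximally correlated directions, then fuse
    # by concatenation (summing the projections is a common alternative).
    # n_components must not exceed either view's dimensionality.
    cca = CCA(n_components=n_components).fit(X1, X2)
    Z1, Z2 = cca.transform(X1, X2)
    return np.hstack([Z1, Z2]), cca

# X_sq, X_eff: per-clip features from the two fine-tuned backbones.
# fused, cca = cca_fuse(X_sq, X_eff)
# clf = MLPClassifier(hidden_layer_sizes=(100,)).fit(fused, y)
```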
39. Learning spatiotemporal features of DSA using 3D CNN and BiConvGRU for ischemic moyamoya disease detection.
- Author
-
Hu, Tao, Lei, Yu, Su, Jiabin, Yang, Heng, Ni, Wei, Gao, Chao, Yu, Jinhua, Wang, Yuanyuan, and Gu, Yuxiang
- Subjects
- *
CONVOLUTIONAL neural networks , *MOYAMOYA disease , *FEATURE extraction , *DIGITAL subtraction angiography , *CEREBRAL hemorrhage - Abstract
Moyamoya disease (MMD) is a serious intracranial cerebrovascular disease, and the cerebral hemorrhage it can cause puts patients' lives at risk, so MMD detection is of great significance for preventing cerebral hemorrhage. To improve the accuracy of digital subtraction angiography (DSA) in diagnosing ischemic MMD, this paper proposes a deep network architecture combining a 3D convolutional neural network (3D CNN) and a bidirectional convolutional gated recurrent unit (BiConvGRU) to learn spatiotemporal features for ischemic MMD detection. First, a 2D convolutional neural network (2D CNN) extracts spatial features from each DSA frame. Second, BiConvGRU extracts long-term spatiotemporal features of the DSA sequence. Third, the 3D CNN further extracts short-term spatiotemporal features. In addition, grayscale and optical-flow images yield different features as they pass through the network, and these multiple features are combined by feature fusion. Finally, the fused features are used for classification. The proposed method was quantitatively evaluated on a dataset of 630 cases. The experiments showed a detection accuracy of 0.9788, a sensitivity of 0.9780, a specificity of 0.9796, and an area under the curve (AUC) of 0.9856, the highest accuracy and AUC among the compared methods. The results indicate that the proposed method is stable and reliable for ischemic MMD detection, giving doctors an option for accurate diagnosis of ischemic MMD. [ABSTRACT FROM AUTHOR] (A hedged sketch of the two-branch spatiotemporal fusion follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
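A compact PyTorch sketch of the two-branch spatiotemporal idea follows. A plain bidirectional GRU over pooled per-frame features stands in for the paper's convolutional BiConvGRU, and all channel sizes are illustrative.

```python
# Minimal sketch: per-frame 2D features scanned by a bidirectional GRU
# (long-term dynamics) fused with a 3D-conv branch (short-term dynamics).
import torch
import torch.nn as nn

class SpatioTemporalNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.frame_cnn = nn.Sequential(               # 2D spatial features
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.bigru = nn.GRU(16, 32, bidirectional=True, batch_first=True)
        self.cnn3d = nn.Sequential(                   # short-term 3D features
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten())
        self.head = nn.Linear(64 + 8, n_classes)

    def forward(self, x):                             # x: (B, T, 1, H, W)
        b, t = x.shape[:2]
        f2d = self.frame_cnn(x.flatten(0, 1)).view(b, t, -1)
        _, h = self.bigru(f2d)                        # final hidden states
        gru_feat = torch.cat([h[0], h[1]], dim=1)     # both directions
        f3d = self.cnn3d(x.transpose(1, 2))           # (B, 1, T, H, W)
        return self.head(torch.cat([gru_feat, f3d], dim=1))  # feature fusion

# logits = SpatioTemporalNet()(torch.randn(2, 8, 1, 64, 64))
```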
40. BRMI-Net: Deep Learning Features and Flower Pollination-Controlled Regula Falsi-Based Feature Selection Framework for Breast Cancer Recognition in Mammography Images.
- Author
-
Rehman, Shams ur, Khan, Muhamamd Attique, Masood, Anum, Almujally, Nouf Abdullah, Baili, Jamel, Alhaisoni, Majed, Tariq, Usman, and Zhang, Yu-Dong
- Subjects
- *
DEEP learning , *FEATURE selection , *CANCER education , *WOMEN'S mortality , *FEATURE extraction , *IMAGE recognition (Computer vision) - Abstract
The early detection of breast cancer using mammogram images is critical for lowering women's mortality rates and allowing for proper treatment. Deep learning techniques are commonly used for feature extraction and have demonstrated significant performance in the literature. However, these features do not perform well in several cases due to redundant and irrelevant information. We created a new framework for diagnosing breast cancer from mammogram images using entropy-controlled deep learning and flower pollination optimization. In the proposed framework, a filter-fusion-based method for contrast enhancement is developed. The pre-trained ResNet-50 model is then improved and trained using transfer learning on both the original and enhanced datasets. In the next phase, deep features are extracted and combined into a single vector using a serial technique known as serial mid-value features. The top features are then classified using neural networks and machine learning classifiers; to select them, a flower pollination optimization technique with entropy control has been developed. The evaluation used three publicly available datasets, CBIS-DDSM, INbreast, and MIAS, on which the proposed framework achieved 93.8%, 99.5%, and 99.8% accuracy, respectively. Gains in accuracy and reductions in computational time relative to current methods are analyzed. [ABSTRACT FROM AUTHOR] (A hedged sketch of one plausible serial mid-value fusion follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
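The abstract does not define "serial mid-value features" precisely, so the sketch below is only one plausible reading: take the element-wise midpoint of two aligned deep vectors and append it serially. Treat the function name and the pairing of original/enhanced features as assumptions of this illustration.

```python
# Hedged sketch of a "serial mid-value" fusion: element-wise midpoint of
# two aligned deep vectors appended serially; one plausible interpretation,
# not the authors' exact rule.
import numpy as np

def serial_midvalue_fuse(f_orig, f_enh):
    # f_orig / f_enh: hypothetical features of the original and the
    # contrast-enhanced image from the same improved ResNet-50, so aligned.
    mid = (f_orig + f_enh) / 2.0                    # element-wise mid value
    return np.concatenate([f_orig, mid, f_enh])     # serial fusion

fused = serial_midvalue_fuse(np.random.rand(2048), np.random.rand(2048))
assert fused.shape == (6144,)
```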
41. A Network Traffic Prediction Model Based on Spatiotemporal Feature Fusion.
- Author
-
薛自杰, 卢昱妃, 宁 芊, 黄霖宇, and 陈炳才
- Subjects
FEATURE extraction ,NETWORK performance ,CHANNEL coding
- Published
- 2023
- Full Text
- View/download PDF
42. Road Curb Segmentation Using a Dual-Branch and Feature Fusion Network.
- Author
-
孙扬, 韩磊, 王程庆, and 李韵鹏
- Published
- 2023
- Full Text
- View/download PDF
43. A Feature Fusion Model with Data Augmentation for Speech Emotion Recognition.
- Author
-
Tu, Zhongwen, Liu, Bin, Zhao, Wei, Yan, Raoxin, and Zou, Yang
- Subjects
EMOTION recognition ,DATA augmentation ,FEATURE selection ,FEATURE extraction ,RECURRENT neural networks ,AUTOMATIC speech recognition ,STOCHASTIC learning models ,SPEECH synthesis - Abstract
The Speech Emotion Recognition (SER) task, which aims to identify the emotion expressed in speech, has always been an important topic in speech acoustics. In recent years, deep-learning methods have made great progress in SER. However, the small scale of emotional speech datasets and the lack of effective emotional feature representations still limit research. In this paper, a novel SER method combining data augmentation, feature selection, and feature fusion is proposed. First, to address the shortage of samples in speech emotion datasets and the class imbalance among emotion categories, a speech data augmentation method, Mix-wav, is proposed and applied to audio within the same emotion category. Then, on the one hand, a Multi-Head-Attention-based Convolutional Recurrent Neural Network (MHA-CRNN) model is proposed to extract the spectrum vector from the Log-Mel spectrum. On the other hand, a Light Gradient Boosting Machine (LightGBM) is used for feature-set selection and dimensionality reduction over four global emotion feature sets, extracting more effective emotion statistics to fuse with the extracted spectrum vector. Experiments on the public Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset and the Chinese Hierarchical Speech Emotion Dataset of Broadcasting (CHSE-DB) show that the proposed method achieves 66.44% and 93.47% unweighted average test accuracy, respectively. Our research shows that, through feature fusion, a global feature set refined by feature selection can complement the features extracted by a single deep-learning model and achieve better classification accuracy. [ABSTRACT FROM AUTHOR] (A hedged sketch of the selection-then-fusion step follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
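The selection-then-fusion step can be sketched with LightGBM importances. The variable names (stat_feats, spectrum_vecs), the top-k rule, and k=128 are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch: rank global statistical features with LightGBM, keep the
# top-k, and concatenate them with the deep spectrum vector.
import numpy as np
import lightgbm as lgb

def select_and_fuse(stat_feats, labels, spectrum_vecs, k=128):
    # Feature selection: train a LightGBM classifier on the statistical
    # feature set and keep the k most important columns.
    booster = lgb.LGBMClassifier(n_estimators=200).fit(stat_feats, labels)
    top = np.argsort(booster.feature_importances_)[::-1][:k]
    # Feature fusion: selected statistics concatenated with the deep vector.
    return np.hstack([stat_feats[:, top], spectrum_vecs]), top
```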
44. Correlating Edge with Parsing for Human Parsing.
- Author
-
Gong, Kai, Wang, Xiuying, and Tan, Shoubiao
- Subjects
PARSING (Computer grammar) ,FEATURE extraction ,COMPUTER vision ,VISUAL fields ,HUMAN body ,HUMAN beings - Abstract
Human parsing has broad application prospects in computer vision, but open problems remain. Existing algorithms have not fully resolved small-scale target localization or background occlusion, which leads to wrong or incomplete segmentation. Compared with the common practice of feature concatenation, exploiting the correlation between the two factors makes fuller use of edge information for refined parsing. This paper proposes the mechanism of correlating edge and parsing network (MCEP), which uses a spatial-aware and two-max-pooling (SMP) module to capture the correlation. The structure mainly comprises two steps: (1) a collection operation, where, through the mutual promotion of edge features and parsing features, more attention is paid to the regions of interest around the edges of the human body and spatial cues are collected adaptively, and (2) a filtering operation, where parallel max-pooling addresses the background occlusion problem. Meanwhile, semantic context extraction is added to strengthen feature extraction and prevent the loss of small-target detail. Extensive experiments on multiple single-person and multi-person datasets show the method's advantages. [ABSTRACT FROM AUTHOR] (A hedged sketch of edge-gated collection and pooling-based filtering follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
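A hedged PyTorch sketch of the collect-then-filter idea follows: an edge map gates the parsing features (collection), and parallel max-poolings aggregate spatial cues (filtering). This illustrates the pattern only; it is not the paper's exact SMP module.

```python
# Hedged sketch of correlating an edge map with parsing feature maps.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeParsingCorrelation(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.edge_proj = nn.Conv2d(1, channels, kernel_size=1)

    def forward(self, parsing_feat, edge_map):
        # Collection: weight parsing features by edge-derived attention.
        attn = torch.sigmoid(self.edge_proj(edge_map))
        gathered = parsing_feat * attn
        # Filtering: parallel max-poolings at two scales suppress background.
        p1 = F.max_pool2d(gathered, kernel_size=3, stride=1, padding=1)
        p2 = F.max_pool2d(gathered, kernel_size=5, stride=1, padding=2)
        return gathered + p1 + p2

# out = EdgeParsingCorrelation(64)(torch.randn(1, 64, 32, 32),
#                                  torch.randn(1, 1, 32, 32))
```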
45. Recognizing Gastrointestinal Malignancies on WCE and CCE Images by an Ensemble of Deep and Handcrafted Features with Entropy and PCA Based Features Optimization.
- Author
-
Naz, Javeria, Sharif, Muhammad, Raza, Mudassar, Shah, Jamal Hussain, Yasmin, Mussarat, Kadry, Seifedine, and Vimal, S.
- Subjects
GASTROINTESTINAL cancer ,COMPUTER-aided diagnosis ,DEEP learning ,CAPSULE endoscopy ,GASTRIC diseases ,ENTROPY - Abstract
In medical imaging, automated detection of stomach and gastrointestinal diseases from WCE (wireless capsule endoscopy) images is an emerging research domain. The task poses numerous challenges, such as contrast variation, texture and color variation, and complex backgrounds. Researchers have proposed several computer-aided methods to overcome these challenges, but each has its own limitations. This work proposes a new method for computer-aided diagnosis of stomach disease classification, a hybrid approach based on amassed texture and deep CNN features. First, image contrast is improved using a power-law transformation. Texture features are extracted from the enhanced dataset using LBP and SFTA and fused to obtain a strong handcrafted feature vector. In parallel, two pre-trained deep learning models, VGG16 and InceptionV3, are used for CNN feature extraction. The extracted deep features are fused serially with the handcrafted feature vector, and the serial fusion of both yields a single feature vector that benefits from the accumulated texture and CNN features. This feature vector is supplied to various classifiers and evaluated against existing methods. The promising recognition performance demonstrates the strength of the proposed approach. [ABSTRACT FROM AUTHOR] (A hedged sketch of this serial texture-plus-deep fusion follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
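The serial texture-plus-deep fusion maps naturally to a few lines. In this sketch, LBP stands for the handcrafted branch (SFTA is omitted, as it has no common library implementation) and pooled VGG16/InceptionV3 embeddings form the deep branch; the LBP parameters and pooling choices are illustrative.

```python
# Minimal sketch of serial fusion of LBP texture features with VGG16 and
# InceptionV3 embeddings.
import numpy as np
from skimage.feature import local_binary_pattern
from tensorflow.keras.applications import VGG16, InceptionV3
from tensorflow.keras.applications.vgg16 import preprocess_input as vgg_pre
from tensorflow.keras.applications.inception_v3 import preprocess_input as inc_pre

vgg = VGG16(weights="imagenet", include_top=False, pooling="avg")
inc = InceptionV3(weights="imagenet", include_top=False, pooling="avg")

def lbp_hist(gray, P=8, R=1.0):
    # Uniform LBP codes summarized as a normalized histogram.
    codes = local_binary_pattern(gray, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def serial_fuse(rgb_batch, gray_batch):
    # rgb_batch: (N, 224, 224, 3) images; gray_batch: matching 2D grayscale.
    texture = np.stack([lbp_hist(g) for g in gray_batch])
    deep = np.hstack([vgg.predict(vgg_pre(rgb_batch.copy())),
                      inc.predict(inc_pre(rgb_batch.copy()))])
    return np.hstack([texture, deep])   # serial fusion into one vector
```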
46. An efficient deep convolutional neural network with features fusion for radar signal recognition.
- Author
-
Si, Weijian, Wan, Chenxia, and Deng, Zhian
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,MILITARY electronics ,SIGNAL-to-noise ratio ,ELECTRONIC systems ,SPEECH perception - Abstract
This paper proposes an efficient deep convolutional neural network with feature fusion for recognizing radar signals, comprising data pre-processing, feature extraction, multi-feature fusion, and classification. Radar signals are first transformed into time-frequency images using the Choi-Williams distribution and the smooth pseudo-Wigner-Ville distribution, and image pre-processing resizes and normalizes the time-frequency images. Two purpose-built deep convolutional neural network models then extract more effective features. Furthermore, a multi-feature fusion model integrates the features extracted by the two models, making full use of the relationships among different features and further improving recognition performance. Experimental results show that the average recognition accuracy of the proposed method reaches 84.38% at a signal-to-noise ratio of −12 dB and 94.31% at −10 dB, outperforming other methods, especially at low signal-to-noise ratios. Moreover, the recognition of various radar signals is largely improved, especially for 2FSK, 4FSK, and SFM. This work provides a sound experimental foundation for further improving radar signal recognition in modern electronic warfare systems. [ABSTRACT FROM AUTHOR] (A hedged sketch of the two-branch time-frequency fusion follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
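The two-branch fusion can be sketched as a two-input Keras model, one CNN per time-frequency representation. The layer depths, the 128×128 input size, and the 12-class head are illustrative assumptions; the abstract names signal types but not the class count.

```python
# Minimal two-branch Keras sketch: one CNN per time-frequency image
# (CWD and SPWVD), feature vectors merged before the softmax.
from tensorflow.keras import layers, models

def branch(name):
    inp = layers.Input((128, 128, 1), name=name)
    x = layers.Conv2D(16, 3, activation="relu")(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(32, 3, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    return inp, x

cwd_in, cwd_feat = branch("cwd")          # Choi-Williams image branch
spwv_in, spwv_feat = branch("spwvd")      # smooth pseudo-WVD image branch
fused = layers.concatenate([cwd_feat, spwv_feat])     # multi-feature fusion
out = layers.Dense(12, activation="softmax")(fused)   # class count assumed
model = models.Model([cwd_in, spwv_in], out)
model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
```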
47. Event Trigger Word Extraction Based on a Hybrid Model.
- Author
-
杨昊, 赵刚, and 王兴芬
- Abstract
Structural grammatical features and semantic features of events each have their own advantages, and integrating the two helps represent event trigger words accurately and supports trigger word extraction. However, existing feature-based, structure-based, and neural-network-based extraction methods capture only partial features of events and cannot represent event trigger words accurately. To solve this, a hybrid model combining structural grammatical features with semantic features of events is proposed for event trigger word extraction. The hybrid model first integrates sentence dependency-syntax information into the initial vector model, so that the initial vectors carry structural grammatical features. The initial vectors are then passed through the CNN and BiGRU-E-attention components of the neural model, capturing multi-dimensional event semantic features and completing the fusion of structural grammatical and semantic features, after which event trigger words are extracted. On the CEC Chinese Emergency Corpus, the hybrid model improves the F-scores for trigger word position recognition and classification by 0.86% and 4.07%, respectively, over the baseline model. On the ACE2005 English corpus, the improvements are 1.4% and 1.5%, respectively. These results show that the hybrid model achieves excellent performance in event trigger word extraction. [ABSTRACT FROM AUTHOR] (A hedged sketch of a syntax-aware CNN/BiGRU-attention fusion follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
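A hedged PyTorch sketch of the hybrid idea follows: dependency-syntax information enters through an added embedding, then a CNN branch and a BiGRU-with-attention branch are fused for trigger classification. The way syntax is injected (summed embeddings) and all sizes are assumptions of this illustration.

```python
# Hedged sketch of a hybrid CNN + BiGRU-attention trigger classifier over
# syntax-aware token vectors.
import torch
import torch.nn as nn

class HybridTrigger(nn.Module):
    def __init__(self, vocab, n_dep, d=64, n_classes=9):
        super().__init__()
        self.tok = nn.Embedding(vocab, d)
        self.dep = nn.Embedding(n_dep, d)      # dependency relation per token
        self.cnn = nn.Conv1d(d, d, 3, padding=1)
        self.gru = nn.GRU(d, d, bidirectional=True, batch_first=True)
        self.attn = nn.Linear(2 * d, 1)
        self.head = nn.Linear(d + 2 * d, n_classes)

    def forward(self, tokens, deps):           # (B, T) index tensors
        x = self.tok(tokens) + self.dep(deps)  # syntax-aware initial vectors
        c = torch.relu(self.cnn(x.transpose(1, 2))).max(dim=2).values
        h, _ = self.gru(x)
        w = torch.softmax(self.attn(h), dim=1)
        g = (w * h).sum(dim=1)                 # attention-pooled BiGRU state
        return self.head(torch.cat([c, g], dim=1))   # feature fusion

# logits = HybridTrigger(5000, 40)(torch.zeros(2, 20, dtype=torch.long),
#                                  torch.zeros(2, 20, dtype=torch.long))
```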
48. BF2SkNet: best deep learning features fusion-assisted framework for multiclass skin lesion classification
- Author
-
Ajmal, Muhammad, Khan, Muhammad Attique, Akram, Tallha, Alqahtani, Abdullah, Alhaisoni, Majed, Armghan, Ammar, Althubiti, Sara A., and Alenezi, Fayadh
- Published
- 2023
- Full Text
- View/download PDF
49. An intelligent healthcare framework for breast cancer diagnosis based on the information fusion of novel deep learning architectures and improved optimization algorithm.
- Author
-
Jabeen, Kiran, Khan, Muhammad Attique, Damaševičius, Robertas, Alsenan, Shrooq, Baili, Jamel, Zhang, Yu-Dong, and Verma, Amit
- Subjects
- *
CONVOLUTIONAL neural networks , *OPTIMIZATION algorithms , *CANCER diagnosis , *FEATURE selection , *MACHINE learning , *DEEP learning - Abstract
Breast cancer is diagnosed using mammography imaging, an effective screening tool for diagnosing and managing the disease. The task is highly time-consuming due to the high similarity between benign and malignant cells, yet for medical intervention it is important to diagnose breast cancer at an early stage. Recently, deep learning has shown remarkable success in medical imaging for the diagnosis of several cancer types, and DL-based computerized techniques assist in detecting and classifying breast cancer correctly. This article proposes a new computerized architecture based on two novel CNN architectures with Bayesian Optimization and feature selection techniques. Initially, two convolutional neural network (CNN) architectures were designed, named 2-Residual Blocks CNN and 3-Residual Blocks CNN. Both architectures were trained on mammography images, with hyperparameters initialized using Bayesian Optimization (BO). After training from scratch, deep features are extracted from the average-pool layer. The extracted deep features are optimized using an improved optimization algorithm named Simulated Annealing controlled Position Shuffling (SAcPS). The fitness at each iteration is computed using an extreme learning machine (ELM) classifier instead of a fine k-nearest neighbor. The selected features of both CNN architectures are finally fused using a novel serial-controlled Rényi entropy technique, and the fused feature vector is passed to neural networks for final classification. The experimental process was conducted on two publicly available datasets and obtained improved accuracies of 97.7% and 97.3%, respectively. In addition, a detailed comparison with several recent techniques shows improved performance. [ABSTRACT FROM AUTHOR] (A hedged sketch of annealing-based feature selection follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
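The annealing-based selection can be sketched as a bit-flip search over a feature mask. "Position shuffling" is read here as flipping a few mask positions, a logistic-regression fitness stands in for the paper's ELM, and all constants are illustrative.

```python
# Minimal simulated-annealing feature selection: flip a few mask bits,
# accept worse moves with a temperature-controlled probability.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

def fit_score(mask, X, y):
    if mask.sum() == 0:
        return 0.0
    clf = LogisticRegression(max_iter=500)   # stand-in for the paper's ELM
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

def sa_select(X, y, n_iters=100, t0=0.1, cooling=0.95):
    cur = rng.random(X.shape[1]) < 0.5
    cur_fit = fit_score(cur, X, y)
    best, best_fit, t = cur.copy(), cur_fit, t0
    for _ in range(n_iters):
        cand = cur.copy()
        flips = rng.choice(X.shape[1], size=3, replace=False)
        cand[flips] = ~cand[flips]           # "shuffle" a few positions
        cand_fit = fit_score(cand, X, y)
        if cand_fit > cur_fit or rng.random() < np.exp((cand_fit - cur_fit) / t):
            cur, cur_fit = cand, cand_fit
        if cur_fit > best_fit:
            best, best_fit = cur.copy(), cur_fit
        t *= cooling                         # anneal the temperature
    return best
```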
50. TCM-Net: Mixed Global–Local Learning for Salient Object Detection in Optical Remote Sensing Images
- Author
-
Junkang He, Lin Zhao, Wenjing Hu, Guoyun Zhang, Jianhui Wu, and Xinping Li
- Subjects
optical remote sensing images (ORSIs) ,salient object detection (SOD) ,global context ,local detail ,features fusion ,Science - Abstract
Deep-learning methods have made significant progress in salient object detection in optical remote sensing images (ORSI-SOD). However, existing methods struggle to exploit both the multi-scale global context and local detail features effectively, owing to the cluttered backgrounds and varied object scales that characterize ORSIs. To solve this, we propose a transformer-and-convolution mixed network (TCM-Net) with a U-shaped codec architecture for ORSI-SOD. A dual-path complementary network obtains both global context and local detail information from ORSIs of different resolutions. A local and global feature fusion module integrates the information at the corresponding decoder layers, and an attention gate module refines features while suppressing noise at each decoder layer. Finally, we tailor a hybrid loss function to the network structure, incorporating three supervision strategies: global, local, and output. Extensive experiments on three common datasets show that TCM-Net outperforms 17 state-of-the-art methods. (A hedged sketch of decoder-side fusion and gating follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
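The decoder-side fusion-plus-gating pattern can be sketched in a few PyTorch lines. This illustrates the general idea of fusing a local (convolutional) and a global (transformer-path) feature map and then gating the result; it is not TCM-Net's exact module.

```python
# Hedged sketch of local/global feature fusion followed by an attention gate.
import torch
import torch.nn as nn

class FuseAndGate(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.fuse = nn.Conv2d(2 * c, c, 3, padding=1)   # local+global fusion
        self.gate = nn.Sequential(nn.Conv2d(c, 1, 1), nn.Sigmoid())

    def forward(self, local_feat, global_feat):
        f = torch.relu(self.fuse(torch.cat([local_feat, global_feat], dim=1)))
        return f * self.gate(f)                         # suppress background

# out = FuseAndGate(32)(torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64))
```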