745 results on '"MobileNetv2"'
Search Results
2. Predicting Salinity Resistance of Rice at the Seedling Stage: An Evaluation of Transfer Learning Methods
- Author
-
Shiragudikar, Sharada K., Bharamagoudar, Geeta, K., Manohara K., Y., Malathi S., G.Totad, Shashikumar, Li, Gang, Series Editor, Filipe, Joaquim, Series Editor, Ghosh, Ashish, Series Editor, Xu, Zhiwei, Series Editor, T., Shreekumar, editor, L., Dinesha, editor, and Rajesh, Sreeja, editor
- Published
- 2025
- Full Text
- View/download PDF
3. Predictive Models for the Early Diagnosis and Prognosis of Knee Osteoarthritis Using Deep Learning Techniques
- Author
-
Malathi, S. Y., Bharamagoudar, Geeta, Shiragudikar, Sharada K., G. Totad, Shashikumar, Li, Gang, Series Editor, Filipe, Joaquim, Series Editor, Ghosh, Ashish, Series Editor, Xu, Zhiwei, Series Editor, T., Shreekumar, editor, L., Dinesha, editor, and Rajesh, Sreeja, editor
- Published
- 2025
- Full Text
- View/download PDF
4. Segmentation and Classification of Unharvested Arecanut Bunches Using Deep Learning
- Author
-
Dhanesha, R., Umesha, D. K., Hiremath, Gurudeva Shastri, Girish, G. N., Shrinivasa Naika, C. L., Li, Gang, Series Editor, Filipe, Joaquim, Series Editor, Ghosh, Ashish, Series Editor, Xu, Zhiwei, Series Editor, T., Shreekumar, editor, L., Dinesha, editor, and Rajesh, Sreeja, editor
- Published
- 2025
- Full Text
- View/download PDF
5. FaceEvoke: Eliciting Emotions Through Facial Analysis
- Author
-
Gupta, Aayushi, Srivastava, Ayushya, Shukla, Manoj Kumar, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Khurana, Meenu, editor, Thakur, Abhishek, editor, Kantha, Praveen, editor, Shieh, Chin-Shiuh, editor, and Shukla, Rajesh K., editor
- Published
- 2025
- Full Text
- View/download PDF
6. Enhancing Monkeypox Disease Detection Using Computer Vision-Based Approaches and Deep Learning
- Author
-
Ahmed, Imtiaj, Rayan, Tihany, Sayma Akter, Mahmud, Adnan, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Bairwa, Amit Kumar, editor, Tiwari, Varun, editor, Vishwakarma, Santosh Kumar, editor, Tuba, Milan, editor, and Ganokratanaa, Thittaporn, editor
- Published
- 2025
- Full Text
- View/download PDF
7. Mobile-YOLO-SDD: A Lightweight YOLO for Real-time Steel Defect Detection.
- Author
-
Luo, Shen, Xu, Yuanping, Zhu, Ming, Zhang, Chaolong, Kong, Chao, Jin, Jin, Li, Tukun, Jiang, Xiangqian, and Guo, Benjun
- Abstract
Defect detection is essential in the steel production process. Recent years have seen significant advancements in steel surface defect detection based on deep learning methods, notably exemplified by the YOLO series models capable of precise and rapid detection. However, challenges arise due to the high complexity of surface textures on steel and the low recognition rates for minor defects, making real-time and accurate detection difficult. This study introduces Mobile-YOLO-SDD (Steel Defect Detection), a lightweight YOLO-based model designed for high-accuracy, real-time steel defect detection. First, starting from the YOLOv5 algorithm, which is effective for steel defect detection, the backbone network was replaced with MobileNetV2 to reduce the model size and computational complexity. Then, the ECA (Efficient Channel Attention) module was integrated into the C3 module to further reduce the number of parameters while maintaining the defect detection rate in complex backgrounds. Finally, the K-Means++ algorithm was used to regenerate anchor boxes and determine their optimal sizes, enhancing their adaptability to actual targets. Experimental results on the NEU-DET dataset demonstrate that the improved algorithm achieves a 60.6% reduction in model size, a 60.8% reduction in FLOPs, and a 1.8% improvement in mAP compared to YOLOv5s. These results confirm the effectiveness of Mobile-YOLO-SDD and lay the foundation for subsequent lightweight deployment of steel defect detection models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
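The K-Means++ anchor regeneration described in this abstract can be sketched as D²-weighted seeding over (width, height) box pairs. This is a minimal, hypothetical Python illustration, not the paper's implementation; the sample boxes are made up:

```python
import random

def kmeans_pp_init(boxes, k, seed=0):
    """k-means++ seeding: pick k initial anchor (w, h) centers, each new
    center chosen with probability proportional to its squared distance
    from the nearest center already selected (D^2 sampling)."""
    rng = random.Random(seed)
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        # squared distance of every box to its nearest existing center
        d2 = [min((w - cw) ** 2 + (h - ch) ** 2 for cw, ch in centers)
              for w, h in boxes]
        total = sum(d2)
        r = rng.uniform(0, total)
        acc = 0.0
        for box, dist in zip(boxes, d2):
            acc += dist
            if acc >= r:
                centers.append(box)
                break
    return centers

# toy (width, height) pairs standing in for ground-truth defect boxes
boxes = [(10, 12), (11, 13), (50, 60), (52, 58), (100, 30)]
anchors = kmeans_pp_init(boxes, k=3)
```

A full anchor pipeline would follow this seeding with Lloyd-style refinement, typically under an IoU distance rather than Euclidean distance.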
8. Age-Invariant Cross-Age Face Verification using Transfer Learning.
- Author
-
Russel, Newlin Shebiah, Selvaraj, Arivazhagan, S., Dhanya Devi, and M., Dhivyarupini
- Subjects
- *
FEATURE extraction , *SKIN aging , *SECURITY systems , *HISTOGRAMS , *SWINE - Abstract
The integration of face verification technology has become indispensable in numerous safety and security software systems. Despite its promising results, the field of face verification encounters significant challenges due to age-related disparities. Human facial characteristics undergo substantial transformations over time, leading to diverse variations including changes in facial texture, morphology, facial hair, and eyeglass adoption. This study presents a pioneering methodology for cross-age face verification, utilizing advanced deep learning techniques to extract resilient and distinctive facial features that are less susceptible to age-related fluctuations. The feature extraction process combines handcrafted features such as the Local Binary Pattern/Histogram of Oriented Gradients with deep features from MobileNetV2 and VGG-16 networks. Because the texture of the facial skin conveys age-related characteristics, the well-known texture feature extractors LBP and HoG are preferred. These features are concatenated to achieve fusion, and subsequent layers fine-tune them. Experimental validation utilizing the Cross-Age Celebrity Dataset demonstrates remarkable efficacy, achieving an accuracy of 98.32%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
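The handcrafted side of the fusion described in this abstract can be sketched with a basic 3×3 LBP operator, and the fusion itself is plain concatenation of feature vectors. A minimal sketch with toy values (not the paper's feature dimensions):

```python
def lbp_code(patch):
    """Basic 3x3 local binary pattern: compare the 8 neighbours with the
    centre pixel (clockwise from the top-left) and pack the bits into a byte."""
    c = patch[1][1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, col) in enumerate(order):
        if patch[r][col] >= c:
            code |= 1 << bit
    return code

# concatenation fusion: handcrafted descriptor + deep embedding (toy values)
handcrafted = [0.2, 0.8]      # e.g. a (tiny) LBP histogram
deep_feat = [0.1, 0.5, 0.4]   # e.g. a slice of a MobileNetV2 embedding
fused = handcrafted + deep_feat
```

In practice the LBP codes over a full image are histogrammed before fusion, and the concatenated vector feeds the fine-tuned dense layers the abstract mentions.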
9. Multimodal Fusion Network for Crack Segmentation with Modified U-Net and Transfer Learning–Based MobileNetV2.
- Author
-
Qiu, Shi, Zaheer, Qasim, Ehsan, Haleema, Hassan Shah, Syed Muhammad Ahmed, Ai, Chengbo, Wang, Jin, and Zheng, Allen A.
- Subjects
INFRASTRUCTURE (Economics) ,AWARENESS - Abstract
This study introduces a state-of-the-art methodology for addressing crack segmentation challenges in structural health monitoring, a crucial concern in infrastructure maintenance. The main objective is to enhance real-time crack monitoring through a cutting-edge multimodal fusion approach that intricately combines a modified U-Net with transfer learning-based MobileNetV2. This integration strategically amalgamates spatial awareness and long-range dependency capture, resulting in an advanced model for crack segmentation. Thorough evaluations of a specialized crack detection data set underscore the efficacy of this integrated approach, positioning it as a reliable solution for real-time crack monitoring. Notably, the choice of MobileNetV2, recognized for its efficiency with the least parameters, contributes to the fusion's effectiveness. This study reveals superior performance, particularly when MobileNetV2 is integrated with U-Net, demonstrating enhanced accuracy and Intersection over Union (IOU) scores. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
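The Intersection over Union (IoU) score this abstract reports can be computed directly from binary segmentation masks. A minimal sketch on flat 0/1 lists (real pipelines operate on 2D arrays, but the metric is identical):

```python
def iou(pred, target):
    """Intersection over Union for binary masks given as flat 0/1 lists."""
    inter = sum(p & t for p, t in zip(pred, target))
    union = sum(p | t for p, t in zip(pred, target))
    return inter / union if union else 1.0  # two empty masks agree perfectly

pred = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
score = iou(pred, target)  # inter = 2, union = 4 → 0.5
```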
10. Braille code classifications tool based on computer vision for visual impaired.
- Author
-
Sadak, Hany M., Khalaf, Ashraf A. M., Hussein, Aziza I., and Salama, Gerges Mansour
- Subjects
COMPUTER vision ,ELECTRIC circuits ,ASSISTIVE technology ,PEOPLE with visual disabilities ,ELECTRONIC circuits ,DEEP learning - Abstract
Blind and visually impaired people (VIP) face many challenges in writing, as they usually use traditional tools such as the slate and stylus or expensive typewriters such as the Perkins Brailler, often causing accessibility and affordability issues. This article introduces a novel portable, cost-effective device that helps VIP learn to write by utilizing a deep-learning model to detect a Braille cell. Using deep learning instead of electrical circuits can reduce costs and enable a mobile app to act as a virtual teacher for blind users. The app could suggest sentences for the user to write and check their work, providing an independent learning platform. This feature is difficult to implement when using electronic circuits. A portable device generates Braille character cells using light-emitting diode (LED) arrays instead of Braille holes. A smartphone camera captures the image, which is then processed by a deep learning model to detect the Braille and convert it to English text. This article provides a new dataset of custom Braille character cells. Moreover, applying a transfer learning technique to the MobileNetV2 model offers a basis for the development of a comprehensive mobile application. The model reached an accuracy of 97%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
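The final step of such a pipeline, converting a detected dot pattern into text, can be sketched as a lookup from lit dots to letters. The five-letter table below is a tiny illustrative subset of the standard six-dot Braille alphabet (dots 1–3 run down the left column, 4–6 down the right); it is not the paper's decoder:

```python
# partial standard-Braille table: set of lit dots -> English letter
BRAILLE = {
    frozenset({1}): 'a',
    frozenset({1, 2}): 'b',
    frozenset({1, 4}): 'c',
    frozenset({1, 4, 5}): 'd',
    frozenset({1, 5}): 'e',
}

def decode_cell(lit_dots):
    """Map a detected set of lit LED dots to its letter ('?' if unknown)."""
    return BRAILLE.get(frozenset(lit_dots), '?')

# e.g. the classifier detects three cells in sequence
word = ''.join(decode_cell(c) for c in [{1, 2}, {1}, {1, 4}])  # → "bac"
```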
11. Two-Stream Modality-Based Deep Learning Approach for Enhanced Two-Person Human Interaction Recognition in Videos.
- Author
-
Akash, Hemel Sharker, Rahim, Md Abdur, Miah, Abu Saleh Musa, Lee, Hyoun-Sup, Jang, Si-Woong, and Shin, Jungpil
- Subjects
- *
PATTERN recognition systems , *SOCIAL interaction , *FEATURE extraction , *VISUAL fields , *RELIABILITY in engineering , *DEEP learning - Abstract
Human interaction recognition (HIR) between two people in videos is a critical field in computer vision and pattern recognition, aimed at identifying and understanding human interactions and actions for applications such as healthcare, surveillance, and human–computer interaction. Despite its significance, video-based HIR faces challenges in achieving satisfactory performance due to the complexity of human actions, variations in motion, different viewpoints, and environmental factors. In this study, we proposed a two-stream deep learning-based HIR system to address these challenges and improve the accuracy and reliability of HIR systems. In this approach, the two streams extract hierarchical features based on skeleton and RGB information, respectively. In the first stream, we utilised YOLOv8-Pose for human pose extraction, then extracted features with three stacked LSTM modules and enhanced them with a dense layer, which is considered the final feature of the first stream. In the second stream, we applied the Segment Anything Model (SAM) to the input videos, and after filtering the SAM features, we employed integrated LSTM and GRU modules to extract long-range dependency features and then enhanced them with a dense layer, which was considered the final feature of the second stream. Here, SAM was utilised for segmented mesh generation, and an ImageNet-based network was used for feature extraction from images or meshes, focusing on extracting relevant features from sequential image data. Moreover, we created a custom filter function to enhance computational efficiency and eliminate irrelevant keypoints and mesh components from the dataset. We concatenated the two stream features to produce the final feature that fed into the classification module. In extensive experiments on two benchmark datasets, the proposed model achieved 96.56% and 96.16% accuracy, respectively. The high accuracy of the proposed model demonstrates its superiority. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Deep Learning-Based Surface Defect Detection in Steel Products Using Convolutional Neural Networks.
- Author
-
Khan, Irfan Ullah, Aslam, Nida, Aboulnour, Menna, Bashamakh, Asma, Alghool, Fatima, Alsuwayan, Noorah, Alturaif, Rawaa, Gull, Hina, Iqbal, Sardar Zafar, and Hussain, Tariq
- Abstract
In mechanical engineering, monitoring steel surface defects is crucial for ensuring the quality of industrial products, as these defects account for over 90% of flaws in steel items. Traditional manual inspection methods are time-consuming and may overlook some defects. To address these challenges, this study introduces an automated deep learning (DL) model for continuous monitoring of steel surface defects using real-world images from the Industrial Machine Tool Component Surface Defect (IMTCSD) dataset, which includes 1,104 three-channel images, 394 of which are categorized as exhibiting "pitting" damage. This study evaluated several Convolutional Neural Network (CNN) classifiers, EfficientNetB3, ResNet-50, and MobileNetV2, to determine the most effective model for defect detection. EfficientNetB3 is distinguished by its scalable architecture that adapts efficiently across various image dimensions, making it ideal for high-accuracy applications on limited computational resources. ResNet-50 uses residual connections to maintain performance in deeper networks by facilitating smooth gradient flow, yet it requires more computational power. MobileNetV2, designed for real-time applications on devices with limited resources, uses lightweight depthwise separable convolutions. The performance of these models was assessed using accuracy, recall, precision, specificity, F1-score, and AUC metrics. EfficientNetB3 emerged as the best-performing model, achieving an accuracy of 0.981, specificity of 0.975, recall of 0.987, precision of 0.975, and an F1-score of 0.982. This model proved effective in detecting defects even on dirty surfaces, demonstrating its potential to significantly enhance quality control in industrial settings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
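The metrics used in this comparison (accuracy, recall, precision, specificity, F1-score) all derive from the four confusion-matrix counts. A minimal sketch with made-up counts, not the paper's results:

```python
def classification_metrics(tp, fp, tn, fn):
    """Binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # a.k.a. sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "specificity": specificity, "f1": f1}

m = classification_metrics(tp=9, fp=1, tn=8, fn=2)
```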
13. Rice Leaf Disease Classification—A Comparative Approach Using Convolutional Neural Network (CNN), Cascading Autoencoder with Attention Residual U-Net (CAAR-U-Net), and MobileNet-V2 Architectures.
- Author
-
Dutta, Monoronjon, Islam Sujan, Md Rashedul, Mojumdar, Mayen Uddin, Chakraborty, Narayan Ranjan, Marouf, Ahmed Al, Rokne, Jon G., and Alhajj, Reda
- Subjects
CONVOLUTIONAL neural networks ,RICE diseases & pests ,MACHINE learning ,AGRICULTURAL technology ,COMPARATIVE method ,DEEP learning - Abstract
Classifying rice leaf diseases in agricultural technology helps to maintain crop health and to ensure a good yield. In this work, deep learning algorithms were, therefore, employed for the identification and classification of rice leaf diseases from images of crops in the field. The initial algorithmic phase involved image pre-processing of the crop images, using a bilateral filter to improve image quality. The effectiveness of this step was measured by using metrics like the Structural Similarity Index (SSIM) and the Peak Signal-to-Noise Ratio (PSNR). Following this, this work employed advanced neural network architectures for classification, including Cascading Autoencoder with Attention Residual U-Net (CAAR-U-Net), MobileNetV2, and Convolutional Neural Network (CNN). The proposed CNN model stood out, since it demonstrated exceptional performance in identifying rice leaf diseases, with test Accuracy of 98% and high Precision, Recall, and F1 scores. This result highlights that the proposed model is particularly well suited for rice leaf disease classification. The robustness of the proposed model was validated through k-fold cross-validation, confirming its generalizability and minimizing the risk of overfitting. This study not only focused on classifying rice leaf diseases but also has the potential to benefit farmers and the agricultural community greatly. This work highlights the advantages of custom CNN models for efficient and accurate rice leaf disease classification, paving the way for technology-driven advancements in farming practices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
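The PSNR metric used in this study to assess the bilateral-filtering step is defined as 10·log10(MAX²/MSE). A minimal sketch on flat pixel lists with an assumed 8-bit range:

```python
import math

def psnr(original, processed, max_val=255.0):
    """Peak Signal-to-Noise Ratio between two equal-sized images
    given as flat pixel lists: 10 * log10(MAX^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, processed)) / len(original)
    return float('inf') if mse == 0 else 10 * math.log10(max_val ** 2 / mse)

v = psnr([255, 0], [250, 0])  # MSE = 12.5, so roughly 37 dB
```

Higher values indicate the filtered image stays closer to the original; identical images yield an infinite PSNR.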
14. Segmentation and classification of white blood SMEAR images using modified CNN architecture.
- Author
-
Kumar, Indrajeet and Rawat, Jyoti
- Abstract
The classification and recognition of leukocytes (WBCs) in blood smear images plays a key role in the diagnosis of specific diseases such as leukemia, tumors, and hematological disorders. A computerized framework for automated segmentation and classification of the WBC nucleus plays an important role in the recognition of WBC-related disorders. Therefore, this work addresses WBC nucleus segmentation using a modified U-Net architecture, and the segmented WBC nuclei are further classified into their subcategories, i.e., basophil, eosinophil, neutrophil, monocyte, and lymphocyte. The classification and nucleus characterization task was performed using the VGGNet and MobileNetV2 architectures. Initially, collected instances are passed to the preprocessing phase for image rescaling and normalization. The rescaled and normalized instances are passed to the U-Net model for nucleus segmentation. Extracted nuclei are forwarded to the classification phase for class identification. Furthermore, the performance of the proposed design is compared with other modern methods. By the end of this study, a successful model classifying the various nucleus morphologies (basophil, eosinophil, lymphocyte, monocyte, and neutrophil) was obtained, with an overall test accuracy of 97.0% for the VGGNet classifier and 94.0% for the MobileNetV2 classifier. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Multistage traffic sign recognition under harsh environment.
- Author
-
Chandnani, Manali, Shukla, Sanyam, and Wadhvani, Rajesh
- Subjects
EXTREME weather ,TRAFFIC signs & signals ,CONVOLUTIONAL neural networks ,TRAFFIC noise - Abstract
This paper examines the impact of rain on traffic sign recognition systems, addressing one of the challenges posed by harsh environmental conditions such as low lighting, extreme weather (rain, fog, snow), and reduced sign visibility. A novel system is proposed in this work that is capable of handling three different rain types (drizzle, torrential, and heavy). This work explores how different rain types affect the training and testing of three customized convolutional neural networks for traffic sign recognition. Results show that the system's performance depends on the rain type present during training and testing. To address this variability, a multistage classifier is proposed: level 1 classifies the rain type, and level 2 selects an appropriate traffic sign classifier based on the output of level 1. This work also analyzes the effect of augmenting three different types of rain for developing a noise-robust traffic sign recognition system. Experiments were conducted using the publicly available German Traffic Sign Recognition Benchmark dataset. The proposed system attains an overall classification accuracy of 95.10% measured through the accuracy score metric, which has not previously been reported in work addressing the challenge of recognizing traffic signs in the presence of different types of rain. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
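The two-level dispatch this abstract describes — classify the rain type first, then route the image to a rain-specific sign classifier — can be sketched with stub classifiers. All labels below are placeholders, not the paper's classes or models:

```python
def classify_rain(image):
    """Level-1 stub: the paper uses a CNN here; this placeholder just
    reads a pre-set tag so the dispatch logic can be shown end to end."""
    return image["rain"]

# level 2: one specialised sign classifier per rain type (stub lambdas,
# hypothetical labels -- not the GTSRB classes)
SIGN_CLASSIFIERS = {
    "drizzle": lambda img: "speed_limit",
    "heavy": lambda img: "stop",
    "torrential": lambda img: "yield",
}

def recognize_sign(image):
    rain_type = classify_rain(image)           # stage 1
    return SIGN_CLASSIFIERS[rain_type](image)  # stage 2

label = recognize_sign({"rain": "heavy"})  # → "stop"
```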
16. Multi-label dental disorder diagnosis based on MobileNetV2 and swin transformer using bagging ensemble classifier.
- Author
-
Alsakar, Yasmin M., Elazab, Naira, Nader, Nermeen, Mohamed, Waleed, Ezzat, Mohamed, and Elmogy, Mohammed
- Subjects
- *
TRANSFORMER models , *DENTAL radiography , *MACHINE learning , *X-ray imaging , *FEATURE extraction - Abstract
Dental disorders are common worldwide, causing pain or infections and limiting mouth opening, so dental conditions impact productivity, work capability, and quality of life. Manual detection and classification of oral diseases is time-consuming and requires dentists' evaluation and examination. A dental disease detection and classification system based on machine learning and deep learning will aid in early dental disease diagnosis. Hence, this paper proposes a new diagnosis system for dental diseases using X-ray imaging. The framework includes a robust pre-processing phase that uses image normalization and adaptive histogram equalization to improve image quality and reduce variation. A dual-stream approach is used for feature extraction, utilizing the advantages of the Swin Transformer for capturing long-range dependencies and global context and MobileNetV2 for effective local feature extraction. A thorough representation of dental anomalies is produced by fusing the extracted features. Finally, to obtain reliable and broadly applicable classification results, a bagging ensemble classifier is utilized. We evaluate our model on a benchmark dental radiography dataset. The experimental results and comparisons show the superiority of the proposed system, with 95.7% precision, 95.4% sensitivity, 95.7% specificity, a 95.5% Dice similarity coefficient, and 95.6% accuracy. The results demonstrate the effectiveness of our hybrid model integrating the MobileNetV2 and Swin Transformer architectures, outperforming state-of-the-art techniques in classifying dental diseases using dental panoramic X-ray imaging. This framework presents a promising method for robustly and accurately diagnosing dental diseases automatically, which may help dentists plan treatments and identify dental diseases early on. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
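The bagging ensemble step at the end of this pipeline can be sketched as bootstrap sampling plus majority voting over base classifiers. The base models below are stand-in stubs, not the paper's classifiers:

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """One 'bag': sample len(data) items with replacement."""
    return [rng.choice(data) for _ in data]

def bagging_predict(models, x):
    """Majority vote over the base classifiers' predictions."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]

# stub base classifiers standing in for models trained on different bags
models = [lambda x: "caries", lambda x: "caries", lambda x: "healthy"]
label = bagging_predict(models, x=None)  # → "caries" (2 votes to 1)
bag = bootstrap_sample([1, 2, 3, 4], random.Random(0))
```

In the real system each base classifier would be trained on its own bootstrap sample of the fused Swin/MobileNetV2 feature vectors.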
17. Weed detection and classification in sesame crops using region-based convolution neural networks.
- Author
-
Naik, Nenavath Srinivas and Chaubey, Harshit Kumar
- Subjects
- *
CONVOLUTIONAL neural networks , *OBJECT recognition (Computer vision) , *WEED control , *COMPUTER vision , *AGRICULTURE - Abstract
Farming has many moving parts, including planting, watering, harvesting, and more. One of the most complex and time-consuming tasks is monitoring and controlling weeds that can ruin a harvest. Unwanted weeds decrease crop productivity by competing with desired agricultural plants for water, sunshine, and soil nutrients. This research aims to use Region-Based Convolutional Neural Networks (RCNNs) to detect weeds in photographs of sesame crops and then classify them into their respective weed families. Object detection is a promising application of deep learning, and the suggested method takes advantage of RCNNs, a prominent approach. By applying RCNNs to sesame crop images, we could accurately identify the presence of weeds, achieving an impressive detection accuracy of 96.84%. This high accuracy can significantly aid farmers in pinpointing areas of their fields that require immediate attention and weed management strategies. Furthermore, after successfully detecting weeds, we classified them into different types. This classification step is crucial, as different weed species require specific control measures. Our proposed methodology achieved an outstanding weed classification accuracy of 97.79%. By correctly categorizing weeds, farmers gain a better understanding of the weed composition in their fields, making it easier to use targeted control methods and lessen the use of potentially dangerous broad-spectrum chemicals. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. MobileNetV2-Incep-M: a hybrid lightweight model for the classification of rice plant diseases.
- Author
-
Arya, Akash and Mishra, Pankaj Kumar
- Subjects
RICE diseases & pests ,PLANT diseases ,PLANT classification ,NOSOLOGY ,AGRICULTURAL productivity - Abstract
The complex structure of automatic rice disease detection models results in delays in identifying diseases and may require higher computational power. To overcome this challenge, we introduce a novel lightweight model called MobileNetV2-Incep-M. MobileNetV2-Incep-M is designed for rice plant disease classification, aiming to balance efficiency and performance. It combines MobileNetV2 with a single Inception module to create a lightweight architecture. Leveraging transfer learning, the model is initialized with MobileNetV2 weights pre-trained on ImageNet. The Inception module is seamlessly integrated, followed by a max pooling layer for downsampling and parameter reduction. Lastly, a flatten layer and a fully connected layer are added for classification. During the training phase, we used k-fold cross-validation to reduce training bias. The proposed model attained a maximum testing accuracy of 98.75% and a testing loss of 0.0302, and is characterized by a minimal 2,502,468 training parameters, with an average training duration of 464.85 s. We evaluated the proposed model by comparing it with five other models, namely InceptionV3, VGG19, MobileNet, MobileNetV2, and DenseNet201. The dataset consists of 5624 images covering Bacterial Blight, Leaf Blast, Brown Spot, and Healthy classes. The proposed model outperforms the other models, achieving higher accuracy and improved detection of rice plant diseases. Such a lightweight model can contribute to the early identification and effective management of rice plant diseases, which can have a substantial impact on agricultural productivity and food security worldwide. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
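The k-fold cross-validation used during training partitions the data into k folds and holds out one fold at a time for validation. A minimal index-based sketch (contiguous folds for simplicity; real splits are usually shuffled or stratified):

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds and yield
    (train_indices, val_indices) pairs, one per fold."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    for i, val in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val

splits = list(kfold_indices(10, 5))  # 5 (train, val) pairs over 10 samples
```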
19. Rice Leaf Diseases Classification Using Deep Learning Algorithms for Smartphone Application Development: An Empirical Study.
- Author
-
Suting, Albania, Kumar, Ansuman, Halder, Anindya, and Chanu, Leimapokpam Lousingkhombi
- Subjects
- *
MACHINE learning , *STATISTICAL hypothesis testing , *EARLY diagnosis , *MOBILE apps , *DEEP learning , *NOSOLOGY - Abstract
Rice is the most widely consumed grain across the world. Rice plants often suffer from diseases. Early detection of such diseases and the adoption of remedial measures can help farmers avoid major losses and produce best-quality crops in large quantities. However, conventional rice leaf disease detection techniques are often inaccurate, time-consuming, and sometimes require laboratory testing. In this context, automatic rice leaf disease detection techniques are presented based on various deep learning classifiers (namely MobileNetV2, ResNet50, VGG16, and LeNet-5), and an Android application is also developed to instantly determine the possible rice diseases from uploaded rice leaf images captured by a smartphone. The developed models are tested using a publicly available benchmark rice leaf dataset containing three types of rice leaf diseases, namely bacterial leaf blight, leaf smut, and brown spot. Experimental results show that the MobileNetV2 model performed better than the other models in terms of classification accuracy, recall, and F1-score. The results of a statistical significance test also confirmed the superiority of the MobileNetV2 model over the other compared deep learning models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Enhancing melanoma skin cancer classification through data augmentation.
- Author
-
M’hamedi, Mohammed, Merzoug, Mohammed, Hadjila, Mourad, and Bekkouche, Amina
- Subjects
- *
CONVOLUTIONAL neural networks , *DATA augmentation , *SKIN imaging , *SKIN cancer , *VISUAL learning - Abstract
Skin cancer is a dangerous and prevalent cancer illness. It is the abnormal growth of cells in the outermost layer of the skin. It has recently received tremendous attention, highlighting an urgent need to address this worldwide public health crisis. The purpose of this study is to propose a convolutional neural network (CNN) to help dermatology physicians in the inspection, identification, and diagnosis of skin cancer. More precisely, we offer an automated method that leverages deep learning techniques to categorize binary categories of skin lesions. Our technique enlarges the skin cancer dataset by utilizing data pre-processing and augmentation to address the imbalanced class problem. Subsequently, fine-tuning is conducted on the pre-trained Visual Geometry Group (VGG-19) and MobileNetV2 models to extract and classify image features using transfer learning. The model is tested on the Society for Imaging Informatics in Medicine International Skin Imaging Collaboration (SIIM-ISIC) 2020 dataset and achieved an accuracy of 95.16%, sensitivity of 90.83%, specificity of 99.2%, area under the curve (AUC) of 97.57%, and precision of 99.06%. The proposed model based on MobileNetV2 outperforms the other techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
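The augmentation used in this study to balance the minority class can be as simple as adding mirrored copies of existing images. A minimal sketch with a horizontal flip on a toy 2×2 "image" (real pipelines also rotate, crop, and jitter):

```python
def hflip(img):
    """Horizontal flip of an image stored as a 2D list of pixel rows."""
    return [row[::-1] for row in img]

def augment(dataset):
    """Minimal class-balancing augmentation: append a mirrored copy of
    every image, doubling the under-represented class."""
    return dataset + [hflip(img) for img in dataset]

imgs = [[[1, 2], [3, 4]]]   # one toy 2x2 'image'
aug = augment(imgs)         # original plus its mirror
```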
21. Transfer Learning-Based Skin Tumor Identification Improvement.
- Author
-
Maharani, Shafira and Utari, Dina Tri
- Subjects
CONVOLUTIONAL neural networks ,SKIN tumors ,DEATH rate ,SKIN cancer ,TREATMENT effectiveness - Abstract
This study addresses the critical challenge of accurately identifying skin disorders as benign, malignant, or non-tumors, which is essential for timely and successful treatment. Early identification can greatly minimize tumor development and cut fatality rates. Given the high costs involved with standard medical detection approaches, this research investigates using sophisticated Convolutional Neural Networks (CNNs) with transfer learning to categorize skin malignancies efficiently. Specifically, the study assesses the performance of the MobileNetV2, VGG16, and VGG19 architectures. The primary objective is to determine which model achieves the highest accuracy in classifying skin cancers. Our findings reveal that while a standard CNN reached an accuracy of 62.2%, the transfer learning models greatly outperformed it, with MobileNetV2 achieving the highest accuracy at 93.9%, followed by VGG19 at 90.0% and VGG16 at 88.9%. These results imply that MobileNetV2 is the most successful solution for this task, since it consistently obtained a prediction accuracy of 90% for both in-dataset and out-of-dataset images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. A Comparison of Convolutional Neural Network (CNN) and Transfer Learning MobileNetV2 Performance on Spices Images Classification
- Author
-
Khoirizqi Velarati, Christy Atika Sari, and Eko Hari Rachmawanto
- Subjects
classification ,convolutional neural network ,mobilenetv2 ,transfer learning ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
This research was conducted to analyze the performance of the CNN algorithm without transfer learning in classifying spice images and to compare it with a CNN using transfer learning on the MobileNetV2 architecture. This comparison aims to evaluate both methods' accuracy, efficiency, and overall performance, and to analyze the impact of transfer learning on classification results in the context of spices. The dataset consists of 1500 spice images divided into 10 classes, with 150 images per class. In the first experiment, the CNN without transfer learning achieved 93% accuracy. In the second experiment, using MobileNetV2, accuracy increased, reaching 99% across all spice classes. The results of this study confirm that the MobileNetV2 architecture significantly improves the accuracy and performance of spice classification compared to a CNN without transfer learning, and it can be recommended for spice image classification.
- Published
- 2024
- Full Text
- View/download PDF
23. Segmentation and classification of white blood SMEAR images using modified CNN architecture
- Author
-
Indrajeet Kumar and Jyoti Rawat
- Subjects
White blood cells ,Segmentation ,Classification ,U-Net ,VGG-Net ,MobileNetV2 ,Science (General) ,Q1-390 - Abstract
Abstract The classification and recognition of leukocytes (WBCs) in blood smear images plays a key role in the diagnosis of specific diseases such as leukemia, tumors, and hematological disorders. A computerized framework for automated segmentation and classification of the WBC nucleus plays an important role in the recognition of WBC-related disorders. Therefore, this work addresses WBC nucleus segmentation using a modified U-Net architecture, and the segmented WBC nuclei are further classified into their subcategories, i.e., basophil, eosinophil, neutrophil, monocyte, and lymphocyte. The classification and nucleus characterization task was performed using the VGGNet and MobileNetV2 architectures. Initially, collected instances are passed to the preprocessing phase for image rescaling and normalization. The rescaled and normalized instances are passed to the U-Net model for nucleus segmentation. Extracted nuclei are forwarded to the classification phase for class identification. Furthermore, the performance of the proposed design is compared with other modern methods. By the end of this study, a successful model classifying the various nucleus morphologies (basophil, eosinophil, lymphocyte, monocyte, and neutrophil) was obtained, with an overall test accuracy of 97.0% for the VGGNet classifier and 94.0% for the MobileNetV2 classifier.
- Published
- 2024
- Full Text
- View/download PDF
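The preprocessing phase the abstract mentions (rescaling and normalization) can be sketched dependency-free. The nearest-neighbour resize and the 128-pixel target below are illustrative stand-ins; real pipelines would interpolate (e.g. bilinearly):

```python
import numpy as np

# Rescale to a fixed size and normalize 8-bit pixel values to [0, 1].
def preprocess(img, size=128):
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size     # nearest-neighbour row indices
    cols = np.arange(size) * w // size     # nearest-neighbour column indices
    resized = img[rows][:, cols]
    return resized.astype(float) / 255.0   # normalization step

# Hypothetical blood smear image of arbitrary input resolution.
smear = np.random.default_rng(4).integers(0, 256, size=(300, 400, 3))
x = preprocess(smear)
print(x.shape, float(x.min()) >= 0.0, float(x.max()) <= 1.0)
```

Every image then reaches the U-Net with identical shape and value range, which is what makes batched segmentation training possible.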
24. Multi-label dental disorder diagnosis based on MobileNetV2 and swin transformer using bagging ensemble classifier
- Author
-
Yasmin M. Alsakar, Naira Elazab, Nermeen Nader, Waleed Mohamed, Mohamed Ezzat, and Mohammed Elmogy
- Subjects
Dentistry ,MobileNetV2 ,Swin transformer ,Annotation ,Deep learning ,Feature extraction ,Medicine ,Science - Abstract
Abstract Dental disorders are common worldwide, causing pain or infection and limiting mouth opening; dental conditions therefore impact productivity, work capability, and quality of life. Manual detection and classification of oral diseases is time-consuming and requires dentists' evaluation and examination. A dental disease detection and classification system based on machine learning and deep learning will aid early diagnosis. Hence, this paper proposes a new diagnosis system for dental diseases using X-ray imaging. The framework includes a robust pre-processing phase that uses image normalization and adaptive histogram equalization to improve image quality and reduce variation. A dual-stream approach is used for feature extraction, utilizing the advantages of the Swin Transformer for capturing long-range dependencies and global context, and MobileNetV2 for effective local feature extraction. A thorough representation of dental anomalies is produced by fusing the extracted features. Finally, a bagging ensemble classifier is utilized to obtain reliable and broadly applicable classification results. We evaluate our model on a benchmark dental radiography dataset. The experimental results and comparisons show the superiority of the proposed system, with 95.7% precision, 95.4% sensitivity, 95.7% specificity, a 95.5% Dice similarity coefficient, and 95.6% accuracy. The results demonstrate the effectiveness of our hybrid model integrating the MobileNetV2 and Swin Transformer architectures, outperforming state-of-the-art techniques in classifying dental diseases using dental panoramic X-ray imaging. This framework presents a promising method for robustly and accurately diagnosing dental diseases automatically, which may help dentists plan treatments and identify dental diseases early on.
- Published
- 2024
- Full Text
- View/download PDF
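The dual-stream fusion plus bagging pipeline described above can be sketched with toy features. The least-squares ensemble members and the vote count below are illustrative choices, not the paper's actual classifiers:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical feature streams (stand-ins for MobileNetV2 local features
# and Swin Transformer global features).
n = 200
local_feats  = rng.normal(size=(n, 16))
global_feats = rng.normal(size=(n, 32))
fused = np.concatenate([local_feats, global_feats], axis=1)  # dual-stream fusion
y = (fused @ rng.normal(size=48) > 0).astype(int)            # toy labels

# Bagging: fit simple least-squares classifiers on bootstrap resamples,
# then combine their predictions by majority vote.
def fit_linear(X, t):
    return np.linalg.lstsq(X, 2.0 * t - 1.0, rcond=None)[0]

members = []
for _ in range(9):
    idx = rng.integers(0, n, size=n)                 # bootstrap sample
    members.append(fit_linear(fused[idx], y[idx]))

votes = np.stack([(fused @ w > 0).astype(int) for w in members])
pred = (votes.mean(axis=0) > 0.5).astype(int)        # majority vote
print("bagged training accuracy:", (pred == y).mean())
```

Concatenation keeps both feature views intact, and the bootstrap resampling is what gives bagging its variance-reduction effect over a single classifier.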
25. Diffusion model for multi-scale ship object detection and recognition in remote sensing images.
- Author
-
Chen, Lei, Wang, Bin, Liu, Ying, Zhao, Shuang, Guan, Qinghe, and Li, Guandian
- Abstract
Ship object detection and recognition in remote sensing images (RSIs) is a challenging task due to the multi-scale and complex-background characteristics of ship objects, and convolution-based methods cannot adequately solve these problems. First, this paper applies the diffusion model to ship object detection and recognition in RSIs, proposing a new diffusion model for multi-scale ship object detection and recognition in remote sensing images (MSDiffDet). Second, to reduce the loss of multi-scale information during feature extraction, the paper proposes the Channel Fusion FPN (CF-FPN) based on FPN and constructs the Large-Scale Feature Enhancement Module (LSFEM), which further enhances the algorithm's ability to extract large-scale ship object features and improves the detection accuracy of ship objects in RSIs. Finally, the paper prunes and reconstructs MobileNetV2 to obtain Sparse MobileNetV2, which is used as the backbone network of the image encoder, enhancing detection accuracy while reducing the overall parameter count of the algorithm. The experimental results demonstrate that the MSDiffDet algorithm is effective in detecting and recognizing four types of remote sensing ship objects: aircraft carriers, warships, commercial ships, and submarines. The mAP@0.5 reached a notable 89.8%, an improvement of 5.8% over the DiffusionDet algorithm, indicating the potential of the MSDiffDet algorithm for applications in remote sensing ship object detection and recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
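The mAP at the 0.5 threshold reported above counts a detection as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. A minimal IoU computation for axis-aligned boxes in (x1, y1, x2, y2) form, with hypothetical boxes:

```python
# IoU of two axis-aligned boxes given as (x1, y1, x2, y2).
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # overlap area (0 if disjoint)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

gt  = (10, 10, 50, 50)   # hypothetical ship ground truth
det = (20, 20, 60, 60)   # overlapping detection
print(round(iou(gt, det), 4))  # 0.3913: below 0.5, so counted as a miss
```

Raising the threshold from 0.5 tightens the localization requirement, which is why mAP at higher IoU thresholds is always lower or equal.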
26. A hybrid features fusion-based framework for classification of breast micronodules using ultrasonography
- Author
-
Mousa Alhajlah
- Subjects
Breast cancer detection ,Hybrid CNN framework ,InceptionV3 ,MobileNetV2 ,Computer-aided diagnosis (CAD) ,Ultrasonography ,Medical technology ,R855-855.5 - Abstract
Abstract Background Breast cancer is one of the leading diseases worldwide. According to estimates by the National Breast Cancer Foundation, over 42,000 women are expected to die from this disease in 2024. Objective The prognosis of breast cancer depends on the early detection of breast micronodules and the ability to distinguish benign from malignant lesions. Ultrasonography is a crucial radiological imaging technique for diagnosing the illness because it allows for biopsy and lesion characterization. The user’s level of experience and knowledge is vital since ultrasonographic diagnosis relies on the practitioner’s expertise. Furthermore, computer-aided technologies significantly contribute by potentially reducing the workload of radiologists and enhancing their expertise, especially when combined with a large patient volume in a hospital setting. Method This work describes the development of a hybrid CNN system for diagnosing benign and malignant breast cancer lesions. The models InceptionV3 and MobileNetV2 serve as the foundation for the hybrid framework. Features from these models are extracted and concatenated individually, resulting in a larger feature set. Finally, various classifiers are applied for the classification task. Results The model achieved the best results using the softmax classifier, with an accuracy of over 95%. Conclusion Computer-aided diagnosis greatly assists radiologists and reduces their workload. Therefore, this research can serve as a foundation for other researchers to build clinical solutions.
- Published
- 2024
- Full Text
- View/download PDF
27. Comparison of EfficientNetB7 and MobileNetV2 in Herbal Plant Species Classification Using Convolutional Neural Networks
- Author
-
Seno Arnandito and Theopilus Bayu Sasongko
- Subjects
efficientnetb7 ,mobilenetv2 ,convolutional neural networks ,herbal plant classification ,automatic plant recognition ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
This study compares the performance of EfficientNetB7 and MobileNetV2 in classifying herbal plant species using Convolutional Neural Networks (CNNs). The primary objective was to automatically identify herbal plant species with high accuracy. Based on the evaluation results, both EfficientNetB7 and MobileNetV2 achieved approximately 98% accuracy in recognizing herbal plant species. While both models demonstrated excellent performance in precision, recall, and F1-score for most plant species, EfficientNetB7 showed a slight edge in some evaluation metrics. These findings provide valuable insights into the potential implementation of CNN architectures in automatic plant recognition applications, particularly for developing widely applicable web-based systems for herbal plant identification.
- Published
- 2024
- Full Text
- View/download PDF
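The per-species precision, recall, and F1-score cited above are derived from confusion counts. A small worked example with hypothetical labels for one plant class:

```python
# Hypothetical binary labels for one herbal plant class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 0, 1]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

precision = tp / (tp + fp)                       # of predicted positives, correct
recall    = tp / (tp + fn)                       # of actual positives, found
f1        = 2 * precision * recall / (precision + recall)  # harmonic mean
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.8 0.8 0.8
```

Because F1 is the harmonic mean, a model cannot score well on it by inflating only precision or only recall, which is why the paper reports all three per species.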
28. Crack Detection in Building Through Deep Learning Feature Extraction and Machine Learning Approach
- Author
-
Afandi Nur Aziz Thohari, Aisyatul Karima, Kuwat Santoso, and Roselina Rahmawati
- Subjects
crack detection ,deep learning ,mobilenetv2 ,machine learning algorithm ,deployment ,raspberry pi ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Buildings with cracks are extremely hazardous because they have the potential to cause destruction, putting occupants of houses and other structures at risk. There are numerous techniques for identifying fractures in structures, including visual inspection, tool use, and expert inspection. The present study employed computer vision, a form of artificial intelligence, to detect cracks in buildings. The main objective of this research is to construct a prototype capable of real-time monitoring of cracks in building walls. The methodology combines machine learning and deep learning: deep learning is utilized for feature extraction, while machine learning is employed in the classification process. This research uses MobileNetV2 as its deep learning architecture and K-NN, Naive Bayes, SVM, XGBoost, and Random Forest as its machine learning classifiers. Test results show that with an 80:20 dataset split, the XGBoost algorithm produces the highest accuracy, sensitivity, and specificity values of 99%. Tests in a real environment were performed by deploying the system on a Raspberry Pi. The prototype can detect cracks in the structure surface at a distance of 10 meters in a bright environment, and the crack detection process runs in real time at an average speed of 42 fps.
- Published
- 2024
- Full Text
- View/download PDF
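The two-stage design described above (deep features from MobileNetV2, then a classical classifier) can be illustrated with a toy K-NN over hypothetical feature vectors; the feature values and cluster means below are made up for the sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-ins for MobileNetV2 feature vectors of wall images:
# class 0 = intact surface, class 1 = cracked surface (hypothetical clusters).
train_X = np.concatenate([rng.normal(0.0, 1.0, size=(30, 8)),
                          rng.normal(4.0, 1.0, size=(30, 8))])
train_y = np.array([0] * 30 + [1] * 30)

def knn_predict(x, k=5):
    # Classical ML stage: majority vote among the k nearest training features.
    d = np.linalg.norm(train_X - x, axis=1)
    nearest = train_y[np.argsort(d)[:k]]
    return int(nearest.sum() > k // 2)

probe = rng.normal(4.0, 1.0, size=8)   # feature vector of a new wall image
print(knn_predict(probe))              # likely 1 (crack)
```

Because K-NN (like the XGBoost classifier the paper favors) operates on fixed-length feature vectors, the heavy CNN runs once per image and the lightweight classifier does the rest, which suits Raspberry Pi deployment.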
29. A hybrid features fusion-based framework for classification of breast micronodules using ultrasonography.
- Author
-
Alhajlah, Mousa
- Subjects
COMPUTER-aided diagnosis ,EARLY detection of cancer ,BREAST cancer prognosis ,BREAST cancer ,RESEARCH personnel - Abstract
Background: Breast cancer is one of the leading diseases worldwide. According to estimates by the National Breast Cancer Foundation, over 42,000 women are expected to die from this disease in 2024. Objective: The prognosis of breast cancer depends on the early detection of breast micronodules and the ability to distinguish benign from malignant lesions. Ultrasonography is a crucial radiological imaging technique for diagnosing the illness because it allows for biopsy and lesion characterization. The user's level of experience and knowledge is vital since ultrasonographic diagnosis relies on the practitioner's expertise. Furthermore, computer-aided technologies significantly contribute by potentially reducing the workload of radiologists and enhancing their expertise, especially when combined with a large patient volume in a hospital setting. Method: This work describes the development of a hybrid CNN system for diagnosing benign and malignant breast cancer lesions. The models InceptionV3 and MobileNetV2 serve as the foundation for the hybrid framework. Features from these models are extracted and concatenated individually, resulting in a larger feature set. Finally, various classifiers are applied for the classification task. Results: The model achieved the best results using the softmax classifier, with an accuracy of over 95%. Conclusion: Computer-aided diagnosis greatly assists radiologists and reduces their workload. Therefore, this research can serve as a foundation for other researchers to build clinical solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. A Performance Evaluation of Convolutional Neural Network Architectures for Pterygium Detection in Anterior Segment Eye Images.
- Author
-
Moreno-Lozano, Maria Isabel, Ticlavilca-Inche, Edward Jordy, Castañeda, Pedro, Wong-Durand, Sandra, Mauricio, David, and Oñate-Andino, Alejandra
- Subjects
- *
CONVOLUTIONAL neural networks , *ANTERIOR eye segment , *DEEP learning , *PTERYGIUM , *OPHTHALMOLOGY - Abstract
In this article, various convolutional neural network (CNN) architectures for the detection of pterygium in the anterior segment of the eye are explored and compared. Five CNN architectures (ResNet101, ResNext101, Se-ResNext50, ResNext50, and MobileNet V2) are evaluated with the objective of identifying one that surpasses the precision and diagnostic efficacy of the current existing solutions. The results show that the Se-ResNext50 architecture offers the best overall performance in terms of precision, recall, and accuracy, with values of 93%, 92%, and 92%, respectively, for these metrics. These results demonstrate its potential to enhance diagnostic tools in ophthalmology. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Detection of Road Risk Sources Based on Multi-Scale Lightweight Networks.
- Author
-
Pang, Rong, Ning, Jiacheng, Yang, Yan, Zhang, Peng, Wang, Jilong, and Liu, Jingxiao
- Subjects
- *
PAVEMENTS , *COORDINATE transformations , *FEATURE extraction , *GRAYSCALE model , *ROAD safety measures - Abstract
Timely discovery and disposal of road risk sources constitute the cornerstone of road operation safety. Presently, the detection of road risk sources frequently relies on manual inspections via inspection vehicles, a process that is both inefficient and time-consuming. To tackle this challenge, this paper introduces a novel automated approach for detecting road risk sources, termed the multi-scale lightweight network (MSLN). This method primarily focuses on identifying road surfaces, potholes, and scattered objects. To mitigate the influence of real-world factors such as noise and uneven brightness on test results, pavement images were carefully collected. Initially, the collected images underwent grayscale processing. Subsequently, the median filtering algorithm was employed to filter out noise interference. Furthermore, adaptive histogram equalization techniques were utilized to enhance the visibility of cracks and the road background. Following these preprocessing steps, the MSLN model was deployed for the detection of road risk sources. Addressing the challenges associated with two-stage network models, such as prolonged training and testing times, as well as deployment difficulties, this study adopted the lightweight feature extraction network MobileNetV2. Additionally, transfer learning was incorporated to elevate the model's training efficiency. Moreover, this paper established a mapping relationship model that transitions from the world coordinate system to the pixel coordinate system. This model enables the calculation of risk source dimensions based on detection outcomes. Experimental results reveal that the MSLN model exhibits a notably faster convergence rate. This enhanced convergence not only boosts training speed but also elevates the precision of risk source detection. Furthermore, the proposed mapping relationship coordinate transformation model proves highly effective in determining the scale of risk sources. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
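The world-to-pixel mapping used above to size risk sources is, at its core, a pinhole-camera projection. A sketch with illustrative intrinsics (not the paper's calibration), assuming for simplicity that the camera and world frames coincide:

```python
import numpy as np

# Illustrative intrinsic matrix: focal lengths fx = fy = 800 px,
# principal point (cx, cy) = (320, 240).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

def world_to_pixel(P_world):
    p = K @ P_world          # project (camera frame assumed = world frame)
    return p[:2] / p[2]      # perspective divide by depth

# Two points 0.5 m apart at 10 m depth project 40 px apart, so a pixel
# measurement can be converted back to metric size at a known depth.
u1 = world_to_pixel(np.array([0.0, 0.0, 10.0]))
u2 = world_to_pixel(np.array([0.5, 0.0, 10.0]))
print(u2[0] - u1[0])  # 40.0
```

Inverting this relation (pixels times depth over focal length) is what lets the detector report pothole and debris dimensions in metres rather than pixels.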
32. A Lightweight Pathological Gait Recognition Approach Based on a New Gait Template in Side-View and Improved Attention Mechanism.
- Author
-
Li, Congcong, Wang, Bin, Li, Yifan, and Liu, Bo
- Subjects
- *
GAIT disorders , *PROBLEM solving , *INFORMATION processing , *PATHOLOGY , *CLASSIFICATION - Abstract
As people age, abnormal gait recognition becomes a critical problem in the field of healthcare. Currently, some algorithms can classify gaits with different pathologies, but they cannot guarantee high accuracy while keeping the model lightweight. To address these issues, this paper proposes a lightweight network (NSVGT-ICBAM-FACN) based on the new side-view gait template (NSVGT), improved convolutional block attention module (ICBAM), and transfer learning that fuses convolutional features containing high-level information and attention features containing semantic information of interest to achieve robust pathological gait recognition. The NSVGT contains different levels of information such as gait shape, gait dynamics, and energy distribution at different parts of the body, which integrates and compensates for the strengths and limitations of each feature, making gait characterization more robust. The ICBAM employs parallel concatenation and depthwise separable convolution (DSC). The former strengthens the interaction between features. The latter improves the efficiency of processing gait information. In the classification head, we choose to employ DSC instead of global average pooling. This method preserves the spatial information and learns the weights of different locations, which solves the problem that the corner points and center points in the feature map have the same weight. The classification accuracies for this paper's model on the self-constructed dataset and GAIT-IST dataset are 98.43% and 98.69%, which are 0.77% and 0.59% higher than that of the SOTA model, respectively. The experiments demonstrate that the method achieves good balance between lightweightness and performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Detection of Oil Spill in SAR Image Using an Improved DeepLabV3+.
- Author
-
Zhang, Jiahao, Yang, Pengju, and Ren, Xincheng
- Subjects
- *
OIL spills , *PROBLEM solving , *SPINE , *GENERALIZATION , *NOISE , *SYNTHETIC aperture radar - Abstract
Oil spill SAR images are characterized by high noise, low contrast, and irregular boundaries, which lead to the problems of overfitting and insufficient capturing of detailed features of the oil spill region in the current method when processing oil spill SAR images. An improved DeepLabV3+ model is proposed to address the above problems. First, the original backbone network Xception is replaced by the lightweight MobileNetV2, which significantly improves the generalization ability of the model while drastically reducing the number of model parameters and effectively addresses the overfitting problem. Further, the spatial and channel Squeeze and Excitation module (scSE) is introduced and the joint loss function of Bce + Dice is adopted to enhance the sensitivity of the model to the detailed parts of the oil spill area, which effectively solves the problem of insufficient capture of the detailed features of the oil spill area. The experimental results show that the mIOU and F1-score of the improved model in an oil spill region in the Gulf of Mexico reach 80.26% and 88.66%, respectively. In an oil spill region in the Persian Gulf, the mIOU and F1-score reach 81.34% and 89.62%, respectively, which are better than the metrics of the control model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
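The Bce + Dice joint loss mentioned above can be written compactly; an equal-weight sum is assumed here, as the paper's weighting is not given:

```python
import numpy as np

def bce_dice_loss(pred, target, eps=1e-7):
    pred = np.clip(pred, eps, 1 - eps)
    # Binary cross-entropy: per-pixel classification term.
    bce = -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    # Dice term: overlap-based, robust to foreground/background imbalance.
    inter = np.sum(pred * target)
    dice = 1 - (2 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)
    return bce + dice

target = np.array([[0, 1], [1, 1]], dtype=float)   # toy oil-spill mask
good = np.array([[0.05, 0.90], [0.90, 0.95]])       # close prediction
bad  = np.array([[0.90, 0.10], [0.20, 0.10]])       # poor prediction
print(bce_dice_loss(good, target) < bce_dice_loss(bad, target))  # True
```

BCE drives per-pixel calibration while Dice rewards region overlap, which is why the combination helps on thin, irregular spill boundaries where foreground pixels are scarce.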
34. A Method for Surveying Road Pavement Distress Based on Front-View Image Data Using a Lightweight Segmentation Approach.
- Author
-
Yang, Yuanji, Wang, Hui, Kang, Junyang, and Xu, Zhoucong
- Subjects
- *
PAVEMENT management , *PAVEMENTS , *ROAD markings , *INFRASTRUCTURE (Economics) , *MANUAL labor , *SPINE , *VIDEO recording - Abstract
The utilization of low-cost video data is becoming more prevalent in pavement surveys to meet the increasing demand for timely distress detection and repair. Semantic segmentation algorithms can effectively segment pavement features and distresses simultaneously. Previous studies on pavement distress segmentation have primarily focused on cracks, and most multiobjective segmentation algorithms are neither accurate nor efficient. This paper presents a new method for pavement segmentation using a lightweight network segmentation model that employs DeepLabV3+ with MobileNetV2 as the backbone and a convolutional block attention module to extract effective information in the encoder. The authors constructed a data set called ChongQing University Pavement management (CQUPM), which includes five pavement features and six types of distress. Based on the CQUPM data set and a publicly available data set, RTK, the proposed model demonstrates superior accuracy and lower complexity compared to DeepLabv3+, U-Net, and Segformer-b3. Its lightweight nature is particularly noteworthy, with a parameter size only about 1/10 to 1/4 that of other models on the same data set. The case analysis highlights the exceptional performance of the proposed model, especially in scenarios where multiple types of pavement distress overlap. Furthermore, the model excels in edge segmentation and shows good generalization performance, indicating strong potential for practical applications. Practical Applications: Maintenance management organizations at the grassroots level, in certain regions or serving specific projects, often face significant daily workloads. Routine survey work is primarily reliant on manual labor due to the high acquisition and operating costs of detection equipment. The segmentation model, trained on a small data set constructed from front-view images, can complete the survey of 2–3 lanes at a time.
This model enables the detection of pavement type, pavement marking, and distress information. The model's excellent generalization capabilities and the small data set lower the technical threshold of the application. This approach can be applied to other transportation infrastructures to address similar management problems. By using low-cost video recording devices to capture video data and quickly construct small data sets, training, and applications based on semantic segmentation techniques, problems can be identified in a timely manner without relying on human labor. This method has strong potential for replication. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. IoT-MFaceNet: Internet-of-Things-Based Face Recognition Using MobileNetV2 and FaceNet Deep-Learning Implementations on a Raspberry Pi-400.
- Author
-
Mohammad, Ahmad Saeed, Jarullah, Thoalfeqar G., Al-Kaltakchi, Musab T. S., Alshehabi Al-Ani, Jabir, and Dey, Somdip
- Subjects
COMPUTER engineering ,DATABASES ,RASPBERRY Pi ,SUPPORT vector machines ,SYSTEM identification - Abstract
IoT applications revolutionize industries by enhancing operations, enabling data-driven decisions, and fostering innovation. This study explores the growing potential of IoT-based facial recognition for mobile devices, a technology rapidly advancing within the interconnected IoT landscape. The investigation proposes a framework called IoT-MFaceNet (Internet-of-Things-based face recognition using MobileNetV2 and FaceNet deep-learning) utilizing pre-existing deep-learning methods, employing the MobileNetV2 and FaceNet algorithms on both ImageNet and FaceNet databases. Additionally, an in-house database is compiled, capturing data from 50 individuals via a web camera and 10 subjects through a smartphone camera. Pre-processing of the in-house database involves face detection using OpenCV's Haar Cascade, Dlib's CNN Face Detector, and Mediapipe's Face. The resulting system demonstrates high accuracy in real-time and operates efficiently on low-powered devices like the Raspberry Pi 400. The evaluation involves the use of the multilayer perceptron (MLP) and support vector machine (SVM) classifiers. The system primarily functions as a closed set identification system within a computer engineering department at the College of Engineering, Mustansiriyah University, Iraq, allowing access exclusively to department staff for the department rapporteur room. The proposed system undergoes successful testing, achieving a maximum accuracy rate of 99.976%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
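FaceNet-style identification as used above reduces to comparing embedding distances against a tuned threshold. The embedding values and the threshold below are illustrative, not the system's actual parameters:

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

# A pair is accepted as the same identity when the L2 distance between
# unit-normalized embeddings falls under the threshold (value illustrative).
def same_person(emb_a, emb_b, threshold=1.0):
    return bool(np.linalg.norm(normalize(emb_a) - normalize(emb_b)) < threshold)

enrolled = np.array([0.90, 0.10, 0.20])    # hypothetical staff embedding
probe_ok = np.array([0.85, 0.15, 0.25])    # same person, slight variation
probe_no = np.array([0.10, 0.90, -0.30])   # different person
print(same_person(enrolled, probe_ok), same_person(enrolled, probe_no))
```

Closed-set identification, as in the department-access scenario, then amounts to finding the enrolled embedding with the smallest distance to the probe and checking it against the same threshold.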
36. An Innovative Solution to the Challenges Visually Impaired Individuals Face in Daily Life: Design and Development of a Deep Learning-Based Smart Assistant
- Author
-
Yalçınkaya, Mehmet Ali, Işık, Murat, Kaşçıoğlu, Elanur, and Kaya, Hatice Nur
- Abstract
Copyright of Dicle University Journal of Engineering / Dicle Üniversitesi Mühendislik Dergisi is the property of Dicle Universitesi and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
37. The Study of the Effectiveness and Efficiency of Multiple DCNN Models for Breast Cancer Diagnosis Using a Small Mammography Dataset.
- Author
-
Laaffat, Nourane, Outfarouin, Ahmad, Bouarifi, Walid, and Jraifi, Abdelilah
- Subjects
CONVOLUTIONAL neural networks ,CANCER diagnosis ,DEEP learning ,MAMMOGRAMS ,BREAST cancer - Abstract
Breast cancer (BC), the most prevalent cancer worldwide, poses a significant threat to women's health, often resulting in mortality. Early intervention is crucial for reducing mortality rates and improving recovery. Mammography plays a pivotal role in early detection through high-resolution imaging. Various classification techniques, including classical and deep learning (DL) methods, assist in diagnosing BC. Convolutional neural networks (CNN)-based classification with transfer learning enhances efficiency and accuracy, especially with limited datasets. This study evaluates the performance of different pretrained deep CNN architectures in classifying pathological mammography scans from the Mini-MIAS dataset. The results show that Xception, VGG16, VGG19, and MobileNetV2 achieve the highest accuracy (97%), with VGG19 demonstrating the fastest prediction speed (0.53 s). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. A lightweight semantic segmentation based on attention mechanism.
- Author
-
MA Dong-mei, WANG Peng-yu, and GUO Zhi-hao
- Abstract
Abstract: Semantic segmentation is a computer vision technique that extracts the information of interest from a large number of images and then transforms it into a clearer, easier-to-understand representation by means of a mask. Minimizing model size while preserving accuracy is currently a hot topic in designing lightweight network models. Image semantic segmentation still faces many challenges, such as segmentation discontinuity, incorrect segmentation, and high model complexity. To solve these problems, a lightweight semantic segmentation model based on an attention mechanism is proposed. It uses freeze-thaw training, and the feature extraction network is MobileNetV2. To recover clearer target boundaries, a lightweight convolutional attention (CBAM) module is introduced in the output part of the atrous spatial pyramid pooling (ASPP), or channel attention (ECA-Net) in the decoding part. To address sample imbalance, the focal_loss loss function is introduced. Mixed precision is used, and the standard convolution in the output section is replaced with DO-Conv convolution. Experiments and validations are conducted on the PASCAL VOC2012 and Cityscapes datasets. The model size is 23.6 MB, with mean intersection over union (mIoU) scores of 73.91% and 74.89%, and class-wise pixel accuracy scores of 82.88% and 84.87%, respectively. This successfully achieves a balance between accurate segmentation and computational efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
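The focal_loss function cited above for sample imbalance down-weights easy examples so that rare-class pixels dominate the gradient. A sketch of the standard binary form with illustrative inputs:

```python
import numpy as np

# Focal loss: scales BCE by (1 - p_t)^gamma, shrinking the contribution of
# confidently correct predictions; alpha re-weights the positive class.
def focal_loss(pred, target, gamma=2.0, alpha=0.25, eps=1e-7):
    pred = np.clip(pred, eps, 1 - eps)
    p_t = np.where(target == 1, pred, 1 - pred)   # prob of the true class
    a_t = np.where(target == 1, alpha, 1 - alpha)
    return np.mean(-a_t * (1 - p_t) ** gamma * np.log(p_t))

target = np.array([1.0, 0.0, 1.0])
easy = np.array([0.95, 0.05, 0.90])   # confident and correct -> tiny loss
hard = np.array([0.30, 0.70, 0.20])   # misclassified -> much larger loss
print(focal_loss(easy, target) < focal_loss(hard, target))  # True
```

With gamma = 0 and alpha = 0.5 this reduces to (half of) plain BCE; increasing gamma is what shifts training effort onto the hard, minority-class pixels.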
39. Vehicle type recognition: a case study of MobileNetV2 for an image Classification task.
- Author
-
Kobiela, Dariusz, Groth, Jan, Hajdasz, Michał, and Erezman, Mateusz
- Subjects
IMAGE recognition (Computer vision) ,DATA augmentation ,VEHICLE models ,ACQUISITION of data ,ELECTRONIC data processing - Abstract
The goal of the research was to demonstrate the full data science lifecycle through a use case of the MobileNetV2 model for a vehicle image classification task, using various validation and test sets of differing difficulty. Several model variations were employed, each designed to recognize images of ground vehicles and classify them into one of five classes: car, truck, motorcycle, bicycle, or bus. In terms of validation accuracy, the highest results were obtained by the model trained with uniformly designed train and val sets (with data normalization and augmentation), where the train set also contained the validation set. This model also obtained the highest accuracy on both test sets. The superiority of MODEL 3 BASELINE is confirmed by the other metrics as well: test loss, F1-score, AUC, and the confusion matrices for both test sets. Results for MODEL 1 BASELINE and MODEL 2 BASELINE differed between test sets 1 and 2 and across the other metrics, so it was not possible to declare one method of dataset preparation superior to the other (original class distribution [no data normalization and no data augmentation] versus uniformly designed [with data normalization and augmentation]). The article also presents challenges and findings: the problems, key issues, and solutions that arose during data collection and tagging, as well as during the preparation and evaluation of the model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Enhancing Weather Scene Identification Using Vision Transformer.
- Author
-
Dewi, Christine, Arshed, Muhammad Asad, Christanto, Henoch Juli, Rehman, Hafiz Abdul, Muneer, Amgad, and Mumtaz, Shahzad
- Subjects
TRANSFORMER models ,COMPUTER vision ,WEATHER forecasting ,FEATURE extraction ,INTELLIGENT networks - Abstract
The accuracy of weather scene recognition is critical in a world where weather affects every aspect of our everyday lives, particularly in areas like intelligent transportation networks, autonomous vehicles, and outdoor vision systems. The importance of weather in many aspects of our life highlights the vital necessity for accurate information. Precise weather detection is especially crucial for industries like intelligent transportation, outside vision systems, and driverless cars. The outdated, unreliable, and time-consuming manual identification techniques are no longer adequate. Unmatched accuracy is required for local weather scene forecasting in real time. This work utilizes the capabilities of computer vision to address these important issues. Specifically, we employ the advanced Vision Transformer model to distinguish between 11 different weather scenarios. The development of this model results in a remarkable performance, achieving an accuracy rate of 93.54%, surpassing industry standards such as MobileNetV2 and VGG19. These findings advance computer vision techniques into new domains and pave the way for reliable weather scene recognition systems, promising extensive real-world applications across various industries. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
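The Vision Transformer used above differs from the CNN baselines (MobileNetV2, VGG19) by first cutting the image into fixed-size patches and flattening each into a token. A minimal patchify sketch; the 224-pixel input and 16-pixel patch size are the common ViT defaults, assumed here rather than taken from the paper:

```python
import numpy as np

# Split an (H, W, C) image into non-overlapping p x p patches and flatten
# each patch into one token vector of length p*p*C.
def patchify(img, p):
    h, w, c = img.shape
    patches = img.reshape(h // p, p, w // p, p, c).transpose(0, 2, 1, 3, 4)
    return patches.reshape(-1, p * p * c)

img = np.zeros((224, 224, 3))          # placeholder weather image
tokens = patchify(img, 16)
print(tokens.shape)  # (196, 768): 14 x 14 patches of 16*16*3 values
```

Self-attention then relates all 196 tokens to each other in every layer, giving the global receptive field that helps distinguish visually similar weather scenes.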
41. Lung tumor cell classification with lightweight mobileNetV2 and attention-based SCAM enhanced faster R-CNN.
- Author
-
Jenipher, V. Nisha and Radhika, S.
- Abstract
Early and precise detection of lung tumor cells is paramount for providing adequate medication and increasing patient survival. To achieve this, the Enhanced Faster R-CNN with MobileNetV2 and SCAM framework is presented for improving the diagnostic accuracy of lung tumor cell classification. A U-Net architecture optimized by Stochastic Gradient Descent (SGD) is employed to carry out clinical image segmentation. The approach leverages the lightweight MobileNetV2 backbone network and an attention mechanism, the Spatial and Channel Attention Module (SCAM), to improve feature extraction as well as the feature representation and localization of lung tumor cells. The MobileNetV2 backbone, owing to its lightweight design, derives valuable features from the input clinical images while reducing the complexity of the network architecture, and the SCAM module creates spatially and channel-wise informative features that enhance the representation and localization of lung tumor cell features, concentrating on important locations. To assess the method's efficacy, several high-performance lung tumor cell classification techniques (ECNN, Lung-Retina Net, CNN-SVM, CCDC-HNN, and MTL-MGAN) and several datasets (the Lung-PET-CT-Dx dataset, the LIDC-IDRI dataset, and the Chest CT-Scan images dataset) were used for experimental evaluation. In a comprehensive comparative analysis across metrics and methods, the proposed method achieves impressive performance, with an accuracy of 98.6%, specificity of 96.8%, sensitivity of 97.5%, and precision of 98.2%. Furthermore, the experimental outcomes reveal that the proposed method reduces network complexity and obtains improved diagnostic results with the available annotated data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Fast and Precise Coal-Rock Interface Recognition Based on an Improved PSPNet-MobileNetV2.
- Author
-
王海舰, 刘丽丽, 赵雪梅, and 张 强
- Abstract
Copyright of Journal of Vibration, Measurement & Diagnosis is the property of Nanjing Hangkong Daxue and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
43. A Lightweight DeepLabV3+ Method for Land-Cover Segmentation in Remote Sensing Images.
- Author
-
马 静, 郭中华, 马志强, 马小艳, and 李迦龙
- Subjects
REMOTE sensing, PROBLEM solving, ATTENTION - Abstract
Copyright of Chinese Journal of Liquid Crystal & Displays is the property of Chinese Journal of Liquid Crystal & Displays and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
44. An Identification Method for Mixed Coal Vitrinite Components Based on An Improved DeepLabv3+ Network.
- Author
-
Wang, Fujie, Li, Fanfan, Sun, Wei, Song, Xiaozhong, and Lu, Huishan
- Subjects
- *VITRINITE, *COAL, *PYRAMIDS, *PIXELS, *RECOGNITION (Psychology) - Abstract
To address the high complexity and low accuracy issues of traditional methods in mixed coal vitrinite identification, this paper proposes a method based on an improved DeepLabv3+ network. First, MobileNetV2 is used as the backbone network to reduce the number of parameters. Second, an atrous convolution layer with a dilation rate of 24 is added to the ASPP (atrous spatial pyramid pooling) module to further increase the receptive field. Meanwhile, a CBAM (convolutional block attention module) attention mechanism with a channel multiplier of 8 is introduced at the output part of the ASPP module to better filter out important semantic features. Then, a corrective convolution module is added to the network's output to ensure the consistency of each channel's output feature map for each type of vitrinite. Finally, images of 14 single vitrinite components are used as training samples for network training, and a validation set is used for identification testing. The results show that the improved DeepLabv3+ achieves 6.14% and 3.68% improvements in MIOU (mean intersection over union) and MPA (mean pixel accuracy), respectively, compared to the original DeepLabv3+; 12% and 5.3% improvements compared to U-Net; 9.26% and 4.73% improvements compared to PSPNet with ResNet as the backbone; 5.4% and 9.34% improvements compared to PSPNet with MobileNetV2 as the backbone; and 6.46% and 9.05% improvements compared to HRNet. Additionally, the improved ASPP module increases MIOU and MPA by 3.23% and 1.93%, respectively, compared to the original module. The CBAM attention mechanism with a channel multiplier of 8 improves MIOU and MPA by 1.97% and 1.72%, respectively, compared to the original channel multiplier of 16. The data indicate that the proposed identification method significantly improves recognition accuracy and can be effectively applied to mixed coal vitrinite identification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. AI-Based Recognition of Fruit and Vegetable Spoilage: Towards Household Food Waste Reduction.
- Author
-
Sofian, Madeline Andrea, Putri, Abygael Adrianty, Edbert, Ivan Sebastian, and Aulia, Alvina
- Subjects
SHELF-life dating of food, FOOD waste, WASTE minimization, SUSTAINABLE consumption, ARTIFICIAL intelligence - Abstract
Food waste is a critical global issue, with approximately one-third of food produced being wasted annually. Indonesia, which contributes around 20.93 million tonnes to this waste, faces several economic and environmental impacts. The primary causes of food waste in Indonesian households include a lack of knowledge about food storage and expiration dates, along with the unreliability of traditional freshness assessment methods. This research evaluates three pre-trained AI models – MobileNetV2, VGG19, and EfficientNetV2S – using the 'Fresh and Rotten Classification' dataset from Kaggle, which contains 30.4k images of fresh and rotten produce, with the aim of integrating these models into AI-based tools that recognize spoilage in produce and thereby reduce household food waste. EfficientNetV2S emerged as the most effective model, achieving 97.61% accuracy and a 97.59% F1-score, indicating robust performance and suitability for real-time household applications. Despite the promising results, challenges such as dataset quality and class imbalance were noted. Future research should focus on improving model performance through hyperparameter tuning, transfer learning, and the development of comprehensive datasets. Integrating these models into user-friendly applications could significantly contribute to reducing food waste and promoting sustainable consumption habits in Indonesia. [ABSTRACT FROM AUTHOR]
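Evaluating pre-trained models of this kind typically means freezing the backbone and training only a new classification head on its embeddings. A toy NumPy sketch of that second stage (the 1280-dimensional vectors stand in for frozen MobileNetV2 penultimate-layer features; the data and labels are synthetic, not the Kaggle dataset):

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-ins for frozen backbone outputs: MobileNetV2's penultimate layer is 1280-d
feats = rng.normal(size=(200, 1280))
labels = (feats[:, 0] > 0).astype(float)   # synthetic fresh(0)/rotten(1) labels

# train only a logistic-regression head on the frozen features
w, b = np.zeros(1280), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(feats @ w + b)))     # predicted P(rotten)
    w -= 0.1 * feats.T @ (p - labels) / len(labels)
    b -= 0.1 * (p - labels).mean()

acc = (((feats @ w + b) > 0) == labels.astype(bool)).mean()
```

Only the small head is trained, which is why transfer learning remains cheap even when the backbone has millions of parameters.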
- Published
- 2024
- Full Text
- View/download PDF
46. OpenCV Based Customer Screening System for Prevention of COVID-19 Transmission in Retail Stores.
- Author
-
Shah, Jai Jayesh, Ragu, Harini, David, Valerie, Sasikumar, P., and Subburaj, Maheswari
- Subjects
MEDICAL screening, RETAIL stores, CONSUMERS, VACCINATION status, PUBLIC spaces, INTERNET servers - Abstract
The COVID-19 pandemic has taken the world by storm for over three years. With new variants and developments constantly coming to light, it has become imperative for citizens to get vaccinated and maintain social distancing in public places, such as retail stores, in order to curb the spread of the virus. Statistics suggest that being fully vaccinated is, on average, 81% effective against the COVID-19 virus. This project develops a customer screening system that dynamically counts the number of customers inside a store at any point in time and allows conditional entry only if the customer is fully vaccinated and wearing a face mask. The system is built with OpenCV and MobileNetV2 for detection of facial characteristics, and achieves 98% accuracy for face mask detection. Further, a QR scanner is employed in conjunction with the Indian government-authorized CoWIN portal for verification of vaccination details, and Flask provides the web server for deployment. Additionally, key demographic information about customers and entry logs is automatically populated into an Excel-based database for predictive crowd analysis. In doing so, the manual labour required for customer screening is reduced to zero while the health and safety of individuals is promoted, consequently curbing the spread of the virus. [ABSTRACT FROM AUTHOR]
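The conditional-entry logic described (a live occupancy count plus vaccination and mask checks) reduces to a small state machine. A toy sketch, with the mask-detection and CoWIN QR-verification steps abstracted into boolean inputs; the class and method names are invented for illustration, not from the project:

```python
class StoreScreener:
    """Toy entry-control logic: admit a customer only if vaccinated,
    masked, and the store is under its capacity limit."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.inside = 0          # current customer count

    def try_enter(self, vaccinated, masked):
        # booleans stand in for the QR-verification and mask-detection results
        if self.inside < self.capacity and vaccinated and masked:
            self.inside += 1
            return True
        return False

    def leave(self):
        self.inside = max(0, self.inside - 1)
```

In the real system the two booleans would come from the MobileNetV2 mask detector and the CoWIN QR check, with each decision also appended to the entry log.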
- Published
- 2024
- Full Text
- View/download PDF
47. A lightweight hybrid model for the automatic recognition of uterine fibroid ultrasound images based on deep learning.
- Author
-
Cai, Peiya, Yang, Tiantian, Xie, Qinglai, Liu, Peizhong, and Li, Ping
- Abstract
Purpose: Uterine fibroids (UF) are the most frequent tumors in women and can lead to serious complications, such as miscarriage. Diagnostic accuracy may also be affected by physician inexperience and fatigue, underscoring the need for automatic classification models that can analyze UF from a large number of images. Methods: A hybrid model is proposed that combines the MobileNetV2 network and a deep convolutional generative adversarial network (DCGAN) to assist medical practitioners in identifying UF and evaluating its characteristics. Real-time automatic classification of UF can aid diagnosis and minimize subjective errors. DCGAN-based data augmentation is used to create high-quality UF images, which are labeled into UF and non-uterine-fibroid (NUF) classes; the MobileNetV2 model then classifies the images. Results: The performance of the hybrid model is compared with that of other models. The hybrid model achieves a real-time classification speed of 40 frames per second (FPS), an accuracy of 97.45%, and an F1 score of 0.9741. Conclusion: This deep learning hybrid approach addresses the shortcomings of current uterine fibroid classification methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. An Improved Lightweight YOLOv5s-Based Method for Detecting Electric Bicycles in Elevators.
- Author
-
Zhang, Ziyuan, Yang, Xianyu, and Wu, Chengyu
- Subjects
ELEVATORS, ELECTRIC bicycles, ELECTRIC charge, RASPBERRY Pi, COMPUTATIONAL complexity, ELECTRIC vehicles - Abstract
The increase in fire accidents caused by indoor charging of electric bicycles (EBs) has raised public concern. Monitoring EBs in elevators is challenging, and current YOLOv5-based object detection methods struggle with computational load and detection rate. To address this, the paper presents an improved lightweight method based on YOLOv5s to detect EBs in elevators. The method introduces the MobileNetV2 module to make the model lightweight, and improves detection precision by adding the CBAM attention mechanism and the Bidirectional Feature Pyramid Network (BiFPN) to the YOLOv5s neck network. To verify that the model can be deployed at the elevator edge, it is deployed on a Raspberry Pi 4B embedded development board connected to a buzzer for application verification. The experimental results demonstrate that the model's parameters are reduced by 58.4%, its computational complexity is reduced by 50.6%, its detection precision reaches 95.9%, and real-time detection of electric bicycles in elevators is achieved. [ABSTRACT FROM AUTHOR]
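The kind of parameter saving reported here comes largely from replacing standard convolutions with MobileNetV2-style depthwise-separable ones. A quick generic count illustrates the effect (the channel sizes are arbitrary examples, not the paper's figures):

```python
def conv_params(c_in, c_out, k):
    # weights of a standard k x k convolution (bias ignored)
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    # depthwise k x k filter per input channel + 1 x 1 pointwise mixing
    return c_in * k * k + c_in * c_out

standard = conv_params(128, 128, 3)                # 147456
separable = depthwise_separable_params(128, 128, 3)  # 17536
print(standard, separable, round(standard / separable, 1))
```

For a 128-channel 3 x 3 layer the separable form needs roughly 8x fewer weights, which is why such substitutions shrink both parameter count and computational complexity.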
- Published
- 2024
- Full Text
- View/download PDF
49. A Combined MobileNetV2 and CBAM Model to Improve Classifying the Breast Cancer Ultrasound Images
- Author
-
Muhammad Rakha, Mahmud Dwi Sulistiyo, Dewi Nasien, and Muhammad Ridha
- Subjects
MobileNetV2, CBAM, Image Classification, Breast Cancer, Ultrasound, Engineering (General). Civil engineering (General), TA1-2040, Technology (General), T1-995 - Abstract
Breast cancer is the main cause of death in women throughout the world. Early detection using ultrasound is essential to reduce breast cancer cases. However, ultrasound analysis demands considerable time and medical personnel because classification is difficult due to noise, complex texture, and subjective assessment. Previous studies succeeded in ultrasound classification of breast cancer but required heavy computation and complex models. This research aims to overcome these shortcomings with a lighter yet more accurate model. We integrated the CBAM attention module into the MobileNetV2 model to improve breast cancer detection accuracy, speed up diagnosis, and reduce computational requirements. Gradient-Weighted Class Activation Mapping (Grad-CAM) is used to improve classification explanations. Ultrasound images from two databases were combined to train, validate, and test the model. The test results show that MobileNetV2-CBAM achieves a test accuracy of 93%, higher than the more complex models VGG-16 (80%), VGG-19 (82%), InceptionV3 (80%), and ResNet-50 (84%). CBAM is shown to improve MobileNetV2 performance with an 11% increase in accuracy. Grad-CAM visualization shows that MobileNetV2-CBAM focuses better on localizing important regions in breast cancer images, providing clearer explanations and assisting medical personnel in diagnosis.
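Grad-CAM, used here for explanations, weights each last-layer activation map by the spatially averaged gradient of the class score and applies a ReLU to keep only positively contributing regions. A minimal NumPy sketch of that computation (random arrays stand in for real activations and gradients):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM essentials: per-channel importance weights from the
    spatially averaged gradients, then a ReLU over the weighted sum."""
    # activations, gradients: (C, H, W) from the last convolutional layer
    weights = gradients.mean(axis=(1, 2))             # (C,) channel weights
    cam = np.tensordot(weights, activations, axes=1)  # (H, W) weighted sum
    return np.maximum(cam, 0)                         # keep positive evidence

A = np.random.randn(16, 7, 7)   # toy activation maps
G = np.random.randn(16, 7, 7)   # toy gradients of the class score w.r.t. A
heatmap = grad_cam(A, G)
```

The resulting low-resolution heatmap is upsampled to the input size and overlaid on the ultrasound image to show which regions drove the prediction.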
- Published
- 2024
- Full Text
- View/download PDF
50. A novel infrared thermography image analysis for transformer condition monitoring
- Author
-
Rupali Balabantaraya, Ashwin Kumar Sahoo, Prabodh Kumar Sahoo, Chayan Mondal Abir, and Manoj Kumar Panda
- Subjects
Infrared Images, KNN, DT, Deep learning, MobileNetV2, VGG-16, Electrical engineering. Electronics. Nuclear engineering, TK1-9971 - Abstract
Electrical systems are deeply ingrained in most industrial facilities, and their maintenance is increasingly becoming a critical component of economic policy. Condition monitoring of electrical transformers is essential for improving their dependability and availability, averting costly maintenance and further significant breakdowns. This research adopts an approach based on infrared thermography (IRT) to monitor electrical transformers and detect their defects. Thermal images of a transformer were captured at two distinct operating states with an infrared camera and compiled into a dataset for further analysis. The method combines IRT with feature analysis and machine learning to identify transformer issues in a new way. To find the best-performing machine learning model, different techniques are compared in terms of accuracy and stability. Two approaches are investigated for identifying features in thermography images. Approach 1 employed five common machine learning algorithms: Support Vector Machine (SVM), K-Nearest Neighbours (KNN), Decision Tree (DT), Logistic Regression (LR), and Least Squares Support Vector Machine (LS-SVM). Approach 2 utilized four deep learning techniques: MobileNetV2 (MNV2), InceptionV3 (InV3), DenseNet121 (DN121), and our proposed modified VGG-16. Among all evaluated methods, the modified VGG-16 architecture achieved the highest dependability, demonstrating exceptional efficiency and accuracy in transformer condition monitoring and fault diagnosis.
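Approach 1's classical pipeline amounts to extracting feature vectors from the thermal images and comparing classifiers on them. A toy KNN sketch on synthetic two-state features (the feature values are invented stand-ins for thermography features; only the distance-and-vote logic mirrors KNN):

```python
import numpy as np

def knn_predict(train_x, train_y, x, k=3):
    # classify x by majority vote among its k nearest training vectors
    d = np.linalg.norm(train_x - x, axis=1)
    votes = train_y[np.argsort(d)[:k]]
    return np.bincount(votes).argmax()

rng = np.random.default_rng(1)
normal = rng.normal(0.0, 1.0, size=(50, 8))  # stand-in features, healthy state
faulty = rng.normal(3.0, 1.0, size=(50, 8))  # hotter regions shift the features
X = np.vstack([normal, faulty])
y = np.array([0] * 50 + [1] * 50)            # 0 = normal, 1 = defective
```

Swapping `knn_predict` for an SVM, decision tree, or logistic-regression predictor on the same `X`, `y` split is the comparison loop the study describes for Approach 1.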
- Published
- 2024
- Full Text
- View/download PDF