Publication Year Range: This year / Search Limiters: 3 selected / Topic: deep learning - Searchworks@Jio Institute Digital Library Search Results

1. A deep learning-based approach for performance assessment and prediction: A case study of pulp and paper industries

Author: Jauhar, Sunil Kumar, Raj, Praveen Vijaya Raj Pushpa, Kamble, Sachin, Pratap, Saurabh, Gupta, Shivam, and Belhadi, Amine
Published: 2024
Full Text: View/download PDF

2. Web-based diagnostic platform for microorganism-induced deterioration on paper-based cultural relics with iterative training from human feedback

Author: Chenshu Liu, Songbin Ben, Chongwen Liu, Xianchao Li, Qingxia Meng, Yilin Hao, Qian Jiao, and Pinyi Yang
Subjects: Paper-based cultural relics, Conservation, Computer vision, Deep learning, Strain classification, Web application, Fine Arts, Analytical chemistry, QD71-142
Abstract: Purpose: Paper-based artifacts hold significant cultural and social values. However, paper is intrinsically fragile to microorganisms, such as mold, due to its cellulose composition, which can serve as a microorganisms’ nutrient source. Mold not only can damage papers’ structural integrity and pose significant challenges to conservation works but also may subject individuals attending the contaminated artifacts to health risks. Current approaches for strain identification usually require extensive training, prolonged time for analysis, expensive operation costs, and higher risks of secondary damage due to sampling. Thus, in current conservation practices with mold-contaminated artifacts, little pre-screening or strain identification was performed before mold removal, and the cleaning techniques are usually broad-spectrum rather than strain-specific. With deep learning showing promising applications across various domains, this study investigated the feasibility of using a convolutional neural network (CNN) for fast in-situ recognition and classification of mold on paper. Methods: Molds were first non-invasively sampled from ancient Xuan Paper-based Chinese books from the Qing and Ming dynasties. Strains were identified using molecular biology methods and the four most prevalent strains were inoculated on Xuan paper to create mockups for image collection. Microscopic images of the molds as well as their stains situated on paper were collected using a compound microscope and commercial microscope lens for cell phone cameras, which were then used for training CNN models with a transfer learning scheme to perform the classification of mold. To enable involvement and contribution from the research community, a web interface that actuates the process while providing interactive features for users to learn about the information of the classified strain was constructed.
Moreover, a feedback functionality in the web interface was embedded for catching potential classification errors, adding additional training images, or introducing new strains, all to refine the generalizability and robustness of the model. Results & Conclusion: In the study, we have constructed a suite of high-confidence classification CNN models for the diagnostic process for mold contamination in conservation. At the same time, a web interface was constructed that allows recurrently refining the model with human feedback through engaging the research community. Overall, the proposed framework opens new avenues for effective and timely identification of mold, thus enabling proactive and targeted mold remediation strategies in conservation.
Published: 2024
Full Text: View/download PDF

3. NSTU-BDTAKA: An open dataset for Bangladeshi paper currency detection and recognition

Author: Md. Jubayar Alam Rafi, Mohammad Rony, and Nazia Majadi
Subjects: Computer vision, Deep learning, Image analysis, Taka detection, Taka recognition, YOLOv5 model, Computer applications to medicine. Medical informatics, R858-859.7, Science (General), Q1-390
Abstract: One of the most popular and well-established forms of payment in use today is paper money. Handling paper money might be challenging for those with vision impairments. Assistive technology has been reinventing itself throughout time to better serve the elderly and disabled people. To detect paper currency and extract other useful information from them, image processing techniques and other advanced technologies, such as Artificial Intelligence, Deep Learning, etc., can be used. In this paper, we present a meticulously curated and comprehensive dataset named ‘NSTU-BDTAKA’ tailored for the simultaneous detection and recognition of a specific object of cultural significance - the Bangladeshi paper currency (in Bengali it is called ‘Taka’). This research aims to facilitate the development and evaluation of models for both taka detection and recognition tasks, offering a rich resource for researchers and practitioners alike. The dataset is divided into two distinct components: (i) taka detection, and (ii) taka recognition. The taka detection subset comprises 3,111 high-resolution images, each meticulously annotated with rectangular bounding boxes that encompass instances of the taka. These annotations serve as ground truth for training and validating object detection models, and we adopt the state-of-the-art YOLOv5 architecture for this purpose. In the taka recognition subset, the dataset has been extended to include a vast collection of 28,875 images, each showcasing various instances of the taka captured in diverse contexts and environments. The recognition dataset is designed to address the nuanced task of taka recognition providing researchers with a comprehensive set of images to train, validate, and test recognition models. This subset encompasses challenges such as variations in lighting, scale, orientation, and occlusion, further enhancing the robustness of developed recognition algorithms. 
The dataset NSTU-BDTAKA not only serves as a benchmark for taka detection and recognition but also fosters advancements in object detection and recognition methods that can be extrapolated to other cultural artifacts and objects. We envision that the dataset will catalyze research efforts in the field of computer vision, enabling the development of more accurate, robust, and efficient models for both detection and recognition tasks.
Published: 2024
Full Text: View/download PDF

4. Web-based diagnostic platform for microorganism-induced deterioration on paper-based cultural relics with iterative training from human feedback.

Author: Liu, Chenshu, Ben, Songbin, Liu, Chongwen, Li, Xianchao, Meng, Qingxia, Hao, Yilin, Jiao, Qian, and Yang, Pinyi
Subjects: *CONVOLUTIONAL neural networks, *COMMUNITY involvement, *DEEP learning, *CLASSIFICATION, *CAMERA phones, *RELICS, *AUTOMATIC classification, *SECURITY classification (Government documents)
Abstract: Purpose: Paper-based artifacts hold significant cultural and social values. However, paper is intrinsically fragile to microorganisms, such as mold, due to its cellulose composition, which can serve as a microorganisms' nutrient source. Mold not only can damage papers' structural integrity and pose significant challenges to conservation works but also may subject individuals attending the contaminated artifacts to health risks. Current approaches for strain identification usually require extensive training, prolonged time for analysis, expensive operation costs, and higher risks of secondary damage due to sampling. Thus, in current conservation practices with mold-contaminated artifacts, little pre-screening or strain identification was performed before mold removal, and the cleaning techniques are usually broad-spectrum rather than strain-specific. With deep learning showing promising applications across various domains, this study investigated the feasibility of using a convolutional neural network (CNN) for fast in-situ recognition and classification of mold on paper. Methods: Molds were first non-invasively sampled from ancient Xuan Paper-based Chinese books from the Qing and Ming dynasties. Strains were identified using molecular biology methods and the four most prevalent strains were inoculated on Xuan paper to create mockups for image collection. Microscopic images of the molds as well as their stains situated on paper were collected using a compound microscope and commercial microscope lens for cell phone cameras, which were then used for training CNN models with a transfer learning scheme to perform the classification of mold. To enable involvement and contribution from the research community, a web interface that actuates the process while providing interactive features for users to learn about the information of the classified strain was constructed. 
Moreover, a feedback functionality in the web interface was embedded for catching potential classification errors, adding additional training images, or introducing new strains, all to refine the generalizability and robustness of the model. Results & Conclusion: In the study, we have constructed a suite of high-confidence classification CNN models for the diagnostic process for mold contamination in conservation. At the same time, a web interface was constructed that allows recurrently refining the model with human feedback through engaging the research community. Overall, the proposed framework opens new avenues for effective and timely identification of mold, thus enabling proactive and targeted mold remediation strategies in conservation. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

5. Speech recognition for Kazakh language: a research paper.

Author: Kapyshev, Galym, Nurtas, Marat, and Altaibek, Aizhan
Subjects: SPEECH perception, AUTOMATIC speech recognition, NATURAL language processing, LANGUAGE research, DEEP learning, MARKOV processes
Abstract: In recent years, the research pertaining to speech recognition technology in the Kazakh language has gained significant importance. This is due to the increasing demand for natural language processing applications in the region where Kazakh is predominantly spoken. Thus, there exists an urgent requirement for precise and dependable speech recognition systems. The research study examines the application of sophisticated deep learning methodologies, such as Natural Language Processing (NLP) and Hidden Markov Model (HMM), in facilitating speech recognition for the Kazakh language. Additionally, the investigation delves into how various techniques, including data preprocessing, acoustic modeling, and language modeling, can aid in devising effective speech recognition systems. The article deliberates on the feasible uses of speech recognition technology in the geographic area where Kazakh language is spoken and outlines its future research prospects. The investigation underscores the significance of persistent inquiry in this realm to confront distinctive obstacles encountered in creating speech recognition systems for languages with restricted resources. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

6. Computer vision digitization of smartphone images of anesthesia paper health records from low-middle income countries.

Author: Folks, Ryan D., Naik, Bhiken I., Brown, Donald E., and Durieux, Marcel E.
Subjects: *MEDICAL records, *ARTIFICIAL neural networks, *COMPUTER vision, *DIASTOLIC blood pressure, *MEDICAL personnel, *DEEP learning, *SYSTOLIC blood pressure
Abstract: Background: In low-middle income countries, healthcare providers primarily use paper health records for capturing data. Paper health records are utilized predominately due to the prohibitive cost of acquisition and maintenance of automated data capture devices and electronic medical records. Data recorded on paper health records is not easily accessible in a digital format to healthcare providers. The lack of real time accessible digital data limits healthcare providers, researchers, and quality improvement champions to leverage data to improve patient outcomes. In this project, we demonstrate the novel use of computer vision software to digitize handwritten intraoperative data elements from smartphone photographs of paper anesthesia charts from the University Teaching Hospital of Kigali. We specifically report our approach to digitize checkbox data, symbol-denoted systolic and diastolic blood pressure, and physiological data. Methods: We implemented approaches for removing perspective distortions from smartphone photographs, removing shadows, and improving image readability through morphological operations. YOLOv8 models were used to deconstruct the anesthesia paper chart into specific data sections. Handwritten blood pressure symbols and physiological data were identified, and values were assigned using deep neural networks. Our work builds upon the contributions of previous research by improving upon their methods, updating the deep learning models to newer architectures, as well as consolidating them into a single piece of software. Results: The model for extracting the sections of the anesthesia paper chart achieved an average box precision of 0.99, an average box recall of 0.99, and an mAP0.5-95 of 0.97. Our software digitizes checkbox data with greater than 99% accuracy and digitizes blood pressure data with a mean average error of 1.0 and 1.36 mmHg for systolic and diastolic blood pressure respectively. 
Overall accuracy for physiological data which includes oxygen saturation, inspired oxygen concentration and end tidal carbon dioxide concentration was 85.2%. Conclusions: We demonstrate that under normal photography conditions we can digitize checkbox, blood pressure and physiological data to within human accuracy when provided legible handwriting. Our contributions provide improved access to digital data to healthcare practitioners in low-middle income countries. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

7. Deep Learning for 3D Reconstruction, Augmentation, and Registration: A Review Paper

Author: Prasoon Kumar Vinodkumar, Dogus Karabulut, Egils Avots, Cagri Ozcinar, and Gholamreza Anbarjafari
Subjects: deep learning, 3D reconstruction, 3D augmentation, 3D registration, point cloud, voxel, Science, Astrophysics, QB460-466, Physics, QC1-999
Abstract: The research groups in computer vision, graphics, and machine learning have dedicated a substantial amount of attention to the areas of 3D object reconstruction, augmentation, and registration. Deep learning is the predominant method used in artificial intelligence for addressing computer vision challenges. However, deep learning on three-dimensional data presents distinct obstacles and is now in its nascent phase. There have been significant advancements in deep learning specifically for three-dimensional data, offering a range of ways to address these issues. This study offers a comprehensive examination of the latest advancements in deep learning methodologies. We examine many benchmark models for the tasks of 3D object registration, augmentation, and reconstruction. We thoroughly analyse their architectures, advantages, and constraints. In summary, this report provides a comprehensive overview of recent advancements in three-dimensional deep learning and highlights unresolved research areas that will need to be addressed in the future.
Published: 2024
Full Text: View/download PDF

8. Deep Learning for Automated Visual Inspection in Manufacturing and Maintenance: A Survey of Open-Access Papers

Author: Nils Hütten, Miguel Alves Gomes, Florian Hölken, Karlo Andricevic, Richard Meyes, and Tobias Meisen
Subjects: automated visual inspection, industrial applications, deep learning, computer vision, convolutional neural network, vision transformer, Technology, Applied mathematics. Quantitative methods, T57-57.97
Abstract: Quality assessment in industrial applications is often carried out through visual inspection, usually performed or supported by human domain experts. However, the manual visual inspection of processes and products is error-prone and expensive. It is therefore not surprising that the automation of visual inspection in manufacturing and maintenance is heavily researched and discussed. The use of artificial intelligence as an approach to visual inspection in industrial applications has been considered for decades. Recent successes, driven by advances in deep learning, present a possible paradigm shift and have the potential to facilitate automated visual inspection, even under complex environmental conditions. For this reason, we explore the question of to what extent deep learning is already being used in the field of automated visual inspection and which potential improvements to the state of the art could be realized utilizing concepts from academic research. By conducting an extensive review of the openly accessible literature, we provide an overview of proposed and in-use deep-learning models presented in recent years. Our survey consists of 196 open-access publications, of which 31.7% are manufacturing use cases and 68.3% are maintenance use cases. Furthermore, the survey also shows that the majority of the models currently in use are based on convolutional neural networks, the current de facto standard for image classification, object recognition, or object segmentation tasks. Nevertheless, we see the emergence of vision transformer models that seem to outperform convolutional neural networks but require more resources, which also opens up new research opportunities for the future. Another finding is that in 97% of the publications, the authors use supervised learning techniques to train their models. 
However, with the median dataset size consisting of 2500 samples, deep-learning models cannot be trained from scratch, so it would be beneficial to use other training paradigms, such as self-supervised learning. In addition, we identified a gap of approximately three years between approaches from deep-learning-based computer vision being published and their introduction in industrial visual inspection applications. Based on our findings, we additionally discuss potential future developments in the area of automated visual inspection.
Published: 2024
Full Text: View/download PDF

9. Special Issue "Emerging AI+X-Based Sensor and Networking Technologies including Selected Papers from ICGHIT 2022–2023".

Author: Kim, Byung-Seo, Afzal, Muhammad Khalil, and Ullah, Rehmat
Subjects: *MULTICASTING (Computer networks), *INFORMATION technology, *SENSOR networks, *ARTIFICIAL neural networks, *DEEP learning, *BEAM steering, *INTEGRATED circuit design, *COMPUTER network security
Abstract: This document is a summary of a special issue of the journal Sensors, titled "Emerging AI+X-Based Sensor and Networking Technologies including Selected Papers from ICGHIT 2022–2023." The special issue features selected papers from the 10th and 11th International Conferences on Green and Human Information Technology (ICGHITs), which were held in Korea and Thailand. The conferences focused on the theme of "Emerging Artificial Intelligent (AI)+X technology" and "Hyper Automation + Human AI" respectively. The selected papers cover various topics such as network security, routing protocols, signal detection, and clustering mechanisms, all incorporating AI-based methods. The issue also includes papers on topics like secure authentication, distance estimation in RFID systems, energy optimization in smart homes, blockchain technology, and radar signal detection. The authors emphasize the importance of both technology and humanity in advancing green and information technologies. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

10. Deep Learning for 3D Reconstruction, Augmentation, and Registration: A Review Paper.

Author: Vinodkumar, Prasoon Kumar, Karabulut, Dogus, Avots, Egils, Ozcinar, Cagri, and Anbarjafari, Gholamreza
Subjects: *DEEP learning, *COMPUTER vision, *GRAPH neural networks, *ARTIFICIAL intelligence, *MACHINE learning, *GENERATIVE adversarial networks
Abstract: The research groups in computer vision, graphics, and machine learning have dedicated a substantial amount of attention to the areas of 3D object reconstruction, augmentation, and registration. Deep learning is the predominant method used in artificial intelligence for addressing computer vision challenges. However, deep learning on three-dimensional data presents distinct obstacles and is now in its nascent phase. There have been significant advancements in deep learning specifically for three-dimensional data, offering a range of ways to address these issues. This study offers a comprehensive examination of the latest advancements in deep learning methodologies. We examine many benchmark models for the tasks of 3D object registration, augmentation, and reconstruction. We thoroughly analyse their architectures, advantages, and constraints. In summary, this report provides a comprehensive overview of recent advancements in three-dimensional deep learning and highlights unresolved research areas that will need to be addressed in the future. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

11. SMR–YOLO: Multi-Scale Detection of Concealed Suspicious Objects in Terahertz Images.

Author: Zhang, Yuan, Chen, Hao, Ge, Zihao, Jiang, Yuying, Ge, Hongyi, Zhao, Yang, and Xiong, Haotian
Subjects: OBJECT recognition (Computer vision), PUBLIC spaces, WRAPPING materials, KRAFT paper, DETECTION alarms
Abstract: The detection of concealed suspicious objects in public places is a critical issue and a popular research topic. Terahertz (THz) imaging technology, as an emerging detection method, can penetrate materials without emitting ionizing radiation, providing a new approach to detecting concealed suspicious objects. This study focuses on the detection of concealed suspicious objects wrapped in different materials such as polyethylene and kraft paper, including items like scissors, pistols, and blades, using THz imaging technology. To address issues such as the lack of texture details in THz images and the contour similarity of different objects, which can lead to missed detections and false alarms, we propose a THz concealed suspicious object detection model based on SMR–YOLO (SPD_Mobile + RFB + YOLO). This model, based on the MobileNext network, introduces the spatial-to-depth convolution (SPD-Conv) module to replace the backbone network, reducing computational and parameter load. The inclusion of the receptive field block (RFB) module, which uses a multi-branch structure of dilated convolutions, enhances the network's depth features. Using the EIOU loss function to assess the accuracy of predicted box localization further optimizes convergence speed and localization accuracy. Experimental results show that the improved model achieved mAP@0.5 and mAP@0.5:0.95 scores of 98.9% and 89.4%, respectively, representing improvements of 0.2% and 1.8% over the baseline model. Additionally, the detection speed reached 108.7 FPS, an improvement of 23.2 FPS over the baseline model. The model effectively identifies concealed suspicious objects within packages, offering a novel approach for detection in public places. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

12. Special issue on intelligent systems: ISMIS 2022 selected papers.

Author: Ceci, Michelangelo, Flesca, Sergio, Manco, Giuseppe, and Masciari, Elio
Subjects: MACHINE learning, ARTIFICIAL intelligence, DECISION support systems, KNOWLEDGE representation (Information theory), COMPUTER vision, DEEP learning
Abstract: This document is a special issue of the Journal of Intelligent Information Systems, focusing on the selected papers from the International Symposium on Methodologies for Intelligent Systems (ISMIS 2022). The symposium, held in Cosenza, Italy, showcased research on various topics related to artificial intelligence, including decision support, knowledge representation, machine learning, computer vision, and more. The special issue includes eleven papers that have undergone rigorous peer-reviewing and cover a wide range of research topics, such as deep learning, anomaly detection, malware detection, sentiment classification, and healthcare professionals' burnout. The authors express their gratitude to the contributors and reviewers for their valuable contributions. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

13. Short-term train arrival delay prediction: a data-driven approach

Author: Fu, Qingyun, Ding, Shuxin, Zhang, Tao, Wang, Rongsheng, Hu, Ping, and Pu, Cunlai
Published: 2024
Full Text: View/download PDF

14. SiLK-SLAM: accurate, robust and versatile visual SLAM with simple learned keypoints

Author: Yao, Jianjun and Li, Yingzhao
Published: 2024
Full Text: View/download PDF

15. Research Trends in Artificial Intelligence and Security—Bibliometric Analysis.

Author: Ilić, Luka, Šijan, Aleksandar, Predić, Bratislav, Viduka, Dejan, and Karabašević, Darjan
Subjects: DEEP learning, BIBLIOMETRICS, ARTIFICIAL intelligence, WEB analytics, MACHINE learning, PUBLIC health infrastructure
Abstract: This paper provides a bibliometric analysis of current research trends in the field of artificial intelligence (AI), focusing on key topics such as deep learning, machine learning, and security in AI. Through the lens of bibliometric analysis, we explore publications published from 2020 to 2024, using primary data from the Clarivate Analytics Web of Science Core Collection. The analysis includes the distribution of studies by year, the number of studies and citation rankings in journals, and the identification of leading countries, institutions, and authors in the field of AI research. Additionally, we investigate the distribution of studies by Web of Science categories, authors, affiliations, publication years, countries/regions, publishers, research areas, and citations per year. Key findings indicate a continued growth of interest in topics such as deep learning, machine learning, and security in AI over the past few years. We also identify leading countries and institutions active in researching this area. Awareness of data security is essential for the responsible application of AI technologies. Robust security frameworks are important to mitigate risks associated with AI integration into critical infrastructure such as healthcare and finance. Ensuring the integrity and confidentiality of data managed by AI systems is not only a technical challenge but also a societal necessity, demanding interdisciplinary collaboration and policy development. This analysis provides a deeper understanding of the current state of research in the field of AI and identifies key areas for further research and innovation. Furthermore, these findings may be valuable to practitioners and decision-makers seeking to understand current trends and innovations in AI to enhance their business processes and practices. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

16. Blockchain-based deep learning in IoT, healthcare and cryptocurrency price prediction: a comprehensive review

Author: Arora, Shefali, Mittal, Ruchi, Shrivastava, Avinash K., and Bali, Shivani
Published: 2024
Full Text: View/download PDF

17. Research and design of an expert diagnosis system for rail vehicle driven by data mechanism models

Author: Li, Lin, Wang, Jiushan, and Xiao, Shilu
Published: 2024
Full Text: View/download PDF

18. Fault detection system for paper cup machine based on real-time image processing.

Author: Aydın, Alaaddin and Güney, Selda
Subjects: *PROGRAMMABLE controllers, *OBJECT recognition (Computer vision), *SERVOMECHANISMS, *ARTIFICIAL intelligence, *DEEP learning, *DIGITAL image processing, *PRODUCT image, *IMAGE processing
Abstract: In the production of paper cups in industrial factories, it is tried to print high quality cups with less waste loss with the help of sensors and heating resistances mounted on the paper cup machine. In this study, a system that detects faulty products based on image processing and removes it by controlling the machine with servo motors, asynchronous motors and programmable logic controller (PLC) is designed. For fault product detection, classification has been performed using real-time Haarcascade algorithm and You Only Look Once (YOLO) algorithm which is a deep learning methods, and real-time object detection has been carried out using the OpenCv library. With this study, an effective faulty product detection and removing hardware system was realized by adapting artificial intelligence algorithms to a machine used in industry. Based on the results, a whole system can be applied to systems that involve removing a faulty product from a band in any production, packaging etc. facility is proposed. A hardware consisting of servo motors, asynchronous motors and PLC was designed to separate faulty cups from the existing paper cup production machine in this study. Then, a data set composed of 1068 images was created with images taken from the camera for faulty and faultless paper cups. Using this dataset, the effect of different deep learning methods on performance in the real-time system has been examined and successful results have been obtained. The optimal outcome was achieved, yielding a real-time application accuracy rate of 90.8% through the utilization of the Yolov5x architecture. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

19. Joint torque prediction of industrial robots based on PSO-LSTM deep learning

Author: Xiao, Wei, Fu, Zhongtao, Wang, Shixian, and Chen, Xubing
Published: 2024
Full Text: View/download PDF

20. Unveiling just-in-time decision support system using social media analytics: a case study on reverse logistics resource recycling

Author: Shahidzadeh, Mohammad Hossein and Shokouhyar, Sajjad
Published: 2024
Full Text: View/download PDF

21. Differential effects of visual complexity in firm-generated content on consumer engagements: a deep learning approach

Author: Wang, Feng, Yue, Mingyue, Yuan, Quan, and Cao, Rong
Published: 2024
Full Text: View/download PDF

22. A single-frame infrared small target detection method based on joint feature guidance.

Author: Xu, Xiaoyu, Zhan, Weida, Jiang, Yichun, Zhu, Depeng, Chen, Yu, Guo, Jinxin, Li, Jin, and Liu, Yanyan
Subjects: FEATURE extraction, DEEP learning, PROBLEM solving
Abstract: Single-frame infrared small target detection is affected by the low image resolution and small target size, and is prone to the problems of small target feature loss and positional offset during continuous downsampling; at the same time, the sparse features of the small targets do not correlate well with the global-local linkage of the background features. To solve the above problems, this paper proposes an efficient infrared small target detection method. First, this paper incorporates BlurPool in the feature extraction part, which reduces the loss and positional offset of small target features in the process of convolution and pooling. Second, this paper designs an interactive attention deep feature fusion module, which acquires the correlation information between the target and the background from a global perspective, and designs a compression mechanism based on deep a priori knowledge, which reduces the computational difficulty of the self-attention mechanism. Then, this paper designs the context local feature enhancement and fusion module, which uses deep semantic features to dynamically guide shallow local features to realize enhancement and fusion. Finally, this paper proposes an edge feature extraction module for shallow features, which utilizes the complete texture and location information in the shallow features to assist the network to initially locate the target position and edge shape. Numerous experiments show that the method in this paper significantly improves nIoU, F1-Measure and AUC on IRSTD-1k Datasets and NUAA-SIRST Datasets. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

23. A Meta-Survey on Intelligent Energy-Efficient Buildings.

Author: Islam, Md Babul, Guerrieri, Antonio, Gravina, Raffaele, and Fortino, Giancarlo
Subjects: MACHINE learning, REINFORCEMENT learning, SMART cities, DEEP learning, INDUSTRIAL ecology, INTELLIGENT buildings
Abstract: The rise of the Internet of Things (IoT) has enabled the development of smart cities, intelligent buildings, and advanced industrial ecosystems. When the IoT is matched with machine learning (ML), the advantages of the resulting enhanced environments can span, for example, from energy optimization to security improvement and comfort enhancement. Together, IoT and ML technologies are widely used in smart buildings, in particular, to reduce energy consumption and create Intelligent Energy-Efficient Buildings (IEEBs). In IEEBs, ML models are typically used to analyze and predict various factors such as temperature, humidity, light, occupancy, and human behavior with the aim of optimizing building systems. In the literature, many review papers have been presented so far in the field of IEEBs. Such papers mostly focus on specific subfields of ML or on a limited number of papers. This paper presents a systematic meta-survey, i.e., a review of review articles, that compares the state of the art in the field of IEEBs using the Prisma approach. In more detail, our meta-survey aims to give a broader view, with respect to the already published surveys, of the state-of-the-art in the IEEB field, investigating the use of supervised, unsupervised, semi-supervised, and self-supervised models in a variety of IEEB-based scenarios. Moreover, our paper aims to compare the already published surveys by answering five important research questions about IEEB definitions, architectures, methods/models used, datasets and real implementations utilized, and main challenges/research directions defined. This meta-survey provides insights that are useful both for newcomers to the field and for researchers who want to learn more about the methodologies and technologies used for IEEBs' design and implementation. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

24. A novel virtual-communicated evolution learning recommendation

Author: Chen, Yi-Cheng and Chen, Yen-Liang
Published: 2024
Full Text: View/download PDF

25. A Hybrid Digital Twin Scheme for the Condition Monitoring of Industrial Collaborative Robots.

Author: Ayankoso, Samuel, Kaigom, Eric, Louadah, Hassna, Faham, Hamidreza, Gu, Fengshou, and Ball, Andrew
Subjects: DIGITAL twins, DEEP learning, INDUSTRIAL robots, INDUSTRIAL safety, ELECTRONIC paper, RELIABILITY in engineering, DYNAMIC models
Abstract: Industrial collaborative robots play an essential role in smart manufacturing because they improve productivity while also ensuring workplace safety. However, the development of prognostic and health management systems to ensure the reliability of these robots has been a major challenge due to the lack of fault data. This paper proposed a digital twin scheme based on the fusion of the robot kinematic and dynamic models' information down to the powertrains (i.e., the joints motor, and gear) along with the control algorithms and uncertainty accommodation based upon deep learning. The presented digital twin concept has the potential to propel simulation-based fault prediction. We also highlight and discuss challenges and opportunities around the development of the hybrid digital twin for condition monitoring of industrial collaborative robots. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

26. Intelligent Noncontact Structural Displacement Detection Method Based on Computer Vision and Deep Learning.

Author: Liu, Hongbo, Zhang, Fan, Ma, Rui, Wang, Longxuan, Chen, Zhihua, Zhang, Qian, and Guo, Liulu
Subjects: DISPLACEMENT (Psychology), COMPUTER vision, EUCLIDEAN distance, COMPUTER simulation, DEEP learning, BAMBOO
Abstract: Accurate identification of structural displacements is important for structural state assessment and performance evaluation. This paper proposes a real-time structural displacement detection model based on computer vision and deep learning. The model consists of three stages: identification, tracking, and displacement resolution. First, the displacement target is identified and tracked by the improved YOLO v7 algorithm and the improved DeepSORT algorithm. Then, the Euclidean distance method based on inverse perspective mapping (IPM-ED) is proposed for the analytical conversion of the displacement. Next, the accuracy and effectiveness of this displacement detection model are evaluated through four groups of bamboo axial compression tests. A comparative analysis is conducted between the IPM-ED displacement analysis method and the commonly used ED displacement analysis method. Finally, the robustness of this method is tested by using a cable breakage test of a cable dome structure as an application case. The research results demonstrate that the maximum average error of the four groups of bamboo displacement tests is only 3.10 mm, and the maximum relative error of peak displacement is only 6.54 mm. The RMSE basically stays around 3.5 mm. The maximum displacement error in the application case is only 4.91 mm, with a maximum MAPE of 4.94%. In addition, the error percentage under the IPM-ED algorithm is basically within 5%, while the error percentage of the ED algorithm is more than 10%. The method in this paper achieves efficient and intelligent identification of structural displacements in a non-contact manner. The proposed method is suitable for environments where the contact displacement sensor is easily affected by vibration, the site is complex and requires additional displacement sensor fixing equipment, the displacement sensor with super-high structure is unsafe to deploy, and the contact displacement sensor in narrow space is inconvenient to deploy, so it has broad application prospects. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
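Note on record 26: the abstract reports displacement errors as RMSE and MAPE and computes displacement as a Euclidean distance. The sketch below shows how those three quantities are conventionally computed. It is not the authors' IPM-ED method: the inverse perspective mapping step is omitted, the points are assumed to be already in rectified plane coordinates (mm), and all function names and sample values are hypothetical.

```python
import math


def euclidean_displacement(start, end):
    """Straight-line distance between two (x, y) positions of a tracked target."""
    return math.hypot(end[0] - start[0], end[1] - start[1])


def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference values."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def mape(predicted, reference):
    """Mean absolute percentage error, in percent (reference values must be nonzero)."""
    ratios = [abs(p - r) / abs(r) for p, r in zip(predicted, reference)]
    return 100.0 * sum(ratios) / len(ratios)


# Illustrative values only (mm); not taken from the paper.
displacement = euclidean_displacement((0.0, 0.0), (3.0, 4.0))
reference_mm = [10.0, 20.0, 40.0]
vision_mm = [11.0, 19.0, 42.0]
error_rmse = rmse(vision_mm, reference_mm)
error_mape = mape(vision_mm, reference_mm)
```

RMSE keeps the unit of the measurement (mm), while MAPE is unit-free, which is why the abstract can quote both for the same tests.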

27. Pantograph Slider Detection Architecture and Solution Based on Deep Learning.

Author: Guo, Qichang, Tang, Anjie, and Yuan, Jiabin
Subjects: IMAGE processing, TECHNICAL specifications, COMPUTER vision, DEEP learning, PANTOGRAPH
Abstract: Railway transportation has been integrated into people's lives. According to the "Notice on the release of the General Technical Specification of High-speed Railway Power Supply Safety Testing (6C System) System" issued by the National Railway Administration of China in 2012, it is required to install pantograph and slide monitoring devices in high-speed railway stations, station throats and the inlet and exit lines of high-speed railway sections, and it is required to detect the damage of the slider with high precision. It can be seen that the good condition of the pantograph slider is very important for the normal operation of the railway system. As a part of providing power for high-speed rail and subway, the pantograph must be paid attention to in railway transportation to ensure its integrity. The wear of the pantograph is mainly due to the contact power supply between the slide block and the long wire during high-speed operation, which inevitably produces scratches, resulting in depressions on the upper surface of the pantograph slide block. During long-term use, because the depression is too deep, there is a risk of fracture. Therefore, it is necessary to monitor the slider regularly and replace the slider with serious wear. At present, most of the traditional methods use automation technology or simple computer vision technology for detection, which is inefficient. Therefore, this paper introduces computer vision and deep learning technology into pantograph slide wear detection. Specifically, this paper mainly studies the wear detection of the pantograph slider based on deep learning and the main purpose is to improve the detection accuracy and improve the effect of segmentation. From a methodological perspective, this paper employs a linear array camera to enhance the quality of the data sets. Additionally, it integrates an attention mechanism to improve segmentation performance. Furthermore, this study introduces a novel image stitching method to address issues related to incomplete images, thereby providing a comprehensive solution. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. Comprehensive Review: Machine and Deep Learning in Brain Stroke Diagnosis.

Author: Fernandes, João N. D., Cardoso, Vitor E. M., Comesaña-Campos, Alberto, and Pinheira, Alberto
Subjects: DEEP learning, STROKE, MACHINE learning, ELECTRONIC data processing, DIAGNOSIS, PATIENT monitoring
Abstract: Brain stroke, or a cerebrovascular accident, is a devastating medical condition that disrupts the blood supply to the brain, depriving it of oxygen and nutrients. Each year, according to the World Health Organization, 15 million people worldwide experience a stroke. This results in approximately 5 million deaths and another 5 million individuals suffering permanent disabilities. The complex interplay of various risk factors highlights the urgent need for sophisticated analytical methods to more accurately predict stroke risks and manage their outcomes. Machine learning and deep learning technologies offer promising solutions by analyzing extensive datasets including patient demographics, health records, and lifestyle choices to uncover patterns and predictors not easily discernible by humans. These technologies enable advanced data processing, analysis, and fusion techniques for a comprehensive health assessment. We conducted a comprehensive review of 25 review papers published between 2020 and 2024 on machine learning and deep learning applications in brain stroke diagnosis, focusing on classification, segmentation, and object detection. Furthermore, all these reviews explore the performance evaluation and validation of advanced sensor systems in these areas, enhancing predictive health monitoring and personalized care recommendations. Moreover, we also provide a collection of the most relevant datasets used in brain stroke analysis. The selection of the papers was conducted according to PRISMA guidelines. Furthermore, this review critically examines each domain, identifies current challenges, and proposes future research directions, emphasizing the potential of AI methods in transforming health monitoring and patient care. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

29. 3D reconstruction method of welding area by fusion of coding raster and semantic segmentation network.

Author: Song, Limei, Xu, Baolin, Yang, Yangang, Yuan, Jiaxing, and Ye, Chenchao
Abstract: Due to its rapid projection and capture of fringe patterns, coded structured light measurement technology is widely utilised for acquiring three-dimensional information in welding areas. To achieve real-time segmentation of the welding area for T-joint butt welding workpieces with various noise interference, this paper has developed a lightweight dual-resolution semantic segmentation network (LDRNet). This paper designed a lightweight feature extraction module that provides efficient feature representations for the network using fewer parameters and computational costs. To enhance the network’s robustness to complex environmental noise, this paper introduced a multi-scale adaptive feature extraction module that can capture information at different scales of the environment. To further improve segmentation accuracy, this paper reconstructed the traditional pyramid pooling module and combined it with the CBAM attention mechanism to enhance the focus on important features. This paper proposes a local feature constraint method to improve the phase matching accuracy. Experimental results show that this paper’s algorithm significantly reduces Params and FLOPs by 62.02% and 65.78%, respectively, compared to the original network DDRNet-23-Slim. Moreover, it leads to an improvement in the IOU of the welding area from 74.55% to 77.43%. Additionally, this paper’s proposed algorithm effectively reduces the generation of noise points through phase matching by approximately 93.88%. Consequently, this algorithm satisfactorily meets the requirements of practical production processes. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

30. INTRODUCTION TO THE SPECIAL ISSUE ON NEXT GENERATION PERVASIVE RECONFIGURABLE COMPUTING FOR HIGH PERFORMANCE REAL TIME APPLICATIONS.

Author: VENKATESAN, C., YU-DONG ZHANG, CHOW CHEE ONN, and YONG SHI
Subjects: MACHINE learning, REINFORCEMENT learning, HIGH performance computing, COMPUTER vision, ARTIFICIAL intelligence, PARSING (Computer grammar), DEEP learning
Abstract: This document introduces a special issue of the journal "Scalable Computing: Practice & Experience" focused on next-generation pervasive reconfigurable computing for high-performance real-time applications. The authors discuss the importance of adaptable platforms for real-time tasks and highlight the benefits of reconfigurable computing in accelerating applications like image processing and machine learning. The special issue aims to explore recent advancements in this field and includes research papers on topics such as network security, malware detection, software reliability prediction, and optimization algorithms for wing design. The papers cover a range of computer science and technology topics, showcasing advancements and their potential impact on various computing domains. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

31. FCN-Attention: A deep learning UWB NLOS/LOS classification algorithm using fully convolution neural network with self-attention mechanism.

Author: Pei, Yu, Chen, Ruizhi, Li, Deren, Xiao, Xiongwu, and Zheng, Xingyu
Subjects: CONVOLUTIONAL neural networks, CLASSIFICATION algorithms, FEATURE extraction, IMPULSE response, LOCATION-based services, DEEP learning
Abstract: The Ultra-Wideband (UWB) Location-Based Service is receiving more and more attention due to its high ranging accuracy and good time resolution. However, Non-Line-of-Sight (NLOS) propagation may reduce the ranging accuracy of UWB localization systems in indoor environments. So it is important to identify LOS and NLOS propagations before taking proper measures to improve the UWB localization accuracy. In this paper, a deep learning-based UWB NLOS/LOS classification algorithm called FCN-Attention is proposed. The proposed FCN-Attention algorithm utilizes a Fully Convolution Network (FCN) for improving feature extraction ability and a self-attention mechanism for enhancing feature description from the data to improve the classification accuracy. The proposed algorithm is evaluated using an open-source dataset, a locally collected dataset and a mixed dataset created from these two datasets. The experimental results show that the proposed FCN-Attention algorithm achieves classification accuracy of 88.24% on the open-source dataset, 100% on the locally collected dataset and 92.01% on the mixed dataset, which is better than the results from other evaluated NLOS/LOS classification algorithms in most scenarios in this paper. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

32. Machine Learning and Graph Signal Processing Applied to Healthcare: A Review.

Author: Calazans, Maria Alice Andrade, Ferreira, Felipe A. B. S., Santos, Fernando A. N., Madeiro, Francisco, and Lima, Juliano B.
Subjects: PATTERN recognition systems, SIGNAL processing, DEEP learning, GRAPH theory, SIGNALS & signaling
Abstract: Signal processing is a very useful field of study in the interpretation of signals in many everyday applications. In the case of applications with time-varying signals, one possibility is to consider them as graphs, so graph theory arises, which extends classical methods to the non-Euclidean domain. In addition, machine learning techniques have been widely used in pattern recognition activities in a wide variety of tasks, including health sciences. The objective of this work is to identify and analyze the papers in the literature that address the use of machine learning applied to graph signal processing in health sciences. A search was performed in four databases (Science Direct, IEEE Xplore, ACM, and MDPI), using search strings to identify papers that are in the scope of this review. Finally, 45 papers were included in the analysis, the first being published in 2015, which indicates an emerging area. Among the gaps found, we can mention the need for better clinical interpretability of the results obtained in the papers, that is, not restricting the results or conclusions to performance metrics alone. In addition, a possible research direction is the use of new transforms. It is also important to make new public datasets available that can be used to train the models. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. Comment on Martínez-Delgado et al. Using Absorption Models for Insulin and Carbohydrates and Deep Leaning to Improve Glucose Level Predictions. Sensors 2021, 21 , 5273.

Author: Misplon, Josiah Z. R., Saini, Varun, Sloves, Brianna P., Meerts, Sarah H., and Musicant, David R.
Subjects: INSULIN, CARBOHYDRATES, TYPE 1 diabetes, GLUCOSE, MACHINE learning, ABSORPTION
Abstract: The paper "Using Absorption Models for Insulin and Carbohydrates and Deep Leaning to Improve Glucose Level Predictions" (Sensors 2021, 21, 5273) proposes a novel approach to predicting blood glucose levels for people with type 1 diabetes mellitus (T1DM). By building exponential models from raw carbohydrate and insulin data to simulate the absorption in the body, the authors reported a reduction in their model's root-mean-square error (RMSE) from 15.5 mg/dL (raw) to 9.2 mg/dL (exponential) when predicting blood glucose levels one hour into the future. In this comment, we demonstrate that the experimental techniques used in that paper are flawed, which invalidates its results and conclusions. Specifically, after reviewing the authors' code, we found that the model validation scheme was malformed, namely, the training and test data from the same time intervals were mixed. This means that the reported RMSE numbers in the referenced paper did not accurately measure the predictive capabilities of the approaches that were presented. We repaired the measurement technique by appropriately isolating the training and test data, and we discovered that their models actually performed dramatically worse than was reported in the paper. In fact, the models presented in that paper do not appear to perform any better than a naive model that predicts future glucose levels to be the same as the current ones. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
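Note on record 33: the abstract turns on two ideas, keeping training and test data from separate time intervals and comparing against a naive model that predicts the future value to equal the current one. The sketch below illustrates both under stated assumptions. It is not the commenters' or the original authors' code; the function names, the split fraction, and the sample readings are hypothetical.

```python
import math


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series so that every test point is later than every training point."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def persistence_rmse(series, horizon):
    """RMSE of the naive model that predicts the value at t + horizon to equal the value at t."""
    errors = [series[t + horizon] - series[t] for t in range(len(series) - horizon)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical glucose readings in mg/dL at a fixed sampling interval (illustrative values only).
readings = [110, 112, 115, 120, 126, 130, 133, 135, 134, 131, 127, 122]

# Contiguous split: no test interval overlaps a training interval.
train, test = chronological_split(readings, train_fraction=0.75)

# Baseline any learned model should beat on the held-out, later segment.
baseline = persistence_rmse(test, horizon=1)
```

A random shuffle before splitting would place neighbouring, highly correlated readings on both sides of the split, which is the kind of mixing the comment describes; a contiguous cut avoids it.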

34. Learning degradation-aware visual prompt for maritime image restoration under adverse weather conditions.

Author: Xin He, Tong Jia, and Junjie Li
Subjects: IMAGE reconstruction, VISUAL learning, ARTIFICIAL intelligence, RESCUE work, RAINFALL
Abstract: Adverse weather conditions such as rain and haze often lead to a degradation in the quality of maritime images, which is crucial for activities like navigation, fishing, and search and rescue. Therefore, it is of great interest to develop an effective algorithm to recover high-quality maritime images under adverse weather conditions. This paper proposes a prompt-based learning method with degradation perception for maritime image restoration, which contains two key components: a restoration module and a prompting module. The former is employed for image restoration, whereas the latter encodes weather-related degradation-specific information to modulate the restoration module, enhancing the recovery process for improved results. Inspired by the recent trend of prompt learning in artificial intelligence, this paper adopts soft-prompt technology to generate learnable visual prompt parameters for better perceiving the degradation-conditioned cues. Extensive experimental results on several benchmarks show that our approach achieves superior restoration performance in maritime image dehazing and deraining tasks. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. Short‐term photovoltaic prediction based on CNN‐GRU optimized by improved similar day extraction, decomposition noise reduction and SSA optimization.

Author: Li, Rui, Wang, Mingtao, Li, Xingyu, Qu, Jian, and Dong, Yuhan
Subjects: NOISE control, DEEP learning, WAVELET transforms, SEARCH algorithms
Abstract: The accuracy of short‐term photovoltaic (PV) power prediction is crucial for maintaining power system stability and grid scheduling. Here, a short‐term PV power prediction framework is proposed considering combined weather similarity day screening, signal decomposition noise reduction and hybrid deep learning to realize PV power prediction. First, a combined meteorological similar day screening model is constructed to screen out historical days similar to the forecast day, which reduces the number of training sets. Second, Synchrosqueezing Wavelet Transform is utilized to eliminate data noise. Third, a Convolution Neural Network‐Gate Recurrent Unit (CNN‐GRU) network is constructed to extract periodic and nonlinear features in the PV power generation data series and to capture the relationship features between PV power generation and meteorological factors to improve the prediction accuracy. Fourth, the Sparrow Search Algorithm is introduced to perform hyper‐parameter optimization of the CNN‐GRU network to accelerate the model convergence and improve the model training efficiency. Finally, this paper conducts simulation experiments and the experimental results show that the prediction method proposed in this paper can effectively improve the prediction accuracy of short‐term PV power compared to the baseline model, and the method proposed in this paper is superior to other conventional methods. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

36. Advanced Machine Learning and Deep Learning Approaches for Remote Sensing II.

Author: Jeon, Gwanggil
Subjects: REMOTE sensing, MACHINE learning, ARTIFICIAL neural networks, DEEP learning, ARTIFICIAL intelligence, DISTANCE education
Abstract: This document is a summary of a special issue on advanced machine learning and deep learning techniques for remote sensing. The issue includes 16 research papers that cover a range of topics, including hyperspectral image classification, moving point target detection, radar echo extrapolation, and remote sensing object detection. Each paper introduces a novel approach or model and provides extensive testing and evaluation to demonstrate its effectiveness. The insights shared in this special issue are expected to contribute to future advancements in artificial intelligence-based remote sensing research. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

37. A semantic segmentation model for road cracks combining channel-space convolution and frequency feature aggregation.

Author: Zhang, Mingxing and Xu, Jian
Subjects: TRAFFIC safety, IMAGE segmentation, PROBLEM solving
Abstract: In transportation, roads sometimes develop cracks due to overloading and other causes, which seriously affects driving safety, so it is crucial to identify and fill road cracks in time. Existing semantic segmentation models show degraded performance on road crack images: standard convolution makes it challenging to capture the spatial and channel coupling relationship between pixels, and it is difficult to differentiate crack pixels from background pixels in complex backgrounds. To address these defects, this paper proposes a semantic segmentation model for road cracks that combines channel-spatial convolution with the aggregation of frequency features. A new convolutional block is proposed to accurately identify cracked pixels by grouping spatial displacements and convolutional kernel weight dynamization while modeling pixel spatial relationships linked to channel features. To enhance the contrast of crack edges, a frequency domain feature aggregation module is proposed, which uses a simple windowing strategy to solve the problem of mismatch of frequency domain inputs and, at the same time, takes into account the effect of the frequency imaginary part on the features to model the deep frequency features effectively. Finally, a feature refinement module is designed to refine the semantic features to improve the segmentation accuracy. Many experiments have proved that the model proposed in this paper has better performance and more application potential than the current popular general model. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

38. How to Make a State of the Art Report—Case Study—Image-Based Road Crack Detection: A Scientometric Literature Review.

Author: Fan, Luxin, Tang, SaiHong, Mohd Ariffin, Mohd Khairol Anuar b., Ismail, Mohd Idris Shah b., and Zhao, Ruixin
Subjects: LITERATURE reviews, BIBLIOMETRICS, DEEP learning, TEXT mining, RESEARCH personnel
Abstract: With the rapid growth in urban construction in Malaysia, road breakage has challenged traditional manual inspection methods. In order to quickly and accurately detect the extent of road breakage, it is crucial to apply automated road crack detection techniques. Researchers have long studied image-based road crack detection techniques, especially the deep learning methods that have emerged in recent years, leading to breakthrough developments in the field. However, many issues remain in road crack detection methods using deep learning techniques. The field lacks state-of-the-art systematic reviews that can scientifically and effectively analyze existing works, document research trends, summarize outstanding research results, and identify remaining shortcomings. To conduct a systematic review of the relevant literature, a bibliometric analysis and a critical analysis of the papers published in the field were performed. VOSviewer and CiteSpace text mining tools were used to analyze and visualize the bibliometric analysis of some parameters derived from the articles. The history and current status of research in the field by authors from all over the world are elucidated and future trends are analyzed. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

39. Comprehensive Analysis of Temporal–Spatial Fusion from 1991 to 2023 Using Bibliometric Tools.

Author: Cui, Jiawei, Li, Juan, Gu, Xingfa, Zhang, Wenhao, Wang, Dong, Sun, Xiuling, Zhan, Yulin, Yang, Jian, Liu, Yan, and Yang, Xiufeng
Subjects: SCIENTIFIC literature, SURFACE dynamics, BIBLIOMETRICS, MULTISENSOR data fusion, DEEP learning, IMAGE fusion, REMOTE sensing
Abstract: Due to budget and sensor technology constraints, a single sensor cannot simultaneously provide observational images with both a high spatial and temporal resolution. To solve the above problem, the spatiotemporal fusion (STF) method was proposed and proved to be an indispensable tool for monitoring land surface dynamics. There are relatively few systematic reviews of the STF method. Bibliometrics is a valuable method for analyzing the scientific literature, but it has not yet been applied to the comprehensive analysis of the STF method. Therefore, in this paper, we use bibliometrics and scientific mapping to analyze the 2967 citation data from the Web of Science from 1991 to 2023 in a metrological manner, covering the themes of STF, data fusion, multi-temporal analysis, and spatial analysis. The results of the literature analysis reveal that the number of articles displays a slow to rapid increase during the study period, but decreases significantly in 2023. Research institutions in China (1059 papers) and the United States (432 papers) are the top two contributors in the field. The keywords "Sentinel", "deep learning" (DL), and "LSTM" (Long Short-Term Memory) appeared most frequently in the past three years. In the future, remote sensing spatiotemporal fusion research can address more of the limitations of heterogeneous landscapes and climatic conditions to improve fused images' accuracy. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

40. A Multimodal Fusion Behaviors Estimation Method for Public Dangerous Monitoring.

Author: Hou, Renkai, Xu, Xiangyang, Dai, Yaping, Shao, Shuai, and Hirota, Kaoru
Subjects: DEEP learning, PUBLIC spaces, AUTOMATIC identification, MANUAL labor, EMOTION recognition, EMOTIONS
Abstract: At the present stage, the identification of dangerous behaviors in public places mostly relies on manual work, which is subjective and has low identification efficiency. This paper proposes an automatic identification method for dangerous behaviors in public places, which analyzes group behavior and speech emotion through a deep learning network and then performs multimodal information fusion. Based on the fusion results, people can judge the emotional atmosphere of the crowd, issue early warnings, and raise alarms for possible dangerous behaviors. Experiments show that the algorithm adopted in this paper can accurately identify dangerous behaviors and has great application value. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

41. Cloud-Edge Collaborative Defect Detection Based on Efficient Yolo Networks and Incremental Learning.

Author: Lei, Zhenwu, Zhang, Yue, Wang, Jing, and Zhou, Meng
Subjects: MACHINE learning, MANUFACTURING defects, FEATURE extraction, ELECTRONICS manufacturing, MANUFACTURING processes, DEEP learning
Abstract: Defect detection constitutes one of the most crucial processes in industrial production. With a continuous increase in the number of defect categories and samples, the defect detection model underpinned by deep learning finds it challenging to expand to new categories, and the accuracy and real-time performance of product defect detection are also confronted with severe challenges. This paper addresses the problem of insufficient detection accuracy of existing lightweight models on resource-constrained edge devices by presenting a new lightweight YoloV5 model, which integrates four modules, SCDown, GhostConv, RepNCSPELAN4, and ScalSeq. Here, this paper abbreviates it as SGRS-YoloV5n. Through the incorporation of these modules, the model notably enhances feature extraction and computational efficiency while reducing the model size and computational load, making it more conducive for deployment on edge devices. Furthermore, a cloud-edge collaborative defect detection system is constructed to improve detection accuracy and efficiency through initial detection by edge devices, followed by additional inspection by cloud servers. An incremental learning mechanism is also introduced, enabling the model to adapt promptly to new defect categories and update its parameters accordingly. Experimental results reveal that the SGRS-YoloV5n model exhibits superior detection accuracy and real-time performance, validating its value and stability for deployment in resource-constrained environments. This system presents a novel solution for achieving efficient and accurate real-time defect detection. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

42. Research on Applying Deep Learning to Visual–Motor Integration Assessment Systems in Pediatric Rehabilitation Medicine.

Author: Tsai, Yu-Ting, Lee, Jin-Shyan, and Huang, Chien-Yu
Abstract: In pediatric rehabilitation medicine, manual assessment methods for visual–motor integration result in inconsistent scoring standards. To address these issues, incorporating artificial intelligence (AI) technology is a feasible approach that can reduce time and improve accuracy. Existing research on visual–motor integration scoring has proposed a framework based on convolutional neural networks (CNNs) for the Beery–Buktenica developmental test of visual–motor integration. However, as the number of training questions increases, the accuracy of this framework significantly decreases. This paper proposes a new architecture to reduce the number of features, channels, and overall model complexity. The architecture optimizes input features by concatenating question numbers with answer features and selecting appropriate channel ratios and optimizes the output vector by designing the task as a multi-class classification. This paper also proposes a model named improved DenseNet. After experimentation, DenseNet201 was identified as the most suitable pre-trained model for this task and was used as the backbone architecture for improved DenseNet. Additionally, new fully connected layers were added for feature extraction and classification, allowing for specialized feature learning. The architecture can provide reasons for unscored results based on prediction results and decoding rules, offering directions for children's training. The final experimental results show that the proposed new architecture improves the accuracy of scoring 6 question graphics by 12.8% and 12 question graphics by 20.14% compared to the most relevant literature. The accuracy of the proposed new architecture surpasses the model frameworks of the most relevant literature, demonstrating the effectiveness of this approach in improving scoring accuracy and stability. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

43. Novel Approach to Protect Red Revolutionary Heritage Based on Artificial Intelligence Algorithm and Image-Processing Technology.

Author: Yi, Junbo, Tian, Yan, and Zhao, Yuanfei
Subjects: GENERATIVE adversarial networks, IMAGE reconstruction, ARTIFICIAL intelligence, PROTECTION of cultural property, DEEP learning
Abstract: The red revolutionary heritage is a valuable part of China's historical and cultural legacy, with the potential to generate economic benefits through its thoughtful development. However, challenges such as insufficient understanding, lack of comprehensive planning and layout, and limited protection and utilization methods hinder the full realization of the political, cultural, and economic value of red heritage. To address these problems, this paper thoroughly examines the current state of red revolutionary heritage protection and identifies the problems within the preservation process. Moreover, it proposes leveraging advanced artificial intelligence (AI) technology to repair some damaged image data. Specifically, this paper introduces a red revolutionary cultural relic image-restoration model based on a generative adversarial network (GAN). This model was trained using samples of damaged images and utilizes high-quality models to restore these images effectively. The study also integrates real-world revolutionary heritage images for practical application and assesses its effectiveness through questionnaire surveys. The survey results show that AI algorithms and image-processing technologies hold significant potential in the protection of revolutionary heritage. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

44. Decision Support Systems for Disease Detection and Diagnosis.

Author: Rizzi, Maria
Subjects: CLINICAL decision support systems, MACHINE learning, MEDICAL personnel, DECISION support systems, CONVOLUTIONAL neural networks, BREAST, DEEP learning
Abstract: This document discusses the recent advancements in decision support systems (DSSs) for disease detection and diagnosis. The combination of biomedical studies and information technology has led to the development of accessible and accurate solutions that can improve patient survival rates. The document highlights several research papers that cover a wide range of topics, including multiple sclerosis detection, neurodegenerative disease detection, breast lesion classification, COVID-19 mortality prediction, melanoma diagnosis, and prediction of second primary skin cancer. The adoption of efficient DSSs can aid clinical assessment, reduce misdiagnosis, and facilitate evidence-based decision-making. However, challenges such as validation, training, and user interface design need to be addressed for widespread application of DSSs in clinical practice. The document concludes by emphasizing the importance of future studies and developments in overcoming limitations and expanding the use of DSSs in different contexts. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

45. Clinical named entity extraction for extracting information from medical data.

Author: Kuttaiyapillai, Dhanasekaran, Madasamy, Anand, Ayyavu, Shobanadevi, and Sayeed, Md Shohel
Subjects: CONVOLUTIONAL neural networks, DATA mining, DATA analytics, MACHINE learning, RESEARCH personnel, DEEP learning
Abstract: Clinical named entity extraction (NER) based on deep learning gained much attention among researchers and data analysts. This paper proposes a NER approach to extract valuable Parkinson’s disease-related information. To develop an effective NER method and to handle problems in disease data analytics, a unique NER technique applies a “recognize-map-extract (RME)” mechanism and aims to deal with complex relationships present in the data. Due to the fast-growing medical data, there is a challenge in the development of suitable deep-learning methods for NER. Furthermore, the traditional machine learning approaches rely on the time-consuming process of creating corpora and cannot extract information for specific needs and locations in certain situations. This paper presents a clinical NER approach based on a convolutional neural network (CNN) for better use of specific features around medical entities and analyzes the performance of the proposed approach through fine-tuning NER with effective pre-training on the BC5CDR dataset. The proposed method uses annotation of entities for various medical concepts. The second stage develops a clinical NER method. This proposed method shows interesting results on the performance measures, achieving a precision of 92.57%, recall of 92.22%, and F1-measure of 91.6%. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

46. A lightweight dual-attention network for tomato leaf disease identification.

Author: Enxu Zhang, Ning Zhang, Fei Li, and Cheng Lv
Subjects: MACHINE learning, COMPUTER vision, RICE diseases & pests, IMAGE recognition (Computer vision), PLANT diseases, DIGITAL-to-analog converters, DEEP learning
Abstract: Tomato disease image recognition plays a crucial role in agricultural production. Today, while machine vision methods based on deep learning have achieved some success in disease recognition, they still face several challenges. These include issues such as imbalanced datasets, unclear disease features, small inter-class differences, and large intra-class variations. To address these challenges, this paper proposes a method for classifying and recognizing tomato leaf diseases based on machine vision. First, to enhance the disease feature details in images, a piecewise linear transformation method is used for image enhancement, and oversampling is employed to expand the dataset, compensating for the imbalanced dataset. Next, this paper introduces a convolutional block with a dual attention mechanism called DAC Block, which is used to construct a lightweight model named LDAMNet. The DAC Block innovatively uses Hybrid Channel Attention (HCA) and Coordinate Attention (CSA) to process channel information and spatial information of input images respectively, enhancing the model's feature extraction capabilities. Additionally, this paper proposes a Robust Cross-Entropy (RCE) loss function that is robust to noisy labels, aimed at reducing the impact of noisy labels on the LDAMNet model during training. Experimental results show that this method achieves an average recognition accuracy of 98.71% on the tomato disease dataset, effectively retaining disease information in images and capturing disease areas. Furthermore, the method also demonstrates strong recognition capabilities on rice crop disease datasets, indicating good generalization performance and the ability to function effectively in disease recognition across different crops. The research findings of this paper provide new ideas and methods for the field of crop disease recognition.
However, future research needs to further optimize the model's structure and computational efficiency, and validate its application effects in more practical scenarios. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
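
Note on item 46: the abstract mentions a piecewise linear transformation for image enhancement but does not give its parameters. The sketch below is only a minimal illustration of the general technique, not the authors' implementation. The breakpoints (r1, s1) and (r2, s2), the 8-bit gray range, and the function names are all assumptions chosen for the example.

```python
def piecewise_linear(value, r1=70, s1=30, r2=180, s2=230, max_level=255):
    """Map one gray level through three linear segments.

    Hypothetical breakpoints: inputs in [r1, r2] are stretched onto the
    wider output range [s1, s2], which raises mid-tone contrast, while
    the dark and bright tails are compressed.
    """
    if value < r1:
        # Dark segment: [0, r1) -> [0, s1)
        return s1 * value / r1
    if value <= r2:
        # Mid-tone segment: [r1, r2] -> [s1, s2]
        return s1 + (s2 - s1) * (value - r1) / (r2 - r1)
    # Bright segment: (r2, max_level] -> (s2, max_level]
    return s2 + (max_level - s2) * (value - r2) / (max_level - r2)


def enhance(image, **breakpoints):
    """Apply the transform to every pixel of a 2-D list of gray levels."""
    return [[round(piecewise_linear(p, **breakpoints)) for p in row]
            for row in image]
```

With these assumed defaults, a mid-tone input range of width 110 is mapped onto an output range of width 200, so detail in that range gains contrast; in practice the breakpoints would be tuned to the dataset.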

47. Multi-Person Action Recognition Based on Millimeter-Wave Radar Point Cloud.

Author: Dang, Xiaochao, Fan, Kai, Li, Fenfang, Tang, Yangyang, Gao, Yifei, and Wang, Yue
Subjects: DEEP learning, HUMAN-computer interaction, POINT cloud, LEARNING, CORPORATE bonds
Abstract: Featured Application: This research has important applications in areas such as smart furniture and human-computer interaction. It will bring people a more efficient and comfortable living experience as well as a new smart experience. Human action recognition has many application prospects in human-computer interactions, innovative furniture, healthcare, and other fields. The traditional human motion recognition methods have limitations in privacy protection, complex environments, and multi-person scenarios. Millimeter-wave radar has attracted attention due to its ultra-high resolution and all-weather operation. Many existing studies have discussed the application of millimeter-wave radar in single-person scenarios, but only some have addressed the problem of action recognition in multi-person scenarios. This paper uses a commercial millimeter-wave radar device for human action recognition in multi-person scenarios. In order to solve the problems of severe interference and complex target segmentation in multiplayer scenarios, we propose a filtering method based on millimeter-wave inter-frame differences to filter the collected human point cloud data. We then use the DBSCAN algorithm and the Hungarian algorithm to segment the target, and finally input the data into a neural network for classification. The classification accuracy of the system proposed in this paper reaches 92.2% in multi-person scenarios through experimental tests with the five actions we set. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

48. Analyzing Diabetes Detection and Classification: A Bibliometric Review (2000–2023).

Author: Ferdaus, Jannatul, Rochy, Esmay Azam, Biswas, Uzzal, Tiang, Jun Jiat, and Nahid, Abdullah-Al
Subjects: BIBLIOMETRICS, CITATION analysis, DIABETIC retinopathy, SCIENCE databases, WEB databases, DEEP learning
Abstract: Bibliometric analysis is a rigorous method to analyze significant quantities of bibliometric data to assess their impact on a particular field. This study used bibliometric analysis to investigate the academic research on diabetes detection and classification from 2000 to 2023. The PRISMA 2020 framework was followed to identify, filter, and select relevant papers. This study used the Web of Science database to determine relevant publications concerning diabetes detection and classification using the keywords "diabetes detection", "diabetes classification", and "diabetes detection and classification". A total of 863 publications were selected for analysis. The research applied two bibliometric techniques: performance analysis and science mapping. Various bibliometric parameters, including publication analysis, trend analysis, citation analysis, and networking analysis, were used to assess the performance of these articles. The analysis findings showed that India, China, and the United States are the top three countries with the highest number of publications and citations on diabetes detection and classification. The most frequently used keywords are machine learning, diabetic retinopathy, and deep learning. Additionally, the study identified "classification", "diagnosis", and "validation" as the prevailing topics for diabetes identification. This research contributes valuable insights into the academic landscape of diabetes detection and classification. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

49. Deep Learning Technology and Image Sensing.

Author: Lee, Suk-Ho and Kang, Dae-Ki
Subjects: CONVOLUTIONAL neural networks, OBJECT recognition (Computer vision), ACOUSTIC imaging, IMAGE recognition (Computer vision), ARTIFICIAL intelligence, DEEP learning, IMAGE enhancement (Imaging systems)
Abstract: This document discusses the advancements in deep learning technology and image sensing. It highlights the role of deep learning in improving image recognition data and sensor performance in various applications, such as autonomous driving and smartphone camera functionalities. The document presents eleven papers published in a special issue on deep learning technology and image sensing, covering topics like medical imaging, image enhancement, object detection, and innovation in image sensing technologies. The papers demonstrate innovative approaches and methodologies in diverse domains, including liver segmentation, brain tumor classification, optic cup and disc edge detection, image-to-image translation in astronomy, super-resolution imaging, and object detection systems. These studies contribute to the interdisciplinary field of deep learning-based imaging and sensing and offer valuable insights for various fields, including healthcare, astronomy, and robotics. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

50. Optimization of news dissemination push mode by intelligent edge computing technology for deep learning.

Author: DeGe, JiLe and Sang, Sina
Subjects: DEEP reinforcement learning, PATTERN recognition systems, SOCIAL media, NEWS websites, RECOMMENDER systems, DEEP learning, REINFORCEMENT learning
Abstract: The Internet era is an era of information explosion. By 2022, the global Internet users have reached more than 4 billion, and the social media users have exceeded 3 billion. People face a lot of news content every day, and it is almost impossible to get interesting information by browsing all the news content. Under this background, personalized news recommendation technology has been widely used, but it still needs to be further optimized and improved. In order to better push the news content of interest to different readers, users' satisfaction with major news websites should be further improved. This study proposes a new recommendation algorithm based on deep learning and reinforcement learning. Firstly, the RL algorithm is introduced based on deep learning. Deep learning is excellent in processing large-scale data and complex pattern recognition, but it often faces the challenge of low sample efficiency when it comes to complex decision-making and sequential tasks. While reinforcement learning (RL) emphasizes learning optimization strategies through continuous trial and error through interactive learning with the environment. Compared with deep learning, RL is more suitable for scenes that need long-term decision-making and trial-and-error learning. By feeding back the reward signal of the action, the system can better adapt to the unknown environment and complex tasks, which makes up for the relative shortcomings of deep learning in these aspects. A scenario is applied to an action to solve the sequential decision problem in the news dissemination process. In order to enable the news recommendation system to consider the dynamic changes in users' interest in news content, the Deep Deterministic Policy Gradient algorithm is applied to the news recommendation scenario. Opposing learning complements and combines Deep Q-network with the strategic network. 
On the basis of fully summarizing and thinking, this paper puts forward the mode of intelligent news dissemination and push. The push process of news communication information based on edge computing technology is proposed. Finally, based on Area Under Curve, a Q-Learning Area Under Curve for RL models is proposed. This indicator can measure the strengths and weaknesses of RL models efficiently and facilitates comparing models and evaluating offline experiments. The results show that the DDPG algorithm improves the click-through rate by 2.586% compared with the conventional recommendation algorithm. It shows that the algorithm designed in this paper has more obvious advantages in accurate recommendation by users. This paper effectively improves the efficiency of news dissemination by optimizing the push mode of intelligent news dissemination. In addition, the paper also deeply studies the innovative application of intelligent edge technology in news communication, which brings new ideas and practices to promote the development of news communication methods. Optimizing the push mode of intelligent news dissemination not only improves the user experience, but also provides strong support for the application of intelligent edge technology in this field, which has important practical application prospects. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

12,771 results
