280 results for "Transformer Network"
Search Results
2. Remaining useful life prediction for stratospheric airships based on a channel and temporal attention network
- Author
- Luo, Yuzhao, Zhu, Ming, Chen, Tian, and Zheng, Zewei
- Published
- 2025
3. Deep probabilistic solar power forecasting with Transformer and Gaussian process approximation
- Author
- Xiong, Binyu, Chen, Yuntian, Chen, Dali, Fu, Jun, and Zhang, Dongxiao
- Published
- 2025
4. Masked facial expression recognition based on temporal overlap module and action unit graph convolutional network
- Author
- Zhang, Zheyuan, Liu, Bingtong, Zhou, Ju, Wang, Hanpu, Liu, Xinyu, Lin, Bing, and Chen, Tong
- Published
- 2025
5. Contextual visual and motion salient fusion framework for action recognition in dark environments
- Author
- Munsif, Muhammad, Khan, Samee Ullah, Khan, Noman, Hussain, Altaf, Kim, Min Je, and Baik, Sung Wook
- Published
- 2024
6. Detecting severity of Diabetic Retinopathy from fundus images: A transformer network-based review
- Author
- Karkera, Tejas, Adak, Chandranath, Chattopadhyay, Soumi, and Saqib, Muhammad
- Published
- 2024
7. RGB-T-UV Multi-modal Object Tracking Based on Transformer Network
- Author
- Song, Qinghua, Wang, Xiaolei, Zhang, Yi, Hu, Jinping, Liu, Yu, Goos, Gerhard (Series Editor), Hartmanis, Juris (Founding Editor), Bertino, Elisa (Editorial Board Member), Gao, Wen (Editorial Board Member), Steffen, Bernhard (Editorial Board Member), Yung, Moti (Editorial Board Member), Bebis, George (editor), Patel, Vishal (editor), Gu, Jinwei (editor), Panetta, Julian (editor), Gingold, Yotam (editor), Johnsen, Kyle (editor), Arefin, Mohammed Safayet (editor), Dutta, Soumya (editor), and Biswas, Ayan (editor)
- Published
- 2025
8. A Novel CNN-Transformer Capacity Estimation Model for Real-World Lithium-Ion Battery Pack
- Author
- Soo, Yin-Yi, Wang, Yujie, Xiang, Haoxiang, Li, Gang (Series Editor), Filipe, Joaquim (Series Editor), Xu, Zhiwei (Series Editor), Li, Kang (editor), Liu, Kailong (editor), Hu, Yukun (editor), Tan, Mao (editor), Zhang, Long (editor), and Yang, Zhile (editor)
- Published
- 2025
9. Identification of strong motion record baseline drift based on Bayesian-optimized Transformer network.
- Author
- Zhou, Baofeng, Yin, Yue, Wang, Maofa, Zhang, Runjie, Zhang, Yue, and Guo, Wenheng
- Subjects
- *CHI-chi Earthquake, Taiwan, 1999, *TRANSFORMER models, *FEATURE extraction, *EMERGENCY management, *ARTIFICIAL intelligence, *NATURAL disaster warning systems
- Abstract
Research in earthquake engineering heavily relies on strong motion observation. The quality of strong motion records directly affects the reliability of earthquake disaster prevention, rapid reporting of seismic magnitude, earthquake early warning, and other areas. Currently, basic mathematical methods, such as zero-line adjustment and filtering, are commonly employed to ensure the quality of strong motion records. However, these methods often rely on subjective judgment based on human experience when dealing with abnormal waveforms in strong motion records, leading to relatively low efficiency. To address this challenge, this paper proposes an innovative Transformer model based on Bayesian optimization to efficiently identify baseline drift anomalies in strong motion records. The strong motion record data from the 1999 Chi-Chi earthquake in Taiwan, China, were partitioned into two categories: high-quality records (with minimal baseline drift) and low-quality records (with significant baseline drift). Data with distinct features were then extracted and input into the proposed model for training. Finally, the model was used to predict whether strong motion records exhibited baseline drift abnormalities. The experimental results show that the optimized Transformer model achieves a performance exceeding 85% in key evaluation metrics such as accuracy and F1 score. It is capable of efficiently identifying a substantial volume of strong motion records with baseline drift within a short period of time. The model effectively performs the baseline drift classification task for strong motion records and can be used for subsequent identification of abnormalities after baseline drift correction, enabling automation in handling abnormal data related to baseline drift. [ABSTRACT FROM AUTHOR]
- Published
- 2025
10. Character-level inclusive transformer architecture for information gain in low resource code-mixed language.
- Author
- Bhowmick, Rajat Subhra, Ganguli, Isha, and Sil, Jaya
- Subjects
- *LOW-resource languages, *TRANSFORMER models, *PROGRAMMING languages, *DEEP learning, *LEARNING strategies
- Abstract
The use of code-mixed languages in social media platforms is very common to communicate in an informal way and has immense importance in a multilingual society, like India. Implementing various NLP tasks on code-mixed language for machine comprehension and NLP applications is the need of the hour. The implementation of complex learning models is difficult due to the scarcity of available code-mixed resources. Designing more effective architectures to perform learning from low resource dataset along with transfer learning settings are the possible solutions. We propose an improvised transformer network (Character Inclusion Transformer) that utilizes and learns character-level information available in the words of code-mixed sentences. The proposed model improves the performance of the transformer model when trained from scratch using low resource code-mixed datasets. We also propose two more architecture settings, useful for transfer learning strategy using the mBERT pre-trained model. Three basic word-level tagging NLP tasks, i.e., NER, POS Tagging, and Language Identification (LID) are considered in the paper where Language Identification is specific to code-mixed language. Six separate datasets, namely IIITH NER, LID FIRE, LID ICON, LID UD, POS ICON, POS UD, have been tested, and results are reported using weighted and macro-average while evaluating precision, recall and F1 score [ABSTRACT FROM AUTHOR]
- Published
- 2025
11. Enhanced bearing RUL prediction based on dynamic temporal attention and mixed MLP.
- Author
- Jin, Zhongtian, Chen, Chong, Syntetos, Aris, and Liu, Ying
- Subjects
- REMAINING useful life, MACHINE learning, ARTIFICIAL intelligence, ROLLER bearings, IMAGE processing, DEEP learning
- Abstract
Bearings are critical components in machinery, and accurately predicting their remaining useful life (RUL) is essential for effective predictive maintenance. Traditional RUL prediction methods often rely on manual feature extraction and expert knowledge, which face specific challenges such as handling non-stationary data and avoiding overfitting due to the inclusion of numerous irrelevant features. This paper presents an approach that leverages Continuous Wavelet Transform (CWT) for feature extraction, a Channel-Temporal Mixed MLP (CT-MLP) layer for capturing intricate dependencies, and a dynamic attention mechanism to adjust its focus based on the temporal importance of features within the time series. The dynamic attention mechanism integrates multi-head attention with innovative enhancements, making it particularly effective for datasets exhibiting non-stationary behaviour. An experimental study using the XJTU-SY rolling bearings dataset and the PRONOSTIA bearing dataset revealed that the proposed deep learning algorithm significantly outperforms other state-of-the-art algorithms in terms of RMSE and MAE, demonstrating its robustness and accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2025
12. Deep learning-based minute-scale digital prediction model for temperature induced deflection of a multi-tower double-layer steel truss bridge.
- Author
- Meng, Lingxin, Sun, Bo, Dang, Yingjie, Shen, Lizhong, and Zhuang, Yizhou
- Subjects
- *LONG short-term memory, *TRUSS bridges, *TRANSFORMER models, *PREDICTION models, *WAVELET transforms, *LONG-span bridges
- Abstract
Bridge deflection serves as a vital and intuitive index for the evaluation of bridge safety. Temperature load has the greatest influence on the bridge deformation and studies on the temperature-induced deformation prediction of long-span bridge are in limited numbers. A digital prediction model based on deep learning in minute scale is established to study the bridge deflection caused by temperature. The wavelet transform (WT) is adopted to filter the high-frequency signals of the original deflection caused by the related load factors. Three different networks, long short-term memory (LSTM), bidirectional LSTM (Bi-LSTM), and Transformer variant, are studied and compared in the prediction process. Two different learning strategies considering different input data are also considered to optimize the prediction performance. The proposed prediction model is applied to the temperature induced deflection prediction of a multi-tower double-layer steel truss bridge. The results show that strategy A, which employs temperature time series data as input, is less effective than strategy B. Incorporating both temperature and deflection data as inputs is essential for predicting temperature-induced deflections. Moreover, the Transformer-variant network generally exhibits superior prediction performance compared to the LSTM and Bi-LSTM. The self-attention mechanism of the Transformer allows it to focus on key historical temperature points, thereby enhancing prediction accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2025
13. Transformer tracking algorithm integrating fast edge attention.
- Author
- 薛紫涵, 葛海波, 王淑贤, 安玉, and 杨雨迪
- Subjects
- TRACKING algorithms, FEATURE extraction, PROBLEM solving, MULTILAYER perceptrons
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2025
14. An Intelligent Maneuver Decision-Making Approach for Air Combat Based on Deep Reinforcement Learning and Transformer Networks.
- Author
- Li, Wentao, Fang, Feng, Peng, Dongliang, and Han, Shuning
- Subjects
- *DEEP reinforcement learning, *PROBLEM solving, *TIME series analysis, *DECISION making, *SAMPLING methods
- Abstract
The traditional maneuver decision-making approaches are highly dependent on accurate and complete situation information, and their decision-making quality becomes poor when opponent information is occasionally missing in complex electromagnetic environments. In order to solve this problem, an autonomous maneuver decision-making approach is developed based on deep reinforcement learning (DRL) architecture. Meanwhile, a Transformer network is integrated into the actor and critic networks, which can find the potential dependency relationships among the time series trajectory data. By using these relationships, the information loss is partially compensated, which leads to maneuvering decisions being more accurate. The issues of limited experience samples, low sampling efficiency, and poor stability in the agent training state appear when the Transformer network is introduced into DRL. To address these issues, the measures of designing an effective decision-making reward, a prioritized sampling method, and a dynamic learning rate adjustment mechanism are proposed. Numerous simulation results show that the proposed approach outperforms the traditional DRL algorithms, with a higher win rate in the case of opponent information loss. [ABSTRACT FROM AUTHOR]
- Published
- 2024
15. Enhanced bearing RUL prediction based on dynamic temporal attention and mixed MLP
- Author
- Zhongtian Jin, Chong Chen, Aris Syntetos, and Ying Liu
- Subjects
- Deep learning, Remaining useful life, Prognostic and health management, Transformer network, Electronic computers. Computer science, QA75.5-76.95, Computer engineering. Computer hardware, TK7885-7895
- Abstract
Bearings are critical components in machinery, and accurately predicting their remaining useful life (RUL) is essential for effective predictive maintenance. Traditional RUL prediction methods often rely on manual feature extraction and expert knowledge, which face specific challenges such as handling non-stationary data and avoiding overfitting due to the inclusion of numerous irrelevant features. This paper presents an approach that leverages Continuous Wavelet Transform (CWT) for feature extraction, a Channel-Temporal Mixed MLP (CT-MLP) layer for capturing intricate dependencies, and a dynamic attention mechanism to adjust its focus based on the temporal importance of features within the time series. The dynamic attention mechanism integrates multi-head attention with innovative enhancements, making it particularly effective for datasets exhibiting non-stationary behaviour. An experimental study using the XJTU-SY rolling bearings dataset and the PRONOSTIA bearing dataset revealed that the proposed deep learning algorithm significantly outperforms other state-of-the-art algorithms in terms of RMSE and MAE, demonstrating its robustness and accuracy.
- Published
- 2025
16. Train wheelset bearing damage identification method based on convolution and transformer fusion framework
- Author
- Feiyue DENG, Yulong CAI, Rui WANG, and Shouxi ZHENG
- Subjects
- wheelset bearing, damage identification, convolutional network, transformer network, multi-scale feature, Mining engineering. Metallurgy, TN1-997, Environmental engineering, TA170-171
- Abstract
To address the issues of image feature insensitivity, high requirement of expert experience, and low recognition accuracy of traditional machine vision methods in train wheelset bearing damage detection, this paper proposes an identification method based on the framework of convolutional and transformer fusion networks for identifying damage to train wheelset bearings. First, due to the complexity of train-bearing images, their category imbalance is more severe; an image preprocessing method called image enhancement category reorganization is used to improve the quality of the acquired image dataset and eliminate the effects of the imbalance dataset. Second, a convolutional neural network (CNN) has high model construction and training efficiency due to adopting a local sensing field and weight-sharing strategy, which can only sense local neighborhoods but has limited ability to capture global feature information. Transformer is a network model based on a self-attention mechanism. With strong parallel computing ability, it can learn the remote dependencies between image pixels in the global scope and has a more powerful global information extraction ability. However, the ability to mine the local features of the image is not sufficient. Therefore, this paper presents a VGG and transformer parallel fusion network that integrates the global contour features and local details of the image based on the fusion of convolution and self-attention. Furthermore, a multiscale dilation spatial pyramid convolution (MDSPC) module is constructed to fully mine the multiscale semantic features in the feature map using multiscale dilation convolution progressive fusion. The proposed method effectively solves the problem of feature information loss due to the mesh effect caused by the expansion convolution. 
Additionally, embedding coordinate attention (CA) after the MDSPC module can obtain remote dependencies and more precise positional relationships of feature images from two spatial directions, which can more accurately focus on specific regions in the feature map. Finally, experimental analyses were conducted using the NEU-DET image defect and self-constructed train wheelset bearing image datasets. The experimental results demonstrate that the proposed model has an accuracy of 99.44% and 98% for recognizing six types of defects and four types of images of wheelset bearings in NEU-DET data, respectively. The feature extraction capability of the proposed model was verified using model visualization methods. Compared with existing CNN models, ViT model with self-attention mechanism, and CNN-transformer fusion model, the proposed method shows significantly better evaluation metrics and accurately identifies different types of image samples without significantly increasing the model complexity.
- Published
- 2024
17. An investigation on energy-saving scheduling algorithm of wireless monitoring sensors in oil and gas pipeline networks
- Author
- Zhifeng Ma, Zhanjun Hao, and Zhenya Zhao
- Subjects
- Wireless sensor network, Oil and gas pipeline network, Energy-saving scheduling, Transformer network, Energy efficiency, Energy industries. Energy policy. Fuel trade, HD9502-9502.5
- Abstract
With the rapid development of the oil and gas industry, monitoring the safety and efficiency of pipeline networks has become particularly important. In this context, Wireless Sensor Networks (WSNs) are widely used for monitoring oil and gas pipelines due to their flexible deployment and cost-effectiveness. However, since sensor nodes typically rely on limited battery power, extending the network’s lifecycle and improving energy utilization efficiency have become focal points of research. Therefore, this paper proposes an energy-saving scheduling algorithm based on transformer networks, aimed at optimizing energy consumption and data transmission efficiency of wireless monitoring sensors in oil and gas pipelines. Firstly, this study designs a deep learning-based Transformer model that learns from historical data on energy consumption patterns and environmental variables to predict the energy and data transmission needs of each sensor node. Secondly, based on the prediction results, this algorithm employs a dynamic scheduling strategy that automatically adjusts the sensor’s operational mode and communication frequency according to the node’s energy status and task urgency. Additionally, we have validated the effectiveness of the proposed algorithm through field tests and simulation experiments. According to the experimental results, our model has higher efficiency in energy saving. Compared with Convolutional Neural Networks, Recurrent Neural Networks and Graph Neural Networks, the total energy consumption of sensor networks under the model scheduling in this paper was reduced by 6.7%, 33.4% and 26.3%, respectively. Our algorithms improve the energy efficiency and stability of the monitoring system and provide important technical support for future intelligent pipeline monitoring systems. We hope this paper will inspire future scientific research in this field.
- Published
- 2024
18. Transformer-based intelligent prediction and control of the slurry chamber liquid level in shield tunneling.
- Author
- 卢靖, 李刚, 胡珉, 王伊, and 刘玲玲
- Abstract
Copyright of Tunnel Construction / Suidao Jianshe (Zhong-Yingwen Ban) is the property of Tunnel Construction Editorial Office and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
19. MAC protocol identification method based on Transformer network in cognitive sensor networks.
- Author
- 赵立, 赵宏坚, 高智伟, 王黎明, 刘越, 罗渝, and 廖勇
- Subjects
- TRANSFORMER models, SPECTRUM allocation, COGNITIVE radio, SENSOR networks, TELECOMMUNICATION systems, DEEP learning, FEEDFORWARD neural networks
- Abstract
Copyright of Telecommunication Engineering is the property of Telecommunication Engineering and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
20. An investigation on energy-saving scheduling algorithm of wireless monitoring sensors in oil and gas pipeline networks.
- Author
- Ma, Zhifeng, Hao, Zhanjun, and Zhao, Zhenya
- Subjects
- GRAPH neural networks, TRANSFORMER models, WIRELESS sensor networks, CONVOLUTIONAL neural networks, SENSOR networks, DEEP learning
- Abstract
With the rapid development of the oil and gas industry, monitoring the safety and efficiency of pipeline networks has become particularly important. In this context, Wireless Sensor Networks (WSNs) are widely used for monitoring oil and gas pipelines due to their flexible deployment and cost-effectiveness. However, since sensor nodes typically rely on limited battery power, extending the network's lifecycle and improving energy utilization efficiency have become focal points of research. Therefore, this paper proposes an energy-saving scheduling algorithm based on transformer networks, aimed at optimizing energy consumption and data transmission efficiency of wireless monitoring sensors in oil and gas pipelines. Firstly, this study designs a deep learning-based Transformer model that learns from historical data on energy consumption patterns and environmental variables to predict the energy and data transmission needs of each sensor node. Secondly, based on the prediction results, this algorithm employs a dynamic scheduling strategy that automatically adjusts the sensor's operational mode and communication frequency according to the node's energy status and task urgency. Additionally, we have validated the effectiveness of the proposed algorithm through field tests and simulation experiments. According to the experimental results, our model has higher efficiency in energy saving. Compared with Convolutional Neural Networks, Recurrent Neural Networks and Graph Neural Networks, the total energy consumption of sensor networks under the model scheduling in this paper was reduced by 6.7%, 33.4% and 26.3%, respectively. Our algorithms improve the energy efficiency and stability of the monitoring system and provide important technical support for future intelligent pipeline monitoring systems. We hope this paper will inspire future scientific research in this field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
21. Exploring the Capability of Kernel- and Correlation-Based Learning on PCB Component Segmentation.
- Author
- Al Hasan, Md Mahfuz, Varshney, Nitin, Jessurun, Nathan, Forghani, Reza, and Asadizanjani, Navid
- Subjects
- *ARTIFICIAL neural networks, *IMAGE segmentation, *TRANSFORMER models, *DEEP learning, *PRINTED circuits, *PRINTED circuit design
- Abstract
Due to the continuous increase in the globalized outsourcing of printed circuit board (PCB) fabrication, PCB counterfeits have increased by a significant margin, necessitating rapid and advanced hardware assurance techniques. PCB image segmentation is the primary step in PCB assurance. Over the years, few PCB component segmentation methods have been proposed, and none of those has provided a definite performance benchmark. Besides, those methods have not discussed how the performance is correlated with underlying data or annotation quality. This work presents a PCB image segmentation benchmark. In addition, we explore how annotation quality affects component segmentation and present possible future research directions to work with coarse annotations to alleviate the human effort behind full data annotation tasks. We have analyzed the performance of the preferred deep neural network (DNN) architecture and Transformer architecture with the data annotation quality and presented the direction to leverage the outcome with limited quality annotations. Finally, we present the qualitative as well as the quantitative results to demonstrate the performance of our techniques and provide observations and future research directions on the overall task. [ABSTRACT FROM AUTHOR]
- Published
- 2024
22. CTRNet: An Automatic Modulation Recognition Based on Transformer-CNN Neural Network.
- Author
- Zhang, Wenna, Xue, Kailiang, Yao, Aiqin, and Sun, Yunqiang
- Subjects
- PATTERN recognition systems, RECURRENT neural networks, CONVOLUTIONAL neural networks, DEEP learning, TRANSFORMER models
- Abstract
Deep learning (DL) has brought new perspectives and methods to automatic modulation recognition (AMR), enabling AMR systems to operate more efficiently and reliably in modern wireless communication environments through its powerful feature learning and complex pattern recognition capabilities. However, convolutional neural networks (CNNs) and recurrent neural networks (RNNs), which are used for sequence recognition tasks, face two main challenges, respectively: the ineffective utilization of global information and slow processing speeds due to sequential operations. To address these issues, this paper introduces CTRNet, a novel automatic modulation recognition network that combines a CNN with Transformer. This combination leverages Transformer's ability to adequately capture the long-distance dependencies between global sequences and its advantages in sequence modeling, along with the CNN's capability to extract features from local feature regions of signals. During the data preprocessing stage, the original IQ-modulated signals undergo sliding-window processing. By selecting the appropriate window sizes and strides, multiple subsequences are formed, enabling the network to effectively handle complex modulation patterns. In the embedding module, token vectors are designed to integrate information from multiple samples within each window, enhancing the model's understanding and modeling ability of global information. In the feedforward neural network, a more effective Bilinear layer is employed for processing to capture the higher-order relationship between input features, thereby enhancing the ability of the model to capture complex patterns. Experiments conducted on the RML2016.10A public dataset demonstrate that compared with the existing algorithms, the proposed algorithm not only exhibits significant advantages in terms of parameter efficiency but also achieves higher recognition accuracy under various signal-to-noise ratio (SNR) conditions. 
In particular, it performs relatively well in terms of accuracy, precision, recall, and F1-score, with clearer classification of higher-order modulations and notable overall accuracy improvement. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. Transformer-based correction scheme for short-term bus load prediction in holidays.
- Author
- Tang Ningkai, Lu Jixiang, Chen Tianyu, Shu Jiao, Chang Li, and Chen Tao
- Abstract
Copyright of Journal of Southeast University (English Edition) is the property of Journal of Southeast University Editorial Office and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
24. Data-Driven AI Model for Turbomachinery Compressor Aerodynamics Enabling Rapid Approximation of 3D Flow Solutions.
- Author
- Aulich, Marcel, Goinis, Georgios, and Voß, Christian
- Subjects
- ARTIFICIAL neural networks, ARTIFICIAL intelligence, COMPUTATIONAL fluid dynamics, TRANSFORMER models, CURIOSITY
- Abstract
The development of new turbomachinery designs requires numerous time-consuming and computationally intensive computational fluid dynamics (CFD) calculations. However, most of the generated high spatial resolution data remain unused at later development steps. That is also the case with automated optimization processes that use only a few integral values to determine objectives and constraints. To make further use of this vast amount of CFD data a data-driven AI model based on the Transformer architecture is developed and trained using the available CFD data. The presented method subsequently provides a fast approximation of the 3D flow for new designs. In this paper, the structure of the developed AI model is presented and the approximation quality is analyzed using a complex, state-of-the-art compressor test case. It is shown that the AI model can reproduce many characteristics of the 3D flow of new designs, and performance measures such as efficiency can be derived from these flow predictions. In addition, the complex test case revealed that greater design variation reduces the AI approximation quality which can lead to undesirable exploratory behavior in an optimization setup. Overall, the test case has shown promising results and has provided hints for further improvements to the AI model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
25. Improved Transformer decoding algorithm for dense video captioning.
- Author
- 杨大伟, 盘晓芳, 毛琳, and 张汝波
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
26. Generating Genre-Based Automatic Feedback on English for Research Publication Purposes.
- Author
- Link, Stephanie, Redmon, Robert, Shamsi, Yaser, and Hagan, Martin
- Subjects
- NATURAL language processing, LANGUAGE models, COMPUTER assisted language instruction, ARTIFICIAL intelligence, TRANSFORMER models
- Abstract
Artificial intelligence (AI) for supporting second language (L2) writing processes and practices has garnered increasing interest in recent years, establishing AI-mediated L2 writing as a new norm for many multilingual classrooms. As such, the emergence of AI-mediated technologies has challenged L2 writing instructors and their philosophies regarding computer-assisted language learning (CALL) and teaching. Technologies that can combine principled pedagogical practices and the benefits of AI can help to change the landscape of L2 writing instruction while maintaining the integrity of knowledge production that is so important to CALL instructors. To align L2 instructional practices and CALL technologies, we discuss the development of an AI-mediated L2 writing technology that leverages genre-based instruction (GBI) and large language models to provide L2 writers and instructors with tools to enhance English for research publication purposes. Our work reports on the accuracy, precision, and recall of our network classification, which surpass previously reported research in the field of genre-based automated writing evaluation by offering a faster network training approach with higher accuracy of feedback provision and new beginnings for genre-based learning systems. Implications for tool development and GBI are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Well googled is half done: Multimodal forecasting of new fashion product sales with image‐based google trends.
- Author
-
Skenderi, Geri, Joppi, Christian, Denitto, Matteo, and Cristani, Marco
- Subjects
PRODUCT image ,NEW product development ,TIME series analysis ,FORECASTING ,METADATA ,SALES forecasting - Abstract
New fashion product sales forecasting is a challenging problem that involves many business dynamics and cannot be solved by classical forecasting approaches. In this paper, we investigate the effectiveness of systematically probing exogenous knowledge in the form of Google Trends time series and combining it with multi‐modal information related to a brand‐new fashion item, in order to effectively forecast its sales despite the lack of past data. In particular, we propose a neural network‐based approach, where an encoder learns a representation of the exogenous time series, while the decoder forecasts the sales based on the Google Trends encoding and the available visual and metadata information. Our model works in a non‐autoregressive manner, avoiding the compounding effect of large first‐step errors. As a second contribution, we present VISUELLE, a publicly available dataset for the task of new fashion product sales forecasting, containing multimodal information for 5,577 real, new products sold between 2016 and 2019 from Nunalie, an Italian fast‐fashion company. The dataset is equipped with images of products, metadata, related sales, and associated Google Trends. We use VISUELLE to compare our approach against state‐of‐the‐art alternatives and several baselines, showing that our neural network‐based approach is the most accurate in terms of both percentage and absolute error. It is worth noting that the addition of exogenous knowledge boosts the forecasting accuracy by 1.5% in terms of Weighted Absolute Percentage Error (WAPE), revealing the importance of exploiting informative external information. The code and dataset are both available online (at https://github.com/HumaticsLAB/GTM-Transformer). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Transformer-style convolution network for lightweight image super-resolution
- Author
-
Gendy, Garas and Sabor, Nabil
- Published
- 2025
- Full Text
- View/download PDF
29. Automatic Lip Reading of Persian Words by a Robotic System Using Deep Learning Algorithms
- Author
-
Gholipour, Amir, Mohammadzade, Hoda, Ghadami, Ali, and Taheri, Alireza
- Published
- 2024
- Full Text
- View/download PDF
30. Autism spectrum disorders detection based on multi-task transformer neural network
- Author
-
Le Gao, Zhimin Wang, Yun Long, Xin Zhang, Hexing Su, Yong Yu, and Jin Hong
- Subjects
Autism Spectrum Disorders ,Artificial intelligence ,Biological information ,Multi-task learning ,Transformer network ,Neurosciences. Biological psychiatry. Neuropsychiatry ,RC321-571 ,Neurophysiology and neuropsychology ,QP351-495 - Abstract
Autism Spectrum Disorders (ASD) are neurodevelopmental disorders that cause people difficulties in social interaction and communication. Identifying ASD patients based on resting-state functional magnetic resonance imaging (rs-fMRI) data is a promising diagnostic tool, but challenging due to the complex and unclear etiology of autism. And it is difficult to effectively identify ASD patients with a single data source (single task). Therefore, to address this challenge, we propose a novel multi-task learning framework for ASD identification based on rs-fMRI data, which can leverage useful information from multiple related tasks to improve the generalization performance of the model. Meanwhile, we adopt an attention mechanism to extract ASD-related features from each rs-fMRI dataset, which can enhance the feature representation and interpretability of the model. The results show that our method outperforms state-of-the-art methods in terms of accuracy, sensitivity and specificity. This work provides a new perspective and solution for ASD identification based on rs-fMRI data using multi-task learning. It also demonstrates the potential and value of machine learning for advancing neuroscience research and clinical practice.
- Published
- 2024
- Full Text
- View/download PDF
31. Phishing Webpage Detection via Multi-Modal Integration of HTML DOM Graphs and URL Features Based on Graph Convolutional and Transformer Networks.
- Author
-
Yoon, Jun-Ho, Buu, Seok-Jun, and Kim, Hae-Jung
- Subjects
CONVOLUTIONAL neural networks ,DEEP learning ,TRANSFORMER models ,INTERNET safety ,PHISHING - Abstract
Detecting phishing webpages is a critical task in the field of cybersecurity, with significant implications for online safety and data protection. Traditional methods have primarily relied on analyzing URL features, which can be limited in capturing the full context of phishing attacks. In this study, we propose an innovative approach that integrates HTML DOM graph modeling with URL feature analysis using advanced deep learning techniques. The proposed method leverages Graph Convolutional Networks (GCNs) to model the structure of HTML DOM graphs, combined with Convolutional Neural Networks (CNNs) and Transformer Networks to capture the character and word sequence features of URLs, respectively. These multi-modal features are then integrated using a Transformer network, which is adept at selectively capturing the interdependencies and complementary relationships between different feature sets. We evaluated our approach on a real-world dataset comprising URL and HTML DOM graph data collected from 2012 to 2024. This dataset includes over 80 million nodes and edges, providing a robust foundation for testing. Our method demonstrated a significant improvement in performance, achieving a 7.03 percentage point increase in classification accuracy compared to state-of-the-art techniques. Additionally, we conducted ablation tests to further validate the effectiveness of individual features in our model. The results validate the efficacy of integrating HTML DOM structure and URL features using deep learning. Our framework significantly enhances phishing detection capabilities, providing a more accurate and comprehensive solution to identifying malicious webpages. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Image retrieval of colored spun fabrics based on Transformer network feature fusion.
- Author
-
沈佳忱, 袁 理, 廖海斌, 王 闵, and 郭 旻
- Subjects
CONVOLUTIONAL neural networks ,ARTIFICIAL neural networks ,TRANSFORMER models ,IMAGE retrieval ,IMAGE fusion - Abstract
Copyright of Wool Textile Journal is the property of National Wool Textile Science & Technology Information Center and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
33. DATFNets-dynamic adaptive assigned transformer network for fire detection.
- Author
-
Wang, Zuoxin, Zhao, Xiaohu, and Li, Dunqing
- Subjects
CONVOLUTIONAL neural networks ,FIRE management ,FIRE prevention ,JOB classification - Abstract
Fires cause severe damage to the ecological environment and threaten human life and property. Although the traditional convolutional neural network method effectively detects large-area fires, it cannot capture small fires in complex areas through a limited receptive field. At the same time, fires can change at any time due to the influence of wind direction, which challenges fire prevention and control personnel. To solve these problems, a novel dynamic adaptive distribution transformer detection framework is proposed to help firefighters and researchers develop optimal fire management strategies. On the one hand, this framework embeds a context aggregation layer with a masking strategy in the feature extractor to improve the representation of low-level and salient features. The masking strategy can reduce irrelevant information and improve network generalization. On the other hand, a dynamic adaptive direction conversion function and a sample allocation strategy are designed to make full use of adaptive point representation while achieving accurate positioning and classification of fires and screening out representative fire samples in complex backgrounds. In addition, to prevent the network from being limited to a local optimum and to keep discrete points in the sample from severely interfering with overall performance, a weighted loss function with spatial constraints is designed to optimize the network and penalize the discrete points in the sample. The mAP values on the three baseline datasets FireDets, WildFurgFires, and FireAndSmokes are 0.871, 0.909, and 0.955, respectively. The experimental results are significantly better than those of other detection methods, which proves that the proposed method has good robustness and detection performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Power Quality Transient Disturbance Diagnosis Based on Dynamic Large Convolution Kernel and Multi-Level Feature Fusion Network.
- Author
-
Zheng, Chen, Li, Qionglin, Liu, Shuming, Dai, Shuangyin, Zhang, Bo, and Liu, Yajuan
- Subjects
*POWER quality disturbances , *ELECTRIC transients , *CONVOLUTIONAL neural networks , *TRANSFORMER models , *FEATURE extraction , *KERNEL (Mathematics) - Abstract
Power quality is an important metric for the normal operation of a power system, and the accurate identification of transient signals is of great significance for the improvement of power quality. The diverse types of power system transient signals and their strong characteristic coupling bring new challenges to the analysis and identification of these signals. To enhance the identification accuracy of transient signals, a method of power system transient signal identification is proposed based on a dynamic large convolution kernel and a multilevel feature fusion network. First, finer-grained and more informative features of the transient signals are extracted by the dynamic large convolution kernel feature extraction module. Then, the multi-scale local features are adaptively fused by the multilevel feature fusion module. Finally, the fused features are reduced in dimension by the fully connected layer in the classification module and fed into the SoftMax layer for transient signal type detection. The proposed method can effectively mitigate the small receptive field problem of convolutional neural networks and the limited ability of Transformer networks to extract local context information. Compared with five other power quality transient disturbance identification models, the experimental results show that the proposed method has better diagnostic accuracy and anti-noise capability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Color characterization model of colored fabric based on Transformer network feature fusion.
- Author
-
WU Xinru, YUAN Li, WANG Win, GUO Min, ZHU Lanyan, and WANG Jing
- Subjects
FEATURE extraction ,MULTISENSOR data fusion ,STATISTICAL correlation - Abstract
To address the unique color structure of colored fabric and the limitations of commonly used single color measurement tools, a multi-source heterogeneous data fusion color representation model was established based on a Transformer network. Spectral and texture features were extracted from spectrometer data and image data respectively, a Transformer network was used to fuse the multi-source heterogeneous data features, their complementary characteristics were fully utilized, and the color information of colored fabric was effectively and comprehensively characterized. The results indicate that the color characterization model constructed in this article can effectively characterize the differences in the quality ratio of dyed fibers within a large range of changes and the color changes due to uneven distribution on the surface of dyed fibers. At the same time, under measurement apertures of 6 mm, 10 mm, and 25 mm, the correlation coefficients between the fusion feature difference and the fiber ratio difference are all higher than 85%. Compared with a single spectral feature and a single image feature, the correlation coefficient of this method is improved by more than 10%, indicating good robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. Autism spectrum disorders detection based on multi-task transformer neural network.
- Author
-
Gao, Le, Wang, Zhimin, Long, Yun, Zhang, Xin, Su, Hexing, Yu, Yong, and Hong, Jin
- Subjects
AUTISM spectrum disorders ,TRANSFORMER models ,FUNCTIONAL magnetic resonance imaging ,MACHINE learning ,CLINICAL neurosciences - Abstract
Autism Spectrum Disorders (ASD) are neurodevelopmental disorders that cause people difficulties in social interaction and communication. Identifying ASD patients based on resting-state functional magnetic resonance imaging (rs-fMRI) data is a promising diagnostic tool, but challenging due to the complex and unclear etiology of autism. And it is difficult to effectively identify ASD patients with a single data source (single task). Therefore, to address this challenge, we propose a novel multi-task learning framework for ASD identification based on rs-fMRI data, which can leverage useful information from multiple related tasks to improve the generalization performance of the model. Meanwhile, we adopt an attention mechanism to extract ASD-related features from each rs-fMRI dataset, which can enhance the feature representation and interpretability of the model. The results show that our method outperforms state-of-the-art methods in terms of accuracy, sensitivity and specificity. This work provides a new perspective and solution for ASD identification based on rs-fMRI data using multi-task learning. It also demonstrates the potential and value of machine learning for advancing neuroscience research and clinical practice. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Activated Sparsely Sub-Pixel Transformer for Remote Sensing Image Super-Resolution.
- Author
-
Guo, Yongde, Gong, Chengying, and Yan, Jun
- Subjects
*TRANSFORMER models , *HIGH resolution imaging , *FEATURE selection , *CODING theory - Abstract
Transformers have recently achieved significant breakthroughs in various visual tasks. However, these methods often overlook the optimization of interactions between convolution and transformer blocks. Although the basic attention module strengthens the feature selection ability, it is still weak in generating superior quality output. In order to address this challenge, we propose the integration of sub-pixel space and the application of sparse coding theory in the calculation of self-attention. This approach aims to enhance the network's generation capability, leading to the development of a sparse-activated sub-pixel transformer network (SSTNet). The experimental results show that compared with several state-of-the-art methods, our proposed network can obtain better generation results, improving the sharpness of object edges and the richness of detail texture information in super-resolution generated images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. Enhancing Automatic Modulation Recognition for IoT Applications Using Transformers.
- Author
-
Rashvand, Narges, Witham, Kenneth, Maldonado, Gabriel, Katariya, Vinit, Marer Prabhu, Nishanth, Schirner, Gunar, and Tabkhi, Hamed
- Subjects
INTERNET of things ,TRANSFORMER models ,NATURAL language processing ,EDGE computing ,DEEP learning - Abstract
Automatic modulation recognition (AMR) is vital for accurately identifying modulation types within incoming signals, a critical task for optimizing operations within edge devices in IoT ecosystems. This paper presents an innovative approach that leverages Transformer networks, initially designed for natural language processing, to address the challenges of efficient AMR. Our Transformer network architecture is designed with the mindset of real-time edge computing on IoT devices. Four tokenization techniques are proposed and explored for creating proper embeddings of RF signals, specifically focusing on overcoming the limitations related to the model size often encountered in IoT scenarios. Extensive experiments reveal that our proposed method outperformed advanced deep learning techniques, achieving the highest recognition accuracy. Notably, our model achieved an accuracy of 65.75 on the RML2016 and 65.80 on the CSPB.ML.2018+ dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Energy Consumption Prediction Method for Refrigeration Systems Based on Adversarial Networks and Transformer Networks
- Author
-
Zhang, Hu, Liu, Huifeng, Zhang, Youli, Guo, Ying, Dai, Hongjun, Shao, Minghao, Xu, Hongyu, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Cao, Cungeng, editor, Chen, Huajun, editor, Zhao, Liang, editor, Arshad, Junaid, editor, Asyhari, Taufiq, editor, and Wang, Yonghao, editor
- Published
- 2024
- Full Text
- View/download PDF
40. Transformation Network Model for Ear Recognition
- Author
-
Booysens, Aimee, Viriri, Serestina, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, van Leeuwen, Jan, Series Editor, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Kobsa, Alfred, Series Editor, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Nierstrasz, Oscar, Series Editor, Pandu Rangan, C., Editorial Board Member, Sudan, Madhu, Series Editor, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Weikum, Gerhard, Series Editor, Vardi, Moshe Y, Series Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Renault, Éric, editor, Boumerdassi, Selma, editor, and Mühlethaler, Paul, editor
- Published
- 2024
- Full Text
- View/download PDF
41. Compact Convolutional Transformer for Bearing Remaining Useful Life Prediction
- Author
-
Jin, Zhongtian, Chen, Chong, Liu, Qingtao, Syntetos, Aris, Liu, Ying, Chaari, Fakher, Series Editor, Gherardini, Francesco, Series Editor, Ivanov, Vitalii, Series Editor, Haddar, Mohamed, Series Editor, Cavas-Martínez, Francisco, Editorial Board Member, di Mare, Francesca, Editorial Board Member, Kwon, Young W., Editorial Board Member, Tolio, Tullio A. M., Editorial Board Member, Trojanowska, Justyna, Editorial Board Member, Schmitt, Robert, Editorial Board Member, Xu, Jinyang, Editorial Board Member, Fera, Marcello, editor, Caterino, Mario, editor, Macchiaroli, Roberto, editor, and Pham, Duc Truong, editor
- Published
- 2024
- Full Text
- View/download PDF
42. Learning Bottleneck Transformer for Event Image-Voxel Feature Fusion Based Classification
- Author
-
Yuan, Chengguo, Jin, Yu, Wu, Zongzhen, Wei, Fanting, Wang, Yangzirui, Chen, Lan, Wang, Xiao, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Liu, Qingshan, editor, Wang, Hanzi, editor, Ma, Zhanyu, editor, Zheng, Weishi, editor, Zha, Hongbin, editor, Chen, Xilin, editor, Wang, Liang, editor, and Ji, Rongrong, editor
- Published
- 2024
- Full Text
- View/download PDF
43. Multi-step prediction of offshore wind power based on Transformer network and Huber loss
- Author
-
Haoyi Xiao, Xiaoxia He, and Chunli Li
- Subjects
Offshore wind power prediction ,Transformer network ,Huber loss function ,Autoencoder ,Slime mould optimization algorithm ,Multi-step prediction ,Production of electric energy or power. Powerplants. Central stations ,TK1001-1841 - Abstract
In the context of the burgeoning expansion of renewable energy sources, the precise prediction of offshore wind power assumes a pivotal role in safeguarding the reliability, economic viability, and sustainable progression of offshore wind farms. The present study introduces a novel methodology for offshore wind power prediction, predicated upon the synergy of the Transformer network and Huber loss function. Empirical validation is conducted utilizing authentic data from a European offshore wind farm. The resulting analyses delineate a discernible superiority of the Transformer network over classical LSTM and GRU models in capturing the intricate long-term dependencies intrinsic to the time series. Furthermore, the inclusion of the Huber loss function effectively mitigates the challenges posed by the high volatility often characteristic of offshore wind power data. The study also demonstrates the beneficial integration of autoencoder reconstruction for denoising and slime mould optimization algorithm to augment prediction performance. Distinctively diverging from traditional single-step prediction paradigms, the multi-step prediction model constructed within this research offers a more comprehensive and precise prediction of wind power. Such an innovative approach represents a valuable contribution to the field, with tangible implications for the dependable operation and future advancement of wind power.
- Published
- 2024
- Full Text
- View/download PDF
44. CAT-Net: A Cross-Slice Attention Transformer Model for Prostate Zonal Segmentation in MRI
- Author
-
Hung, Alex Ling Yu, Zheng, Haoxin, Miao, Qi, Raman, Steven S, Terzopoulos, Demetri, and Sung, Kyunghyun
- Subjects
Information and Computing Sciences ,Biomedical Imaging ,Prostate Cancer ,Cancer ,Urologic Diseases ,Aging ,Humans ,Male ,Prostate ,Image Processing ,Computer-Assisted ,Magnetic Resonance Imaging ,Prostatic Neoplasms ,Pelvis ,Image segmentation ,Transformers ,Three-dimensional displays ,Magnetic resonance imaging ,Standards ,Image resolution ,Decoding ,Attention mechanism ,deep learning ,magnetic resonance imaging ,prostate zonal segmentation ,transformer network ,Engineering ,Nuclear Medicine & Medical Imaging ,Information and computing sciences - Abstract
Prostate cancer is the second leading cause of cancer death among men in the United States. The diagnosis of prostate MRI often relies on accurate prostate zonal segmentation. However, state-of-the-art automatic segmentation methods often fail to produce well-contained volumetric segmentation of the prostate zones since certain slices of prostate MRI, such as base and apex slices, are harder to segment than other slices. This difficulty can be overcome by leveraging important multi-scale image-based information from adjacent slices, but current methods do not fully learn and exploit such cross-slice information. In this paper, we propose a novel cross-slice attention mechanism, which we use in a Transformer module to systematically learn cross-slice information at multiple scales. The module can be utilized in any existing deep-learning-based segmentation framework with skip connections. Experiments show that our cross-slice attention is able to capture cross-slice information significant for prostate zonal segmentation in order to improve the performance of current state-of-the-art methods. Cross-slice attention improves segmentation accuracy in the peripheral zones, such that segmentation results are consistent across all the prostate slices (apex, mid-gland, and base). The code for the proposed model is available at https://bit.ly/CAT-Net.
- Published
- 2023
45. DATFNets-dynamic adaptive assigned transformer network for fire detection
- Author
-
Zuoxin Wang, Xiaohu Zhao, and Dunqing Li
- Subjects
Fire detection ,Fire management ,Transformer network ,Global contextual semantics ,Feature extractor ,Spatial constraints ,Electronic computers. Computer science ,QA75.5-76.95 ,Information technology ,T58.5-58.64 - Abstract
Fires cause severe damage to the ecological environment and threaten human life and property. Although the traditional convolutional neural network method effectively detects large-area fires, it cannot capture small fires in complex areas through a limited receptive field. At the same time, fires can change at any time due to the influence of wind direction, which challenges fire prevention and control personnel. To solve these problems, a novel dynamic adaptive distribution transformer detection framework is proposed to help firefighters and researchers develop optimal fire management strategies. On the one hand, this framework embeds a context aggregation layer with a masking strategy in the feature extractor to improve the representation of low-level and salient features. The masking strategy can reduce irrelevant information and improve network generalization. On the other hand, a dynamic adaptive direction conversion function and a sample allocation strategy are designed to make full use of adaptive point representation while achieving accurate positioning and classification of fires and screening out representative fire samples in complex backgrounds. In addition, to prevent the network from being limited to a local optimum and to keep discrete points in the sample from severely interfering with overall performance, a weighted loss function with spatial constraints is designed to optimize the network and penalize the discrete points in the sample. The mAP values on the three baseline datasets FireDets, WildFurgFires, and FireAndSmokes are 0.871, 0.909, and 0.955, respectively. The experimental results are significantly better than those of other detection methods, which proves that the proposed method has good robustness and detection performance.
- Published
- 2024
- Full Text
- View/download PDF
46. Enhancing Automatic Modulation Recognition for IoT Applications Using Transformers
- Author
-
Narges Rashvand, Kenneth Witham, Gabriel Maldonado, Vinit Katariya, Nishanth Marer Prabhu, Gunar Schirner, and Hamed Tabkhi
- Subjects
automatic modulation recognition ,deep learning ,attention mechanism ,Transformer network ,IoT ,Computer software ,QA76.75-76.765 ,Technology ,Cybernetics ,Q300-390 - Abstract
Automatic modulation recognition (AMR) is vital for accurately identifying modulation types within incoming signals, a critical task for optimizing operations within edge devices in IoT ecosystems. This paper presents an innovative approach that leverages Transformer networks, initially designed for natural language processing, to address the challenges of efficient AMR. Our Transformer network architecture is designed with the mindset of real-time edge computing on IoT devices. Four tokenization techniques are proposed and explored for creating proper embeddings of RF signals, specifically focusing on overcoming the limitations related to the model size often encountered in IoT scenarios. Extensive experiments reveal that our proposed method outperformed advanced deep learning techniques, achieving the highest recognition accuracy. Notably, our model achieved an accuracy of 65.75 on the RML2016 and 65.80 on the CSPB.ML.2018+ dataset.
- Published
- 2024
- Full Text
- View/download PDF
47. Shots segmentation-based optimized dual-stream framework for robust human activity recognition in surveillance video
- Author
-
Altaf Hussain, Samee Ullah Khan, Noman Khan, Waseem Ullah, Ahmed Alkhayyat, Meshal Alharbi, and Sung Wook Baik
- Subjects
Activity Recognition ,Video Classification ,Surveillance System ,Lowlight Image Enhancement ,Dual Stream Network ,Transformer Network ,Engineering (General). Civil engineering (General) ,TA1-2040 - Abstract
Nowadays, for controlling crime, surveillance cameras are typically installed in all public places to ensure urban safety and security. However, automating Human Activity Recognition (HAR) using computer vision techniques faces several challenges such as low lighting, complex spatiotemporal features, cluttered backgrounds, and inefficient utilization of surveillance system resources. Existing attempts in HAR designed straightforward networks by analyzing either spatial or motion patterns, resulting in limited performance, while the dual stream methods are entirely based on Convolutional Neural Networks (CNN) that are inadequate for learning the long-range temporal information for HAR. To overcome the above-mentioned challenges, this paper proposes an optimized dual stream framework for HAR which mainly consists of three steps. First, a shots segmentation module is introduced in the proposed framework to efficiently utilize the surveillance system resources by enhancing the lowlight video stream and then detecting salient video frames that contain humans. This module is trained on our own challenging Lowlight Human Surveillance Dataset (LHSD), which consists of both normal and different levels of low-light data to recognize humans in complex uncertain environments. Next, to learn HAR from both contextual and motion information, a dual stream approach is used in the feature extraction. In the first stream, it freezes the learned weights of the backbone Vision Transformer (ViT) B-16 model to select the discriminative contextual information. In the second stream, ViT features are then fused with the intermediate encoder layers of the FlowNet2 model for optical flow to extract a robust motion feature vector. Finally, a two stream Parallel Bidirectional Long Short-Term Memory (PBiLSTM) is proposed for sequence learning to capture the global semantics of activities, followed by Dual Stream Multi-Head Attention (DSMHA) with a late fusion strategy to optimize the huge feature vector for accurate HAR. To assess the strength of the proposed framework, extensive experiments are conducted on real-world surveillance scenarios and various benchmark HAR datasets, achieving 78.6285%, 96.0151%, and 98.875% accuracies on HMDB51, UCF101, and YouTube Action, respectively. Our results show that the proposed strategy outperforms State-of-the-Art (SOTA) methods. The proposed framework gives superior performance in HAR, providing accurate and reliable recognition of human activities in surveillance systems.
- Published
- 2024
- Full Text
- View/download PDF
48. Spatio-Temporal Feature Aware Vision Transformers for Real-Time Unmanned Aerial Vehicle Tracking
- Author
-
Hao Zhang, Hengzhou Ye, Xiaoyu Guo, Xu Zhang, Yao Rong, and Shuiwang Li
- Subjects
UAV tracking ,temporal relationships ,spatial neighborhood feature extraction ,transformer network ,real-time tracking ,Motor vehicles. Aeronautics. Astronautics ,TL1-4050 - Abstract
Driven by the rapid advancement of Unmanned Aerial Vehicle (UAV) technology, the field of UAV object tracking has witnessed significant progress. This study introduces an innovative single-stream UAV tracking architecture, dubbed NT-Track, which is dedicated to enhancing the efficiency and accuracy of real-time tracking tasks. Addressing the shortcomings of existing tracking systems in capturing temporal relationships between consecutive frames, NT-Track meticulously analyzes the positional changes in targets across frames and leverages the similarity of the surrounding areas to extract feature information. Furthermore, our method integrates spatial and temporal information seamlessly into a unified framework through the introduction of a temporal feature fusion technique, thereby bolstering the overall performance of the model. NT-Track also incorporates a spatial neighborhood feature extraction module, which focuses on identifying and extracting features within the neighborhood of the target in each frame, ensuring continuous focus on the target during inter-frame processing. By employing an improved Transformer backbone network, our approach effectively integrates spatio-temporal information, enhancing the accuracy and robustness of tracking. Our experimental results on several challenging benchmark datasets demonstrate that NT-Track surpasses existing lightweight and deep learning trackers in terms of precision and success rate. It is noteworthy that, on the VisDrone2018 benchmark, NT-Track achieved a precision rate of 90% for the first time, an accomplishment that not only showcases its exceptional performance in complex environments, but also confirms its potential and effectiveness in practical applications.
- Published
- 2025
- Full Text
- View/download PDF
49. TiDEFormer—a heterogenous stacking ensemble approach for time series forecasting of COVID-19 prevalence
- Author
-
Prakash, Satya, Jalal, Anand Singh, and Pathak, Pooja
- Published
- 2024
- Full Text
- View/download PDF
50. Development of optimized cascaded LSTM with Seq2seqNet and transformer net for aspect-based sentiment analysis framework.
- Author
-
Ramasamy, Mekala and Elangovan, Mohanraj
- Abstract
The recent development of communication technologies made it possible for people to share opinions on various social media platforms. The opinion of the people is converted into small-sized textual data. Aspect Based Sentiment Analysis (ABSA) is a process used by businesses and other organizations to assess these textual data in order to comprehend people’s opinions about the services or products offered by them. The majority of earlier Sentiment Analysis (SA) research uses lexicons, word frequencies, or black box techniques to obtain the sentiment in the text. It should be highlighted that these methods disregard the relationships and interdependence between words in terms of semantics. Hence, an efficient ABSA framework to determine the sentiment from the textual reviews of the customers is developed in this work. Initially, the raw text review data is collected from the standard benchmark datasets. The gathered text reviews undergo text pre-processing to neglect the unwanted words and characters from the input text document. The pre-processed data is directly provided to the feature extraction phase in which the seq2seq network and transformer network are employed. Further, the optimal features from the two resultant features are chosen by utilizing the proposed Modified Bird Swarm-Ladybug Beetle Optimization (MBS-LBO). After obtaining optimal features, these features are fused together and given to the final detection model. Consequently, the Optimized Cascaded Long Short Term Memory (OCas-LSTM) is proposed for predicting the sentiments from the given review by the users. Here, the parameters are tuned optimally by the MBS-LBO algorithm, and also it is utilized for enhancing the performance rate. The experimental evaluation is made to reveal the excellent performance of the developed SA model by contrasting it with conventional models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF