Descriptor: "Transformer Network" - Searchworks@Jio Institute Digital Library Search Results

Your search for "Transformer Network" returned 280 results

1. Multi-task transformer network for subject-independent iEEG seizure detection

Author: Sun, Yulin, Cheng, Longlong, Si, Xiaopeng, He, Runnan, Pereira, Tania, Pang, Meijun, Zhang, Kuo, Song, Xin, Ming, Dong, and Liu, Xiuyun
Published: 2025
Full Text: View/download PDF

2. Remaining useful life prediction for stratospheric airships based on a channel and temporal attention network

Author: Luo, Yuzhao, Zhu, Ming, Chen, Tian, and Zheng, Zewei
Published: 2025
Full Text: View/download PDF

3. Deep probabilistic solar power forecasting with Transformer and Gaussian process approximation

Author: Xiong, Binyu, Chen, Yuntian, Chen, Dali, Fu, Jun, and Zhang, Dongxiao
Published: 2025
Full Text: View/download PDF

4. Masked facial expression recognition based on temporal overlap module and action unit graph convolutional network

Author: Zhang, Zheyuan, Liu, Bingtong, Zhou, Ju, Wang, Hanpu, Liu, Xinyu, Lin, Bing, and Chen, Tong
Published: 2025
Full Text: View/download PDF

5. Contextual visual and motion salient fusion framework for action recognition in dark environments

Author: Munsif, Muhammad, Khan, Samee Ullah, Khan, Noman, Hussain, Altaf, Kim, Min Je, and Baik, Sung Wook
Published: 2024
Full Text: View/download PDF

6. Detecting severity of Diabetic Retinopathy from fundus images: A transformer network-based review

Author: Karkera, Tejas, Adak, Chandranath, Chattopadhyay, Soumi, and Saqib, Muhammad
Published: 2024
Full Text: View/download PDF

7. RGB-T-UV Multi-modal Object Tracking Based on Transformer Network

Author: Song, Qinghua, Wang, Xiaolei, Zhang, Yi, Hu, Jinping, and Liu, Yu
Editor: Goos, Gerhard (Series Editor), Hartmanis, Juris (Founding Editor), Bertino, Elisa (Editorial Board Member), Gao, Wen (Editorial Board Member), Steffen, Bernhard (Editorial Board Member), Yung, Moti (Editorial Board Member), Bebis, George, Patel, Vishal, Gu, Jinwei, Panetta, Julian, Gingold, Yotam, Johnsen, Kyle, Arefin, Mohammed Safayet, Dutta, Soumya, and Biswas, Ayan
Published: 2025
Full Text: View/download PDF

8. A Novel CNN-Transformer Capacity Estimation Model for Real-World Lithium-Ion Battery Pack

Author: Soo, Yin-Yi, Wang, Yujie, and Xiang, Haoxiang
Editor: Li, Gang (Series Editor), Filipe, Joaquim (Series Editor), Xu, Zhiwei (Series Editor), Li, Kang, Liu, Kailong, Hu, Yukun, Tan, Mao, Zhang, Long, and Yang, Zhile
Published: 2025
Full Text: View/download PDF

9. Identification of strong motion record baseline drift based on Bayesian-optimized Transformer network.

Author: Zhou, Baofeng, Yin, Yue, Wang, Maofa, Zhang, Runjie, Zhang, Yue, and Guo, Wenheng
Subjects: *CHI-chi Earthquake, Taiwan, 1999, *TRANSFORMER models, *FEATURE extraction, *EMERGENCY management, *ARTIFICIAL intelligence, *NATURAL disaster warning systems
Abstract: Research in earthquake engineering heavily relies on strong motion observation. The quality of strong motion records directly affects the reliability of earthquake disaster prevention, rapid reporting of seismic magnitude, earthquake early warning, and other areas. Currently, basic mathematical methods, such as zero-line adjustment and filtering, are commonly employed to ensure the quality of strong motion records. However, these methods often rely on subjective judgment based on human experience when dealing with abnormal waveforms in strong motion records, leading to relatively low efficiency. To address this challenge, this paper proposes an innovative Transformer model based on Bayesian optimization to efficiently identify baseline drift anomalies in strong motion records. The strong motion record data from the 1999 Chi-Chi earthquake in Taiwan, China, were partitioned into two categories: high-quality records (with minimal baseline drift) and low-quality records (with significant baseline drift). Data with distinct features were then extracted and input into the proposed model for training. Finally, the model was used to predict whether strong motion records exhibited baseline drift abnormalities. The experimental results show that the optimized Transformer model exceeds 85% on key evaluation metrics such as accuracy and F1 score. It is capable of efficiently identifying a substantial volume of strong motion records with baseline drift within a short period of time. The model effectively performs the baseline drift classification task for strong motion records and can be used for subsequent identification of abnormalities after baseline drift correction, enabling automation in handling abnormal data related to baseline drift. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

10. Character-level inclusive transformer architecture for information gain in low resource code-mixed language.

Author: Bhowmick, Rajat Subhra, Ganguli, Isha, and Sil, Jaya
Subjects: *LOW-resource languages, *TRANSFORMER models, *PROGRAMMING languages, *DEEP learning, *LEARNING strategies
Abstract: The use of code-mixed languages on social media platforms is a very common way to communicate informally and has immense importance in a multilingual society like India. Implementing various NLP tasks on code-mixed language for machine comprehension and NLP applications is the need of the hour. The implementation of complex learning models is difficult due to the scarcity of available code-mixed resources. Designing more effective architectures that learn from low-resource datasets, along with transfer learning settings, is a possible solution. We propose an improvised transformer network (Character Inclusion Transformer) that utilizes and learns character-level information available in the words of code-mixed sentences. The proposed model improves the performance of the transformer model when trained from scratch using low-resource code-mixed datasets. We also propose two more architecture settings, useful for a transfer learning strategy using the mBERT pre-trained model. Three basic word-level tagging NLP tasks, i.e., NER, POS Tagging, and Language Identification (LID), are considered in the paper, where Language Identification is specific to code-mixed language. Six separate datasets, namely IIITH NER, LID FIRE, LID ICON, LID UD, POS ICON, and POS UD, have been tested, and results are reported using weighted and macro averages of precision, recall, and F1 score. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

11. Enhanced bearing RUL prediction based on dynamic temporal attention and mixed MLP.

Author: Jin, Zhongtian, Chen, Chong, Syntetos, Aris, and Liu, Ying
Subjects: REMAINING useful life, MACHINE learning, ARTIFICIAL intelligence, ROLLER bearings, IMAGE processing, DEEP learning
Abstract: Bearings are critical components in machinery, and accurately predicting their remaining useful life (RUL) is essential for effective predictive maintenance. Traditional RUL prediction methods often rely on manual feature extraction and expert knowledge, which face specific challenges such as handling non-stationary data and avoiding overfitting due to the inclusion of numerous irrelevant features. This paper presents an approach that leverages Continuous Wavelet Transform (CWT) for feature extraction, a Channel-Temporal Mixed MLP (CT-MLP) layer for capturing intricate dependencies, and a dynamic attention mechanism to adjust its focus based on the temporal importance of features within the time series. The dynamic attention mechanism integrates multi-head attention with innovative enhancements, making it particularly effective for datasets exhibiting non-stationary behaviour. An experimental study using the XJTU-SY rolling bearings dataset and the PRONOSTIA bearing dataset revealed that the proposed deep learning algorithm significantly outperforms other state-of-the-art algorithms in terms of RMSE and MAE, demonstrating its robustness and accuracy. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

12. Deep learning-based minute-scale digital prediction model for temperature induced deflection of a multi-tower double-layer steel truss bridge.

Author: Meng, Lingxin, Sun, Bo, Dang, Yingjie, Shen, Lizhong, and Zhuang, Yizhou
Subjects: *LONG short-term memory, *TRUSS bridges, *TRANSFORMER models, *PREDICTION models, *WAVELET transforms, *LONG-span bridges
Abstract: Bridge deflection serves as a vital and intuitive index for the evaluation of bridge safety. Temperature load has the greatest influence on the bridge deformation and studies on the temperature-induced deformation prediction of long-span bridge are in limited numbers. A digital prediction model based on deep learning in minute scale is established to study the bridge deflection caused by temperature. The wavelet transform (WT) is adopted to filter the high-frequency signals of the original deflection caused by the related load factors. Three different networks, long short-term memory (LSTM), bidirectional LSTM (Bi-LSTM), and Transformer variant, are studied and compared in the prediction process. Two different learning strategies considering different input data are also considered to optimize the prediction performance. The proposed prediction model is applied to the temperature induced deflection prediction of a multi-tower double-layer steel truss bridge. The results show that strategy A, which employs temperature time series data as input, is less effective than strategy B. Incorporating both temperature and deflection data as inputs is essential for predicting temperature-induced deflections. Moreover, the Transformer-variant network generally exhibits superior prediction performance compared to the LSTM and Bi-LSTM. The self-attention mechanism of the Transformer allows it to focus on key historical temperature points, thereby enhancing prediction accuracy. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

13. Transformer tracking algorithm fusing fast edge attention.

Author: 薛紫涵, 葛海波, 王淑贤, 安玉, and 杨雨迪
Subjects: TRACKING algorithms, FEATURE extraction, PROBLEM solving, MULTILAYER perceptrons
Abstract: Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Published: 2025
Full Text: View/download PDF

14. An Intelligent Maneuver Decision-Making Approach for Air Combat Based on Deep Reinforcement Learning and Transformer Networks.

Author: Li, Wentao, Fang, Feng, Peng, Dongliang, and Han, Shuning
Subjects: *DEEP reinforcement learning, *PROBLEM solving, *TIME series analysis, *DECISION making, *SAMPLING methods
Abstract: The traditional maneuver decision-making approaches are highly dependent on accurate and complete situation information, and their decision-making quality becomes poor when opponent information is occasionally missing in complex electromagnetic environments. In order to solve this problem, an autonomous maneuver decision-making approach is developed based on deep reinforcement learning (DRL) architecture. Meanwhile, a Transformer network is integrated into the actor and critic networks, which can find the potential dependency relationships among the time series trajectory data. By using these relationships, the information loss is partially compensated, which leads to maneuvering decisions being more accurate. The issues of limited experience samples, low sampling efficiency, and poor stability in the agent training state appear when the Transformer network is introduced into DRL. To address these issues, the measures of designing an effective decision-making reward, a prioritized sampling method, and a dynamic learning rate adjustment mechanism are proposed. Numerous simulation results show that the proposed approach outperforms the traditional DRL algorithms, with a higher win rate in the case of opponent information loss. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

15. Enhanced bearing RUL prediction based on dynamic temporal attention and mixed MLP

Author: Zhongtian Jin, Chong Chen, Aris Syntetos, and Ying Liu
Subjects: Deep learning, Remaining useful life, Prognostic and health management, Transformer network, Electronic computers. Computer science, QA75.5-76.95, Computer engineering. Computer hardware, TK7885-7895
Abstract: Bearings are critical components in machinery, and accurately predicting their remaining useful life (RUL) is essential for effective predictive maintenance. Traditional RUL prediction methods often rely on manual feature extraction and expert knowledge, which face specific challenges such as handling non-stationary data and avoiding overfitting due to the inclusion of numerous irrelevant features. This paper presents an approach that leverages Continuous Wavelet Transform (CWT) for feature extraction, a Channel-Temporal Mixed MLP (CT-MLP) layer for capturing intricate dependencies, and a dynamic attention mechanism to adjust its focus based on the temporal importance of features within the time series. The dynamic attention mechanism integrates multi-head attention with innovative enhancements, making it particularly effective for datasets exhibiting non-stationary behaviour. An experimental study using the XJTU-SY rolling bearings dataset and the PRONOSTIA bearing dataset revealed that the proposed deep learning algorithm significantly outperforms other state-of-the-art algorithms in terms of RMSE and MAE, demonstrating its robustness and accuracy.
Published: 2025
Full Text: View/download PDF

16. Train wheelset bearing damage identification method based on convolution and transformer fusion framework

Author: Feiyue DENG, Yulong CAI, Rui WANG, and Shouxi ZHENG
Subjects: wheelset bearing, damage identification, convolutional network, transformer network, multi-scale feature, Mining engineering. Metallurgy, TN1-997, Environmental engineering, TA170-171
Abstract: To address the issues of image feature insensitivity, high requirement of expert experience, and low recognition accuracy of traditional machine vision methods in train wheelset bearing damage detection, this paper proposes an identification method based on the framework of convolutional and transformer fusion networks for identifying damage to train wheelset bearings. First, due to the complexity of train-bearing images, their category imbalance is more severe; an image preprocessing method called image enhancement category reorganization is used to improve the quality of the acquired image dataset and eliminate the effects of the imbalance dataset. Second, a convolutional neural network (CNN) has high model construction and training efficiency due to adopting a local sensing field and weight-sharing strategy, which can only sense local neighborhoods but has limited ability to capture global feature information. Transformer is a network model based on a self-attention mechanism. With strong parallel computing ability, it can learn the remote dependencies between image pixels in the global scope and has a more powerful global information extraction ability. However, the ability to mine the local features of the image is not sufficient. Therefore, this paper presents a VGG and transformer parallel fusion network that integrates the global contour features and local details of the image based on the fusion of convolution and self-attention. Furthermore, a multiscale dilation spatial pyramid convolution (MDSPC) module is constructed to fully mine the multiscale semantic features in the feature map using multiscale dilation convolution progressive fusion. The proposed method effectively solves the problem of feature information loss due to the mesh effect caused by the expansion convolution. 
Additionally, embedding coordinate attention (CA) after the MDSPC module can obtain remote dependencies and more precise positional relationships of feature images from two spatial directions, which can more accurately focus on specific regions in the feature map. Finally, experimental analyses were conducted using the NEU-DET image defect and self-constructed train wheelset bearing image datasets. The experimental results demonstrate that the proposed model has an accuracy of 99.44% and 98% for recognizing six types of defects and four types of images of wheelset bearings in NEU-DET data, respectively. The feature extraction capability of the proposed model was verified using model visualization methods. Compared with existing CNN models, ViT model with self-attention mechanism, and CNN-transformer fusion model, the proposed method shows significantly better evaluation metrics and accurately identifies different types of image samples without significantly increasing the model complexity.
Published: 2024
Full Text: View/download PDF

17. An investigation on energy-saving scheduling algorithm of wireless monitoring sensors in oil and gas pipeline networks

Author: Zhifeng Ma, Zhanjun Hao, and Zhenya Zhao
Subjects: Wireless sensor network, Oil and gas pipeline network, Energy-saving scheduling, Transformer network, Energy efficiency, Energy industries. Energy policy. Fuel trade, HD9502-9502.5
Abstract: With the rapid development of the oil and gas industry, monitoring the safety and efficiency of pipeline networks has become particularly important. In this context, Wireless Sensor Networks (WSNs) are widely used for monitoring oil and gas pipelines due to their flexible deployment and cost-effectiveness. However, since sensor nodes typically rely on limited battery power, extending the network’s lifecycle and improving energy utilization efficiency have become focal points of research. Therefore, this paper proposes an energy-saving scheduling algorithm based on transformer networks, aimed at optimizing energy consumption and data transmission efficiency of wireless monitoring sensors in oil and gas pipelines. Firstly, this study designs a deep learning-based Transformer model that learns from historical data on energy consumption patterns and environmental variables to predict the energy and data transmission needs of each sensor node. Secondly, based on the prediction results, this algorithm employs a dynamic scheduling strategy that automatically adjusts the sensor’s operational mode and communication frequency according to the node’s energy status and task urgency. Additionally, we have validated the effectiveness of the proposed algorithm through field tests and simulation experiments. According to the experimental results, our model has higher efficiency in energy saving. Compared with Convolutional Neural Networks, Recurrent Neural Networks and Graph Neural Networks, the total energy consumption of sensor networks under the model scheduling in this paper was reduced by 6.7%, 33.4% and 26.3%, respectively. Our algorithms improve the energy efficiency and stability of the monitoring system and provide important technical support for future intelligent pipeline monitoring systems. We hope this paper will inspire future scientific research in this field.
Published: 2024
Full Text: View/download PDF

18. Transformer-based intelligent prediction and control of the slurry chamber liquid level in shield tunneling.

Author: 卢靖, 李刚, 胡珉, 王伊, and 刘玲玲
Abstract: Copyright of Tunnel Construction / Suidao Jianshe (Zhong-Yingwen Ban) is the property of Tunnel Construction Editorial Office.
Published: 2024
Full Text: View/download PDF

19. MAC protocol identification method based on a Transformer network in cognitive sensor networks.

Author: 赵立, 赵宏坚, 高智伟, 王黎明, 刘越, 罗渝, and 廖勇
Subjects: TRANSFORMER models, SPECTRUM allocation, COGNITIVE radio, SENSOR networks, TELECOMMUNICATION systems, DEEP learning, FEEDFORWARD neural networks
Abstract: Copyright of Telecommunication Engineering is the property of Telecommunication Engineering.
Published: 2024
Full Text: View/download PDF

20. An investigation on energy-saving scheduling algorithm of wireless monitoring sensors in oil and gas pipeline networks.

Author: Ma, Zhifeng, Hao, Zhanjun, and Zhao, Zhenya
Subjects: GRAPH neural networks, TRANSFORMER models, WIRELESS sensor networks, CONVOLUTIONAL neural networks, SENSOR networks, DEEP learning
Abstract: With the rapid development of the oil and gas industry, monitoring the safety and efficiency of pipeline networks has become particularly important. In this context, Wireless Sensor Networks (WSNs) are widely used for monitoring oil and gas pipelines due to their flexible deployment and cost-effectiveness. However, since sensor nodes typically rely on limited battery power, extending the network's lifecycle and improving energy utilization efficiency have become focal points of research. Therefore, this paper proposes an energy-saving scheduling algorithm based on transformer networks, aimed at optimizing energy consumption and data transmission efficiency of wireless monitoring sensors in oil and gas pipelines. Firstly, this study designs a deep learning-based Transformer model that learns from historical data on energy consumption patterns and environmental variables to predict the energy and data transmission needs of each sensor node. Secondly, based on the prediction results, this algorithm employs a dynamic scheduling strategy that automatically adjusts the sensor's operational mode and communication frequency according to the node's energy status and task urgency. Additionally, we have validated the effectiveness of the proposed algorithm through field tests and simulation experiments. According to the experimental results, our model has higher efficiency in energy saving. Compared with Convolutional Neural Networks, Recurrent Neural Networks and Graph Neural Networks, the total energy consumption of sensor networks under the model scheduling in this paper was reduced by 6.7%, 33.4% and 26.3%, respectively. Our algorithms improve the energy efficiency and stability of the monitoring system and provide important technical support for future intelligent pipeline monitoring systems. We hope this paper will inspire future scientific research in this field. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

21. Exploring the Capability of Kernel- and Correlation-Based Learning on PCB Component Segmentation.

Author: Al Hasan, Md Mahfuz, Varshney, Nitin, Jessurun, Nathan, Forghani, Reza, and Asadizanjani, Navid
Subjects: *ARTIFICIAL neural networks, *IMAGE segmentation, *TRANSFORMER models, *DEEP learning, *PRINTED circuits, *PRINTED circuit design
Abstract: Due to the continuous increase in the globalized outsourcing of printed circuit board (PCB) fabrication, PCB counterfeits have increased by a significant margin, necessitating rapid and advanced hardware assurance techniques. PCB image segmentation is the primary step in PCB assurance. Over the years, few PCB component segmentation methods have been proposed, and none of those has provided a definite performance benchmark. Besides, those methods have not discussed how the performance is correlated with underlying data or annotation quality. This work presents a PCB image segmentation benchmark. In addition, we explore how annotation quality affects component segmentation and present possible future research directions to work with coarse annotations to alleviate the human effort behind full data annotation tasks. We have analyzed the performance of the preferred deep neural network (DNN) architecture and Transformer architecture with the data annotation quality and presented the direction to leverage the outcome with limited quality annotations. Finally, we present the qualitative as well as the quantitative results to demonstrate the performance of our techniques and provide observations and future research directions on the overall task. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

22. CTRNet: An Automatic Modulation Recognition Based on Transformer-CNN Neural Network.

Author: Zhang, Wenna, Xue, Kailiang, Yao, Aiqin, and Sun, Yunqiang
Subjects: PATTERN recognition systems, RECURRENT neural networks, CONVOLUTIONAL neural networks, DEEP learning, TRANSFORMER models
Abstract: Deep learning (DL) has brought new perspectives and methods to automatic modulation recognition (AMR), enabling AMR systems to operate more efficiently and reliably in modern wireless communication environments through its powerful feature learning and complex pattern recognition capabilities. However, convolutional neural networks (CNNs) and recurrent neural networks (RNNs), which are used for sequence recognition tasks, face two main challenges, respectively: the ineffective utilization of global information and slow processing speeds due to sequential operations. To address these issues, this paper introduces CTRNet, a novel automatic modulation recognition network that combines a CNN with Transformer. This combination leverages Transformer's ability to adequately capture the long-distance dependencies between global sequences and its advantages in sequence modeling, along with the CNN's capability to extract features from local feature regions of signals. During the data preprocessing stage, the original IQ-modulated signals undergo sliding-window processing. By selecting the appropriate window sizes and strides, multiple subsequences are formed, enabling the network to effectively handle complex modulation patterns. In the embedding module, token vectors are designed to integrate information from multiple samples within each window, enhancing the model's understanding and modeling ability of global information. In the feedforward neural network, a more effective Bilinear layer is employed for processing to capture the higher-order relationship between input features, thereby enhancing the ability of the model to capture complex patterns. Experiments conducted on the RML2016.10A public dataset demonstrate that compared with the existing algorithms, the proposed algorithm not only exhibits significant advantages in terms of parameter efficiency but also achieves higher recognition accuracy under various signal-to-noise ratio (SNR) conditions. 
In particular, it performs relatively well in terms of accuracy, precision, recall, and F1-score, with clearer classification of higher-order modulations and notable overall accuracy improvement. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

23. Transformer-based correction scheme for short-term bus load prediction in holidays.

Author: Tang Ningkai, Lu Jixiang, Chen Tianyu, Shu Jiao, Chang Li, and Chen Tao
Abstract: Copyright of Journal of Southeast University (English Edition) is the property of Journal of Southeast University Editorial Office.
Published: 2024
Full Text: View/download PDF

24. Data-Driven AI Model for Turbomachinery Compressor Aerodynamics Enabling Rapid Approximation of 3D Flow Solutions.

Author: Aulich, Marcel, Goinis, Georgios, and Voß, Christian
Subjects: ARTIFICIAL neural networks, ARTIFICIAL intelligence, COMPUTATIONAL fluid dynamics, TRANSFORMER models, CURIOSITY
Abstract: The development of new turbomachinery designs requires numerous time-consuming and computationally intensive computational fluid dynamics (CFD) calculations. However, most of the generated high spatial resolution data remain unused at later development steps. That is also the case with automated optimization processes that use only a few integral values to determine objectives and constraints. To make further use of this vast amount of CFD data a data-driven AI model based on the Transformer architecture is developed and trained using the available CFD data. The presented method subsequently provides a fast approximation of the 3D flow for new designs. In this paper, the structure of the developed AI model is presented and the approximation quality is analyzed using a complex, state-of-the-art compressor test case. It is shown that the AI model can reproduce many characteristics of the 3D flow of new designs, and performance measures such as efficiency can be derived from these flow predictions. In addition, the complex test case revealed that greater design variation reduces the AI approximation quality which can lead to undesirable exploratory behavior in an optimization setup. Overall, the test case has shown promising results and has provided hints for further improvements to the AI model. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

25. Improved Transformer decoding algorithm for dense video captioning.

Author: 杨大伟, 盘晓芳, 毛琳, and 张汝波
Abstract: Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd.
Published: 2024
Full Text: View/download PDF

26. Generating Genre-Based Automatic Feedback on English for Research Publication Purposes.

Author: Link, Stephanie, Redmon, Robert, Shamsi, Yaser, and Hagan, Martin
Subjects: NATURAL language processing, LANGUAGE models, COMPUTER assisted language instruction, ARTIFICIAL intelligence, TRANSFORMER models
Abstract: Artificial intelligence (AI) for supporting second language (L2) writing processes and practices has garnered increasing interest in recent years, establishing AI-mediated L2 writing as a new norm for many multilingual classrooms. As such, the emergence of AI-mediated technologies has challenged L2 writing instructors and their philosophies regarding computer-assisted language learning (CALL) and teaching. Technologies that can combine principled pedagogical practices and the benefits of AI can help to change the landscape of L2 writing instruction while maintaining the integrity of knowledge production that is so important to CALL instructors. To align L2 instructional practices and CALL technologies, we discuss the development of an AI-mediated L2 writing technology that leverages genre-based instruction (GBI) and large language models to provide L2 writers and instructors with tools to enhance English for research publication purposes. Our work reports on the accuracy, precision, and recall of our network classification, which surpass previously reported research in the field of genre-based automated writing evaluation by offering a faster network training approach with higher accuracy of feedback provision and new beginnings for genre-based learning systems. Implications for tool development and GBI are discussed. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

27. Well googled is half done: Multimodal forecasting of new fashion product sales with image‐based google trends.

Author: Skenderi, Geri, Joppi, Christian, Denitto, Matteo, and Cristani, Marco
Subjects: PRODUCT image, NEW product development, TIME series analysis, FORECASTING, METADATA, SALES forecasting
Abstract: New fashion product sales forecasting is a challenging problem that involves many business dynamics and cannot be solved by classical forecasting approaches. In this paper, we investigate the effectiveness of systematically probing exogenous knowledge in the form of Google Trends time series and combining it with multi‐modal information related to a brand‐new fashion item, in order to effectively forecast its sales despite the lack of past data. In particular, we propose a neural network‐based approach, where an encoder learns a representation of the exogenous time series, while the decoder forecasts the sales based on the Google Trends encoding and the available visual and metadata information. Our model works in a non‐autoregressive manner, avoiding the compounding effect of large first‐step errors. As a second contribution, we present VISUELLE, a publicly available dataset for the task of new fashion product sales forecasting, containing multimodal information for 5,577 real, new products sold between 2016 and 2019 from Nunalie, an Italian fast‐fashion company. The dataset is equipped with images of products, metadata, related sales, and associated Google Trends. We use VISUELLE to compare our approach against state‐of‐the‐art alternatives and several baselines, showing that our neural network‐based approach is the most accurate in terms of both percentage and absolute error. It is worth noting that the addition of exogenous knowledge boosts the forecasting accuracy by 1.5% in terms of Weighted Absolute Percentage Error (WAPE), revealing the importance of exploiting informative external information. The code and dataset are both available online (at https://github.com/HumaticsLAB/GTM-Transformer). [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. Transformer-style convolution network for lightweight image super-resolution

Author: Gendy, Garas and Sabor, Nabil
Published: 2025
Full Text: View/download PDF

29. Automatic Lip Reading of Persian Words by a Robotic System Using Deep Learning Algorithms

Author: Gholipour, Amir, Mohammadzade, Hoda, Ghadami, Ali, and Taheri, Alireza
Published: 2024
Full Text: View/download PDF

30. Autism spectrum disorders detection based on multi-task transformer neural network

Author: Le Gao, Zhimin Wang, Yun Long, Xin Zhang, Hexing Su, Yong Yu, and Jin Hong
Subjects: Autism Spectrum Disorders, Artificial intelligence, Biological information, Multi-task learning, Transformer network, Neurosciences. Biological psychiatry. Neuropsychiatry, RC321-571, Neurophysiology and neuropsychology, QP351-495
Abstract: Autism Spectrum Disorders (ASD) are neurodevelopmental disorders that cause people difficulties in social interaction and communication. Identifying ASD patients based on resting-state functional magnetic resonance imaging (rs-fMRI) data is a promising diagnostic tool, but challenging due to the complex and unclear etiology of autism. And it is difficult to effectively identify ASD patients with a single data source (single task). Therefore, to address this challenge, we propose a novel multi-task learning framework for ASD identification based on rs-fMRI data, which can leverage useful information from multiple related tasks to improve the generalization performance of the model. Meanwhile, we adopt an attention mechanism to extract ASD-related features from each rs-fMRI dataset, which can enhance the feature representation and interpretability of the model. The results show that our method outperforms state-of-the-art methods in terms of accuracy, sensitivity and specificity. This work provides a new perspective and solution for ASD identification based on rs-fMRI data using multi-task learning. It also demonstrates the potential and value of machine learning for advancing neuroscience research and clinical practice.
Published: 2024
Full Text: View/download PDF

31. Phishing Webpage Detection via Multi-Modal Integration of HTML DOM Graphs and URL Features Based on Graph Convolutional and Transformer Networks.

Author: Yoon, Jun-Ho, Buu, Seok-Jun, and Kim, Hae-Jung
Subjects: CONVOLUTIONAL neural networks, DEEP learning, TRANSFORMER models, INTERNET safety, PHISHING
Abstract: Detecting phishing webpages is a critical task in the field of cybersecurity, with significant implications for online safety and data protection. Traditional methods have primarily relied on analyzing URL features, which can be limited in capturing the full context of phishing attacks. In this study, we propose an innovative approach that integrates HTML DOM graph modeling with URL feature analysis using advanced deep learning techniques. The proposed method leverages Graph Convolutional Networks (GCNs) to model the structure of HTML DOM graphs, combined with Convolutional Neural Networks (CNNs) and Transformer Networks to capture the character and word sequence features of URLs, respectively. These multi-modal features are then integrated using a Transformer network, which is adept at selectively capturing the interdependencies and complementary relationships between different feature sets. We evaluated our approach on a real-world dataset comprising URL and HTML DOM graph data collected from 2012 to 2024. This dataset includes over 80 million nodes and edges, providing a robust foundation for testing. Our method demonstrated a significant improvement in performance, achieving a 7.03 percentage point increase in classification accuracy compared to state-of-the-art techniques. Additionally, we conducted ablation tests to further validate the effectiveness of individual features in our model. The results validate the efficacy of integrating HTML DOM structure and URL features using deep learning. Our framework significantly enhances phishing detection capabilities, providing a more accurate and comprehensive solution to identifying malicious webpages. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

32. Image retrieval of colored spun fabric based on Transformer network feature fusion.

Author: 沈佳忱, 袁理, 廖海斌, 王闵, and 郭旻
Subjects: CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, TRANSFORMER models, IMAGE retrieval, IMAGE fusion
Abstract: Copyright of Wool Textile Journal is the property of National Wool Textile Science & Technology Information Center and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Published: 2024
Full Text: View/download PDF

33. DATFNets-dynamic adaptive assigned transformer network for fire detection.

Author: Wang, Zuoxin, Zhao, Xiaohu, and Li, Dunqing
Subjects: CONVOLUTIONAL neural networks, FIRE management, FIRE prevention, JOB classification
Abstract: Fires cause severe damage to the ecological environment and threaten human life and property. Although the traditional convolutional neural network method effectively detects large-area fires, it cannot capture small fires in complex areas through a limited receptive field. At the same time, fires can change at any time due to the influence of wind direction, which challenges fire prevention and control personnel. To solve these problems, a novel dynamic adaptive distribution transformer detection framework is proposed to help firefighters and researchers develop optimal fire management strategies. On the one hand, this framework embeds a context aggregation layer with a masking strategy in the feature extractor to improve the representation of low-level and salient features. The masking strategy can reduce irrelevant information and improve network generalization. On the other hand, a dynamic adaptive direction conversion function and a sample allocation strategy are designed to fully use adaptive point representation while achieving accurate positioning and classification of fires and screening out representative fire samples in complex backgrounds. In addition, to prevent the network from being limited to the local optimum and discrete points in the sample from causing severe interference to the overall performance, a weighted loss function with spatial constraints is designed to optimize the network and penalize the discrete points in the sample. The mAP values on the three baseline data sets of FireDets, WildFurgFires, and FireAndSmokes are 0.871, 0.909, and 0.955, respectively. The experimental results are significantly better than other detection methods, which proves that the proposed method has good robustness and detection performance. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

34. Power Quality Transient Disturbance Diagnosis Based on Dynamic Large Convolution Kernel and Multi-Level Feature Fusion Network.

Author: Zheng, Chen, Li, Qionglin, Liu, Shuming, Dai, Shuangyin, Zhang, Bo, and Liu, Yajuan
Subjects: *POWER quality disturbances, *ELECTRIC transients, *CONVOLUTIONAL neural networks, *TRANSFORMER models, *FEATURE extraction, *KERNEL (Mathematics)
Abstract: Power quality is an important metric for the normal operation of a power system, and the accurate identification of transient signals is of great significance for the improvement of power quality. The diverse types of power system transient signals and their strong characteristic coupling bring new challenges to the analysis and identification of power system transient signals. In order to enhance the identification accuracy of transient signals, a method of power system transient signal identification is proposed based on a dynamic large convolution kernel and multilevel feature fusion network. First, the more fine-grained and more informative features of the transient signals are extracted by the dynamic large convolution kernel feature extraction module. Then, the multi-scale local features are adaptively fused by the multilevel feature fusion module. Finally, the fused features are reduced in dimension by the fully connected layer in the classification module and fed into the SoftMax layer for transient signal type detection. The proposed method can effectively alleviate the small receptive field problem of convolutional neural networks and the limited ability of the Transformer network to extract local context information. Compared with five other power quality transient disturbance identification models, the experimental results show that the proposed method has better diagnostic accuracy and anti-noise capability. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. Color characterization model of colored fabric based on Transformer network feature fusion.

Author: WU Xinru, YUAN Li, WANG Win, GUO Min, ZHU Lanyan, and WANG Jing
Subjects: FEATURE extraction, MULTISENSOR data fusion, STATISTICAL correlation
Abstract: To address the unique color structure of colored fabric and the limitations of commonly used single color measurement tools, a multi-source heterogeneous data fusion color representation model was established based on a Transformer network. Spectral and texture features were extracted from spectrometer data and image data, respectively; the Transformer network was used to fuse the multi-source heterogeneous data features so that their complementary characteristics were fully utilized and the color information of colored fabric was effectively and comprehensively characterized. The results indicate that the color characterization model constructed in this article can effectively characterize the differences in the quality ratio of dyed fibers within a large range of changes and the color changes due to uneven distribution on the surface of dyed fibers. At the same time, under the measurement apertures of 6 mm, 10 mm, and 25 mm, the correlation coefficients between the fusion feature difference and the fiber ratio difference are all higher than 85%. Compared with a single spectral feature and a single image feature, the correlation coefficient of this method is improved by more than 10%, indicating ideal robustness. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

36. Autism spectrum disorders detection based on multi-task transformer neural network.

Author: Gao, Le, Wang, Zhimin, Long, Yun, Zhang, Xin, Su, Hexing, Yu, Yong, and Hong, Jin
Subjects: AUTISM spectrum disorders, TRANSFORMER models, FUNCTIONAL magnetic resonance imaging, MACHINE learning, CLINICAL neurosciences
Abstract: Autism Spectrum Disorders (ASD) are neurodevelopmental disorders that cause people difficulties in social interaction and communication. Identifying ASD patients based on resting-state functional magnetic resonance imaging (rs-fMRI) data is a promising diagnostic tool, but challenging due to the complex and unclear etiology of autism. And it is difficult to effectively identify ASD patients with a single data source (single task). Therefore, to address this challenge, we propose a novel multi-task learning framework for ASD identification based on rs-fMRI data, which can leverage useful information from multiple related tasks to improve the generalization performance of the model. Meanwhile, we adopt an attention mechanism to extract ASD-related features from each rs-fMRI dataset, which can enhance the feature representation and interpretability of the model. The results show that our method outperforms state-of-the-art methods in terms of accuracy, sensitivity and specificity. This work provides a new perspective and solution for ASD identification based on rs-fMRI data using multi-task learning. It also demonstrates the potential and value of machine learning for advancing neuroscience research and clinical practice. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

37. Activated Sparsely Sub-Pixel Transformer for Remote Sensing Image Super-Resolution.

Author: Guo, Yongde, Gong, Chengying, and Yan, Jun
Subjects: *TRANSFORMER models, *HIGH resolution imaging, *FEATURE selection, *CODING theory
Abstract: Transformers have recently achieved significant breakthroughs in various visual tasks. However, these methods often overlook the optimization of interactions between convolution and transformer blocks. Although the basic attention module strengthens the feature selection ability, it is still weak in generating superior quality output. In order to address this challenge, we propose the integration of sub-pixel space and the application of sparse coding theory in the calculation of self-attention. This approach aims to enhance the network's generation capability, leading to the development of a sparse-activated sub-pixel transformer network (SSTNet). The experimental results show that compared with several state-of-the-art methods, our proposed network can obtain better generation results, improving the sharpness of object edges and the richness of detail texture information in super-resolution generated images. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

38. Enhancing Automatic Modulation Recognition for IoT Applications Using Transformers.

Author: Rashvand, Narges, Witham, Kenneth, Maldonado, Gabriel, Katariya, Vinit, Marer Prabhu, Nishanth, Schirner, Gunar, and Tabkhi, Hamed
Subjects: INTERNET of things, TRANSFORMER models, NATURAL language processing, EDGE computing, DEEP learning
Abstract: Automatic modulation recognition (AMR) is vital for accurately identifying modulation types within incoming signals, a critical task for optimizing operations within edge devices in IoT ecosystems. This paper presents an innovative approach that leverages Transformer networks, initially designed for natural language processing, to address the challenges of efficient AMR. Our Transformer network architecture is designed with the mindset of real-time edge computing on IoT devices. Four tokenization techniques are proposed and explored for creating proper embeddings of RF signals, specifically focusing on overcoming the limitations related to the model size often encountered in IoT scenarios. Extensive experiments reveal that our proposed method outperformed advanced deep learning techniques, achieving the highest recognition accuracy. Notably, our model achieved an accuracy of 65.75 on the RML2016 and 65.80 on the CSPB.ML.2018+ dataset. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

39. Energy Consumption Prediction Method for Refrigeration Systems Based on Adversarial Networks and Transformer Networks

Author: Zhang, Hu, Liu, Huifeng, Zhang, Youli, Guo, Ying, Dai, Hongjun, Shao, Minghao, Xu, Hongyu, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Cao, Cungeng, editor, Chen, Huajun, editor, Zhao, Liang, editor, Arshad, Junaid, editor, Asyhari, Taufiq, editor, and Wang, Yonghao, editor
Published: 2024
Full Text: View/download PDF

40. Transformation Network Model for Ear Recognition

Author: Booysens, Aimee, Viriri, Serestina, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, van Leeuwen, Jan, Series Editor, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Kobsa, Alfred, Series Editor, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Nierstrasz, Oscar, Series Editor, Pandu Rangan, C., Editorial Board Member, Sudan, Madhu, Series Editor, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Weikum, Gerhard, Series Editor, Vardi, Moshe Y, Series Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Renault, Éric, editor, Boumerdassi, Selma, editor, and Mühlethaler, Paul, editor
Published: 2024
Full Text: View/download PDF

41. Compact Convolutional Transformer for Bearing Remaining Useful Life Prediction

Author: Jin, Zhongtian, Chen, Chong, Liu, Qingtao, Syntetos, Aris, Liu, Ying, Chaari, Fakher, Series Editor, Gherardini, Francesco, Series Editor, Ivanov, Vitalii, Series Editor, Haddar, Mohamed, Series Editor, Cavas-Martínez, Francisco, Editorial Board Member, di Mare, Francesca, Editorial Board Member, Kwon, Young W., Editorial Board Member, Tolio, Tullio A. M., Editorial Board Member, Trojanowska, Justyna, Editorial Board Member, Schmitt, Robert, Editorial Board Member, Xu, Jinyang, Editorial Board Member, Fera, Marcello, editor, Caterino, Mario, editor, Macchiaroli, Roberto, editor, and Pham, Duc Truong, editor
Published: 2024
Full Text: View/download PDF

42. Learning Bottleneck Transformer for Event Image-Voxel Feature Fusion Based Classification

Author: Yuan, Chengguo, Jin, Yu, Wu, Zongzhen, Wei, Fanting, Wang, Yangzirui, Chen, Lan, Wang, Xiao, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Liu, Qingshan, editor, Wang, Hanzi, editor, Ma, Zhanyu, editor, Zheng, Weishi, editor, Zha, Hongbin, editor, Chen, Xilin, editor, Wang, Liang, editor, and Ji, Rongrong, editor
Published: 2024
Full Text: View/download PDF

43. Multi-step prediction of offshore wind power based on Transformer network and Huber loss

Author: Haoyi Xiao, Xiaoxia He, and Chunli Li
Subjects: Offshore wind power prediction, Transformer network, Huber loss function, Autoencoder, Slime mould optimization algorithm, Multi-step prediction, Production of electric energy or power. Powerplants. Central stations, TK1001-1841
Abstract: In the context of the burgeoning expansion of renewable energy sources, the precise prediction of offshore wind power assumes a pivotal role in safeguarding the reliability, economic viability, and sustainable progression of offshore wind farms. The present study introduces a novel methodology for offshore wind power prediction, predicated upon the synergy of the Transformer network and Huber loss function. Empirical validation is conducted utilizing authentic data from a European offshore wind farm. The resulting analyses delineate a discernible superiority of the Transformer network over classical LSTM and GRU models in capturing the intricate long-term dependencies intrinsic to the time series. Furthermore, the inclusion of the Huber loss function effectively mitigates the challenges posed by the high volatility often characteristic of offshore wind power data. The study also demonstrates the beneficial integration of autoencoder reconstruction for denoising and slime mould optimization algorithm to augment prediction performance. Distinctively diverging from traditional single-step prediction paradigms, the multi-step prediction model constructed within this research offers a more comprehensive and precise prediction of wind power. Such an innovative approach represents a valuable contribution to the field, with tangible implications for the dependable operation and future advancement of wind power.
Published: 2024
Full Text: View/download PDF

44. CAT-Net: A Cross-Slice Attention Transformer Model for Prostate Zonal Segmentation in MRI

Author: Hung, Alex Ling Yu, Zheng, Haoxin, Miao, Qi, Raman, Steven S, Terzopoulos, Demetri, and Sung, Kyunghyun
Subjects: Information and Computing Sciences, Biomedical Imaging, Prostate Cancer, Cancer, Urologic Diseases, Aging, Humans, Male, Prostate, Image Processing, Computer-Assisted, Magnetic Resonance Imaging, Prostatic Neoplasms, Pelvis, Image segmentation, Transformers, Three-dimensional displays, Magnetic resonance imaging, Standards, Image resolution, Decoding, Attention mechanism, deep learning, magnetic resonance imaging, prostate zonal segmentation, transformer network, Engineering, Nuclear Medicine & Medical Imaging, Information and computing sciences
Abstract: Prostate cancer is the second leading cause of cancer death among men in the United States. The diagnosis of prostate MRI often relies on accurate prostate zonal segmentation. However, state-of-the-art automatic segmentation methods often fail to produce well-contained volumetric segmentation of the prostate zones since certain slices of prostate MRI, such as base and apex slices, are harder to segment than other slices. This difficulty can be overcome by leveraging important multi-scale image-based information from adjacent slices, but current methods do not fully learn and exploit such cross-slice information. In this paper, we propose a novel cross-slice attention mechanism, which we use in a Transformer module to systematically learn cross-slice information at multiple scales. The module can be utilized in any existing deep-learning-based segmentation framework with skip connections. Experiments show that our cross-slice attention is able to capture cross-slice information significant for prostate zonal segmentation in order to improve the performance of current state-of-the-art methods. Cross-slice attention improves segmentation accuracy in the peripheral zones, such that segmentation results are consistent across all the prostate slices (apex, mid-gland, and base). The code for the proposed model is available at https://bit.ly/CAT-Net.
Published: 2023

45. DATFNets-dynamic adaptive assigned transformer network for fire detection

Author: Zuoxin Wang, Xiaohu Zhao, and Dunqing Li
Subjects: Fire detection, Fire management, Transformer network, Global contextual semantics, Feature extractor, Spatial constraints, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Fires cause severe damage to the ecological environment and threaten human life and property. Although the traditional convolutional neural network method effectively detects large-area fires, it cannot capture small fires in complex areas through a limited receptive field. At the same time, fires can change at any time due to the influence of wind direction, which challenges fire prevention and control personnel. To solve these problems, a novel dynamic adaptive distribution transformer detection framework is proposed to help firefighters and researchers develop optimal fire management strategies. On the one hand, this framework embeds a context aggregation layer with a masking strategy in the feature extractor to improve the representation of low-level and salient features. The masking strategy can reduce irrelevant information and improve network generalization. On the other hand, a dynamic adaptive direction conversion function and a sample allocation strategy are designed to fully use adaptive point representation while achieving accurate positioning and classification of fires and screening out representative fire samples in complex backgrounds. In addition, to prevent the network from being limited to the local optimum and discrete points in the sample from causing severe interference to the overall performance, a weighted loss function with spatial constraints is designed to optimize the network and penalize the discrete points in the sample. The mAP values on the three baseline data sets of FireDets, WildFurgFires, and FireAndSmokes are 0.871, 0.909, and 0.955, respectively. The experimental results are significantly better than other detection methods, which proves that the proposed method has good robustness and detection performance.
Published: 2024
Full Text: View/download PDF

46. Enhancing Automatic Modulation Recognition for IoT Applications Using Transformers

Author: Narges Rashvand, Kenneth Witham, Gabriel Maldonado, Vinit Katariya, Nishanth Marer Prabhu, Gunar Schirner, and Hamed Tabkhi
Subjects: automatic modulation recognition, deep learning, attention mechanism, Transformer network, IoT, Computer software, QA76.75-76.765, Technology, Cybernetics, Q300-390
Abstract: Automatic modulation recognition (AMR) is vital for accurately identifying modulation types within incoming signals, a critical task for optimizing operations within edge devices in IoT ecosystems. This paper presents an innovative approach that leverages Transformer networks, initially designed for natural language processing, to address the challenges of efficient AMR. Our Transformer network architecture is designed with the mindset of real-time edge computing on IoT devices. Four tokenization techniques are proposed and explored for creating proper embeddings of RF signals, specifically focusing on overcoming the limitations related to the model size often encountered in IoT scenarios. Extensive experiments reveal that our proposed method outperformed advanced deep learning techniques, achieving the highest recognition accuracy. Notably, our model achieved an accuracy of 65.75 on the RML2016 and 65.80 on the CSPB.ML.2018+ dataset.
Published: 2024
Full Text: View/download PDF

47. Shots segmentation-based optimized dual-stream framework for robust human activity recognition in surveillance video

Author: Altaf Hussain, Samee Ullah Khan, Noman Khan, Waseem Ullah, Ahmed Alkhayyat, Meshal Alharbi, and Sung Wook Baik
Subjects: Activity Recognition, Video Classification, Surveillance System, Lowlight Image Enhancement, Dual Stream Network, Transformer Network, Engineering (General). Civil engineering (General), TA1-2040
Abstract: Nowadays, for controlling crime, surveillance cameras are typically installed in all public places to ensure urban safety and security. However, automating Human Activity Recognition (HAR) using computer vision techniques faces several challenges such as lowlighting, complex spatiotemporal features, clutter backgrounds, and inefficient utilization of surveillance system resources. Existing attempts in HAR designed straightforward networks by analyzing either spatial or motion patterns resulting in limited performance while the dual streams methods are entirely based on Convolutional Neural Networks (CNN) that are inadequate to learning the long-range temporal information for HAR. To overcome the above-mentioned challenges, this paper proposes an optimized dual stream framework for HAR which mainly consists of three steps. First, a shots segmentation module is introduced in the proposed framework to efficiently utilize the surveillance system resources by enhancing the lowlight video stream and then it detects salient video frames that consist of human. This module is trained on our own challenging Lowlight Human Surveillance Dataset (LHSD) which consists of both normal and different levels of lowlighting data to recognize humans in complex uncertain environments. Next, to learn HAR from both contextual and motion information, a dual stream approach is used in the feature extraction. In the first stream, it freezes the learned weights of the backbone Vision Transformer (ViT) B-16 model to select the discriminative contextual information. In the second stream, ViT features are then fused with the intermediate encoder layers of FlowNet2 model for optical flow to extract a robust motion feature vector. Finally, a two stream Parallel Bidirectional Long Short-Term Memory (PBiLSTM) is proposed for sequence learning to capture the global semantics of activities, followed by Dual Stream Multi-Head Attention (DSMHA) with a late fusion strategy to optimize the huge features vector for accurate HAR. To assess the strength of the proposed framework, extensive empirical results are conducted on real-world surveillance scenarios and various benchmark HAR datasets that achieve 78.6285%, 96.0151%, and 98.875% accuracies on HMDB51, UCF101, and YouTube Action, respectively. Our results show that the proposed strategy outperforms State-of-the-Art (SOTA) methods. The proposed framework gives superior performance in HAR, providing accurate and reliable recognition of human activities in surveillance systems.
Published: 2024
Full Text: View/download PDF

48. Spatio-Temporal Feature Aware Vision Transformers for Real-Time Unmanned Aerial Vehicle Tracking

Author: Hao Zhang, Hengzhou Ye, Xiaoyu Guo, Xu Zhang, Yao Rong, and Shuiwang Li
Subjects: UAV tracking, temporal relationships, spatial neighborhood feature extraction, transformer network, real-time tracking, Motor vehicles. Aeronautics. Astronautics, TL1-4050
Abstract: Driven by the rapid advancement of Unmanned Aerial Vehicle (UAV) technology, the field of UAV object tracking has witnessed significant progress. This study introduces an innovative single-stream UAV tracking architecture, dubbed NT-Track, which is dedicated to enhancing the efficiency and accuracy of real-time tracking tasks. Addressing the shortcomings of existing tracking systems in capturing temporal relationships between consecutive frames, NT-Track meticulously analyzes the positional changes in targets across frames and leverages the similarity of the surrounding areas to extract feature information. Furthermore, our method integrates spatial and temporal information seamlessly into a unified framework through the introduction of a temporal feature fusion technique, thereby bolstering the overall performance of the model. NT-Track also incorporates a spatial neighborhood feature extraction module, which focuses on identifying and extracting features within the neighborhood of the target in each frame, ensuring continuous focus on the target during inter-frame processing. By employing an improved Transformer backbone network, our approach effectively integrates spatio-temporal information, enhancing the accuracy and robustness of tracking. Our experimental results on several challenging benchmark datasets demonstrate that NT-Track surpasses existing lightweight and deep learning trackers in terms of precision and success rate. It is noteworthy that, on the VisDrone2018 benchmark, NT-Track achieved a precision rate of 90% for the first time, an accomplishment that not only showcases its exceptional performance in complex environments, but also confirms its potential and effectiveness in practical applications.
Published: 2025
Full Text: View/download PDF

49. TiDEFormer—a heterogenous stacking ensemble approach for time series forecasting of COVID-19 prevalence

Author: Prakash, Satya, Jalal, Anand Singh, and Pathak, Pooja
Published: 2024
Full Text: View/download PDF

50. Development of optimized cascaded LSTM with Seq2seqNet and transformer net for aspect-based sentiment analysis framework

Author: Ramasamy, Mekala and Elangovan, Mohanraj
Abstract: The recent development of communication technologies made it possible for people to share opinions on various social media platforms. The opinion of the people is converted into small-sized textual data. Aspect Based Sentiment Analysis (ABSA) is a process used by businesses and other organizations to assess these textual data in order to comprehend people’s opinions about the services or products offered by them. The majority of earlier Sentiment Analysis (SA) research uses lexicons, word frequencies, or black box techniques to obtain the sentiment in the text. It should be highlighted that these methods disregard the relationships and interdependence between words in terms of semantics. Hence, an efficient ABSA framework to determine the sentiment from the textual reviews of the customers is developed in this work. Initially, the raw text review data is collected from the standard benchmark datasets. The gathered text reviews undergo text pre-processing to neglect the unwanted words and characters from the input text document. The pre-processed data is directly provided to the feature extraction phase in which the seq2seq network and transformer network are employed. Further, the optimal features from the two resultant features are chosen by utilizing the proposed Modified Bird Swarm-Ladybug Beetle Optimization (MBS-LBO). After obtaining optimal features, these features are fused together and given to the final detection model. Consequently, the Optimized Cascaded Long Short Term Memory (OCas-LSTM) is proposed for predicting the sentiments from the given review by the users. Here, the parameters are tuned optimally by the MBS-LBO algorithm, and also it is utilized for enhancing the performance rate. The experimental evaluation is made to reveal the excellent performance of the developed SA model by contrasting it with conventional models. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
