3,854 results on '"spatial attention"'
Search Results
2. Noise-aware progressive multi-scale deepfake detection.
- Author
-
Ding, Xinmiao, Pang, Shuai, and Guo, Wen
- Abstract
The proliferation of fake images generated by deepfake techniques has significantly threatened the trustworthiness of digital information, leading to a pressing need for face forgery detection. However, due to the similarity between human face images and the subtlety of artefact information, most deep face forgery detection methods face certain challenges, such as incomplete extraction of artefact information, limited performance in detecting low-quality forgeries, and insufficient generalization across different datasets. To address these issues, this paper proposes a novel noise-aware multi-scale deepfake detection model. Firstly, a progressive spatial attention module is introduced, which learns two types of spatial feature weights: boosting weight and suppression weight. The boosting weight highlights salient regions, while the suppression weight enables the model to capture more subtle artifact information. Through multiple boosting-suppression stages, the proposed model progressively focuses on different facial regions and extracts multi-scale RGB features. Additionally, a noise-aware two-stream network is introduced, which leverages frequency-domain features and fuses image noise with multi-scale RGB features. This integration enhances the model's ability to handle image post-processing. Furthermore, the model learns global features from multi-modal features through multiple convolutional layers, which are combined with local similarity features for deepfake detection, thereby improving the model's robustness. Experimental results on several benchmark databases demonstrate the superiority of our proposed method over state-of-the-art techniques. Our contributions lie in the progressive spatial attention module, which effectively addresses overfitting in CNNs, and the integration of noise-aware features and multi-scale RGB features. These innovations lead to enhanced accuracy and generalization performance in face forgery detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
3. The Power of Trial History: How Previous Trial Shapes Audiovisual Integration.
- Author
-
Tang, Xiaoyu, Liu, Wanlong, Wu, Yingnan, Ren, Rongxia, Sun, Jiaying, Yang, Jiajia, Wang, Aijun, and Zhang, Ming
- Subjects
-
*SELECTIVITY (Psychology), *TRIALS (Law)
- Abstract
Combining information from visual and auditory modalities to form a unified and coherent perception is known as audiovisual integration. Audiovisual integration is affected by many factors. However, it remains unclear whether the trial history can influence audiovisual integration. We used a target–target paradigm to investigate how the target modality and spatial location of the previous trial affect audiovisual integration under conditions of divided-modalities attention (Experiment 1) and modality-specific selective attention (Experiment 2). In Experiment 1, we found that audiovisual integration was enhanced in the repeat locations compared with switch locations. Audiovisual integration was the largest following the auditory targets compared to following the visual and audiovisual targets. In Experiment 2, where participants were asked to focus only on visual, we found that the audiovisual integration effect was larger in the repeat location trials than switch location trials only when the audiovisual target was presented in the previous trial. The present results provide the first evidence that trial history can have an effect on audiovisual integration. The mechanisms of trial history modulating audiovisual integration are discussed. Future examining of audiovisual integration should carefully manipulate experimental conditions based on the effects of trial history. [ABSTRACT FROM AUTHOR]
- Published
- 2024
4. MCBAN: A Small Object Detection Multi-Convolutional Block Attention Network.
- Author
-
Bhanbhro, Hina, Hooi, Yew Kwang, Zakaria, Mohammad Nordin Bin, Kusakunniran, Worapan, and Amur, Zaira Hassan
- Abstract
Object detection has made a significant leap forward in recent years. However, detecting small objects remains difficult for several reasons: the objects are very small, and they are susceptible to missed detection due to background noise. Additionally, small-object information is degraded by downsampling operations. Deep learning-based detection methods have been used to address the challenge posed by small objects. In this work, we propose a novel method, the Multi-Convolutional Block Attention Network (MCBAN), to increase the detection accuracy of minute objects by overcoming the information loss that occurs during downsampling. The multi-convolutional attention block (MCAB), which is made up of a channel attention module and a spatial attention module (SAM), has been crafted to detect small objects with higher precision. We carried out experiments on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) and Pattern Analysis, Statistical Modelling and Computational Learning (PASCAL) Visual Object Classes (VOC) datasets and followed a step-wise process to analyze the results. The experimental results demonstrate significant gains in performance, reaching 97.75% on KITTI and 88.97% on PASCAL VOC. These findings indicate that MCBAN is more efficient at small object detection than existing approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2024
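Note on record 4: the abstract names a spatial attention module but does not describe its internals. The following is a minimal sketch of one common formulation (pool across channels at each pixel, squash to a weight in (0, 1), rescale the features), and it is not the MCBAN authors' implementation. The function name `spatial_attention` and the parameters `w_avg`, `w_max`, and `bias` are hypothetical; in particular, the scalar mixing stands in for the learned convolution that such modules typically use.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def spatial_attention(features, w_avg=1.0, w_max=1.0, bias=0.0):
    """Apply a simplified spatial attention map to a C x H x W feature map.

    `features` is a list of C channels, each an H x W list of floats.
    Returns (attention_map, reweighted_features).
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])

    attention = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Pool across the channel axis at this pixel.
            column = [features[c][i][j] for c in range(channels)]
            avg_pool = sum(column) / channels
            max_pool = max(column)
            # Assumption: scalar mixing replaces a learned k x k convolution.
            score = w_avg * avg_pool + w_max * max_pool + bias
            attention[i][j] = _sigmoid(score)

    # Every channel is scaled by the same per-pixel weight.
    reweighted = [
        [[features[c][i][j] * attention[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(channels)
    ]
    return attention, reweighted


if __name__ == "__main__":
    fmap = [[[0.0, 2.0]], [[0.0, 4.0]]]  # 2 channels, 1 x 2 pixels
    attn, out = spatial_attention(fmap)
    print(attn)
    print(out)
```

Because the same weight multiplies every channel at a given pixel, the map changes where the network looks rather than which feature channels it favours; channel attention handles the latter.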
5. TSRDet: A Table Structure Recognition Method Based on Row-Column Detection.
- Author
-
Zhu, Zixuan, Li, Weibin, Yu, Chenglong, Li, Wei, and Jiao, Licheng
- Subjects
-
TRANSFORMER models
- Abstract
As one of the most commonly used and important data carriers, tables have the advantages of high structuring, strong readability and strong flexibility. However, in reality, tables usually present various forms, such as Excel, images, etc. Among them, the information in the table image cannot be read directly, let alone further applied. Therefore, the research related to image-based table recognition is crucial. It contains the table structure recognition and the table content recognition. Among them, table structure recognition is the most important and difficult task because the table structure is abstract and changeable. In order to address this problem, we propose an innovative table structure recognition method, named TSRDet (Table Structure Recognition based on object Detection). It includes a row-column detection method, named SACNet (StripAttention-CenterNet) and the corresponding post-processing. SACNet is an improved version of the original CenterNet. The specific improvements include the following: firstly, we introduce the Swin Transformer as the encoder to obtain the global feature map of the image. Then, we propose a plug-and-play row-column attention module, including a channel attention module and a row-column spatial attention module. It improves the detection accuracy of rows and columns by capturing long-range row-column feature maps in the image. After completing the row-column detection, this paper also designs a simple and fast post-processing to generate the table structure based on the row-column detection results. Experimental results show that for row-column detection, SACNet has high detection accuracy, even at a high IoU threshold. Specifically, when the threshold is 0.75, its mAP of row detection and column detection still exceeds 90%, which is 91.40% and 92.73% respectively. In addition, in the comparative experiment with the existing object detection methods, SACNet's performance was significantly better than that of all others. 
For table structure recognition, the TEDS-Struct score of TSRDet is 95.7%, which shows competitive performance in table structure recognition, and verifies the rationality and superiority of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
6. Trademark Detection and Classification Based on YOLO-FGE.
- Author
-
MIAO Chunyuan and WANG Xiuhui
- Subjects
-
TRADEMARKS, CLASSIFICATION, SPINE, NECK
- Abstract
To address the numerous styles, complex backgrounds, and large variations in scale of trademarks, a YOLO-FGE network model based on the YOLOv5 framework is proposed to distinguish trademark category information more accurately. Firstly, a feature enhancement module is put forward to enhance the adaptability of the feature layer to different kinds of trademarks, making the network pay more attention to the useful information of the trademarks to be detected. Secondly, a global information attention module is embedded in the C3 module of YOLOv5 to optimize the backbone and neck networks. Finally, an enhanced spatial attention module is proposed, which uses dilated convolution to expand the receptive field and combines channel attention with a Transformer module to improve detection accuracy. The experimental results on the graphic trademark dataset show that the model improves mAP to 92.3%, a higher detection accuracy than most existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Spatial attention in mental arithmetic: A literature review and meta-analysis.
- Author
-
Prado, Jérôme and Knops, André
- Subjects
-
*LITERATURE reviews, *JUDGMENT (Psychology), *EYE movements, *ARITHMETIC, *TRANSCODING, *MENTAL arithmetic
- Abstract
We review the evidence for the conceptual association between arithmetic and space and quantify the effect size in meta-analyses. We focus on three effects: (a) the operational momentum effect (OME), which has been defined as participants' tendency to overestimate results of addition problems and underestimate results of subtraction problems; (b) the arithmetic cueing effect, in which arithmetic problems serve as spatial cues in target detection or temporal order judgment tasks; and (c) the associations between arithmetic and space observed with eye- and hand-tracking studies. The OME was consistently found in paradigms that provided the participants with numerical response alternatives. The OME shows a large effect size, driven by an underestimation during subtraction while addition was unbiased. In contrast, paradigms in which participants indicated their estimate by transcoding their final estimate to a spatial reference frame revealed no consistent OME. Arithmetic cueing studies show a reliable small to medium effect size, driven by a rightward bias for addition. Finally, eye- and hand-tracking studies point to replicable associations between arithmetic and eye or hand movements. To account for the complexity of the observed pattern, we introduce the Adaptive Pathways in Mental Arithmetic (APiMA) framework. The model accommodates central notions of numerical and arithmetic processing and helps identifying which pathway a given paradigm operates on. It proposes that the divergence between OME and arithmetic cueing studies comes from the predominant use of non-symbolic versus symbolic stimuli, respectively. Overall, our review and findings clearly support an association between arithmetic and spatial processing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
8. Electroencephalogram Decoding Reveals Distinct Processes for Directing Spatial Attention and Encoding Into Working Memory.
- Author
-
Jones, Henry M., Diaz, Gisella K., Ngiam, William X. Q., and Awh, Edward
- Subjects
-
*SHORT-term memory, *SPATIAL memory, *VISUAL memory, *ELECTROENCEPHALOGRAPHY, *ATTENTION, *ADULTS
- Abstract
Past work reveals a tight relationship between spatial attention and storage in visual working memory. But is spatially attending an item tantamount to working memory encoding? Here, we tracked electroencephalography (EEG) signatures of spatial attention and working memory encoding while independently manipulating the number of memory items and the spatial extent of attention in two studies of adults (N = 39; N = 33). Neural measures of spatial attention tracked the position and size of the attended area independent of the number of individuated items encoded into working memory. At the same time, multivariate decoding of the number of items stored in working memory was insensitive to variations in the breadth and position of spatial attention. Finally, representational similarity analyses provided converging evidence for a pure load signal that is insensitive to the spatial extent of the stored items. Thus, although spatial attention is a persistent partner of visual working memory, it is functionally dissociable from the selection and maintenance of individuated representations in working memory. [ABSTRACT FROM AUTHOR]
- Published
- 2024
9. Attentional templates for target features versus locations.
- Author
-
Jimenez, Mikel, Wang, Ziyi, and Grubert, Anna
- Subjects
-
*VISUAL memory, *VISUAL perception, *NEURAL circuitry, *ELECTROPHYSIOLOGY
- Abstract
Visual search is guided by visual working memory representations (i.e., attentional templates) that are activated prior to search and contain target-defining features (e.g., color). In the present study, we tested whether attentional templates can also contain spatial target properties (knowing where to look) and whether attentional selection guided by such location-specific templates is as efficient as selection that is based on feature-specific templates (knowing what to look for). In every trial, search displays were preceded by either semantic color or location cues, indicating the upcoming target color or location, respectively. Qualitative differences between feature- and location-based template guidance were substantiated in terms of selection efficiency in low-load (one target color/location) versus high-load trials (two target colors/locations). Behavioral and electrophysiological (N2pc) measures of target selection speed and accuracy were combined for converging evidence. In line with previous studies, we found that color search was highly efficient, even under high-load conditions, when multiple attentional templates were activated to guide attentional selection in a spatially global fashion. Importantly, results in the location task almost perfectly mirrored the findings of the color task, suggesting that multiple templates for different target locations were activated concurrently when two possible target locations were task relevant. Our findings align with accounts that assume a common neuronal network during preparation for location and color search, but regard spatial and feature-based selection mechanisms as independent. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Reading direction interacts with spatial processes of temporal order verbal working memory: evidence from Iranian right-to-left readers.
- Author
-
Rasoulzadeh, Vesal, van Dijck, Jean-Phillipe, Khosrowabadi, Reza, and Fias, Wim
- Subjects
-
*READING, *RESEARCH funding, *T-test (Statistics), *ELECTROENCEPHALOGRAPHY, *DESCRIPTIVE statistics, *ATTENTION, *MEMORY, *ANALYSIS of variance, *SPACE perception, *DATA analysis software, *LEARNING strategies, *REGRESSION analysis
- Abstract
Reading direction affects cognitive processes such as spatial processing and attention. Testing Farsi right-to-left readers, we investigated the effect of reading direction on the involved spatial and attentional processes in a verbal working memory task. Previous research in left-to-right readers has shown that serial order in verbal WM is encoded spatially from left to right and that mechanisms of spatial attention operate on these mental representations. First, we confirmed that the spatial representation of serial information in WM follows the right-to-left reading direction of Farsi. Second, we demonstrated the influence of reading direction on the distribution of spatial attention over the reading-direction-contingent mental representation of serial information in WM. Behavioural RTs and neural markers of spatial attention shifts (EDAN and ADAN ERPs and lateralised posterior alpha activity) confirm the right-to-left representation of serial order in WM and the involvement of spatial attention in retrieving items from this mnemonic space. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Predictability modulates the early neural coding of spatially unattended fearful faces.
- Author
-
Chalk, Philip T. and Pegna, Alan J.
- Subjects
-
FACIAL expression, BIOLOGICAL neural networks, EVOKED potentials (Electrophysiology), HUMAN behavior, NEURAL codes
- Published
- 2024
12. Interaction of spatial attention and the associated reward value of audiovisual objects.
- Author
-
Vakhrushev, Roman and Pooresmaeili, Arezoo
- Subjects
-
AUDITORY perception, EVOKED potentials (Electrophysiology), AUDIOVISUAL materials, HUMAN behavior, NEUROLOGIC examination
- Published
- 2024
13. Salient object detection of strip steel surface defects based on channel and spatial attention.
- Author
-
郭华平, 李锡瑞, 张莉, 孙艳歌, and 付志鹏
- Abstract
Copyright of Journal of Xinyang Normal University Natural Science Edition is the property of Journal of Xinyang Normal University Editorial Office and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
14. Visual spatial attention to sexual stimuli.
- Author
-
Snowden, Robert J., Kydd-Coutts, Megan, Varney, Ellie-May, Rosselli, Olivia, and Gray, Nicola S.
- Subjects
-
RESOURCE allocation, STIMULUS & response (Psychology), ATTENTION, HYPOTHESIS
- Abstract
Visual events of high salience are thought to automatically attract visual processing resources to their location. Hence, we should expect that stimuli with sexual content should trigger such a movement of visual resources. However, evidence for such an allocation of visual resources is sparse and rather contradictory. In two studies we tested this hypothesis. Using a dot-probe task, Experiment 1 showed that targets occurring at the location of a briefly presented and uninformative cue (hence engaging "exogenous" attention) with sexual content were responded to more rapidly than those that occurred at the location of the neutral cue - thus confirming that sexual stimuli can attract automatic attention to their location. However, the effect was small and had a low level of reliability. No consistent gender differences were found. In Experiment 2, we examined whether this cueing effect remained even for low-visibility cues. No cueing effects were found, but the task manipulation also abolished the cueing effect for high visibility cues. While the study supports the notion of spatial allocation of visual resources to sexual stimuli, it highlights that this effect is not robust or reliable, and discusses the implications of this. [ABSTRACT FROM AUTHOR]
- Published
- 2024
15. Multiscale unsupervised network for deformable image registration.
- Author
-
Wang, Yun, Chang, Wanru, Huang, Chongfei, and Kong, Dexing
- Subjects
-
*IMAGE segmentation, *PIXELS, *RECORDING & registration, *ANNOTATIONS
- Abstract
Deformable image registration (DIR) plays an important part in many clinical tasks, and deep learning has made significant progress in DIR over the past few years. This work proposes a fast multiscale unsupervised deformable image registration method (referred to as FMIRNet) for monomodal image registration. We designed a multiscale fusion module to estimate the large displacement field by combining and refining the deformation fields of three scales. The spatial attention mechanism was employed in our fusion module to weight the displacement field pixel by pixel. In addition to mean square error (MSE), we added a structural similarity (SSIM) measure during the training phase to enhance the structural consistency between the deformed images and the fixed images. Our registration method was evaluated on EchoNet, CHAOS and SLIVER, and showed improved performance in terms of SSIM, NCC and NMI scores. Furthermore, we integrated FMIRNet into segmentation networks (FCN, UNet) to boost the segmentation task on a dataset with few manual annotations in our joint learning frameworks. The experimental results indicated that the joint segmentation methods improved performance in terms of Dice, HD and ASSD scores. Our proposed FMIRNet is effective for large deformation estimation, and its registration capability is generalizable and robust in joint registration and segmentation frameworks to generate reliable labels for training segmentation tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
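Note on record 15: the abstract says a structural similarity term was added alongside mean square error during training, without giving the exact form. One plausible way to combine the two is sketched below. It is an illustration under stated assumptions, not the FMIRNet loss: SSIM is computed once over the whole image rather than over local windows, and the names `global_ssim`, `registration_loss`, and the `ssim_weight` parameter are hypothetical. The constants 0.01 and 0.03 are the stabilisers conventionally used in the SSIM definition.

```python
def _flatten(image):
    """Turn an H x W list of rows into a flat list of pixels."""
    return [pixel for row in image for pixel in row]


def global_ssim(x, y, dynamic_range=1.0):
    """Single-window SSIM between two flat, equal-length pixel lists."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) * (a - mu_x) for a in x) / n
    var_y = sum((b - mu_y) * (b - mu_y) for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2  # stabilises the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilises the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def registration_loss(warped, fixed, ssim_weight=1.0):
    """MSE plus a weighted (1 - SSIM) penalty between two H x W images.

    Assumption: intensities are scaled to [0, 1]; `ssim_weight` is a
    hypothetical balancing factor, not a value taken from the paper.
    """
    w, f = _flatten(warped), _flatten(fixed)
    mse = sum((a - b) ** 2 for a, b in zip(w, f)) / len(w)
    return mse + ssim_weight * (1.0 - global_ssim(w, f))


if __name__ == "__main__":
    moving = [[0.0, 0.5], [1.0, 0.5]]
    target = [[0.5, 0.5], [0.5, 0.5]]
    print(registration_loss(moving, moving))  # perfectly aligned pair
    print(registration_loss(moving, target))  # structure lost in the target
```

The MSE term penalises per-pixel intensity differences, while the SSIM term penalises loss of structure (mean, variance, and covariance), which is the consistency the abstract refers to.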
16. Hemispheric Asymmetry in TMS-Induced Effects on Spatial Attention: A Meta-Analysis.
- Author
-
Wang, Ting, de Graaf, Tom, Tanner, Lisabel, Schuhmann, Teresa, Duecker, Felix, and Sack, Alexander T.
- Subjects
-
*TRANSCRANIAL magnetic stimulation, *CEREBRAL dominance, *ATTENTIONAL bias, *BRAIN stimulation, *BISECTORS (Geometry)
- Abstract
Hemispheric asymmetry is a fundamental principle in the functional architecture of the brain. It plays an important role in attention research where right hemisphere dominance is core to many attention theories. Lesion studies seem to confirm such hemispheric dominance with patients being more likely to develop left hemineglect after right hemispheric stroke than vice versa. However, the underlying concept of hemispheric dominance is still not entirely clear. Brain stimulation studies using transcranial magnetic stimulation (TMS) might be able to illuminate this concept. To examine the putative hemispheric asymmetry in spatial attention, we conducted a meta-analysis of studies applying inhibitory TMS protocols to the left or right posterior parietal cortices (PPC), assessing effects on attention biases with the landmark and line bisection task. A total of 18 studies including 222 participants from 1994 to February 2022 were identified. The analysis revealed a significant shift of the perceived midpoint towards the ipsilateral hemifield after right PPC suppression (Cohen's d = 0.52), but no significant effect after left PPC suppression (Cohen's d = 0.26), suggesting a hemispheric asymmetry even though the subgroup difference does not reach significance (p = .06). A complementary Bayesian meta-analysis revealed a high probability of at least a medium effect size after right PPC disruption versus a low probability after left PPC disruption. This is the first quantitative meta-analysis supporting right hemisphere-specific TMS-induced spatial attention deficits, mimicking hemineglect in healthy participants. We discuss the result in the light of prominent attention theories, ultimately concluding how difficult it remains to differentiate between these theories based on attentional bias scores alone. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. Smartphone App to Detect Pathological Myopia Using Spatial Attention and Squeeze‐Excitation Network as a Classifier and Segmentation Encoder.
- Author
-
Ali, Sarvat and Raut, Shital
- Subjects
-
*VISION disorders, *ARTIFICIAL intelligence, *MOBILE apps, *GENERAL practitioners, *MEDICAL screening
- Abstract
Pathological myopia (PM) is a worldwide visual health concern that can cause irreversible vision impairment. It affects up to 20 crore (200 million) people, causing social and economic burdens. Initial screening of PM using computer‐aided diagnosis (CAD) can prevent loss of time and finances for intricate treatments later on. Current research utilizes complex models that are too resource‐intensive or lack explanations behind the categorizations. To emphasize the significance of artificial intelligence for ophthalmic usage and address the limitations of current studies, we have designed a mobile‐compatible application for smartphone users to detect PM. For this purpose, we have developed a lightweight model, using the enhanced MobileNetV3 architecture integrated with spatial attention (SA) and squeeze‐excitation (SE) modules to effectively capture lesion location and channel features. To demonstrate its robustness, the model is tested against three heterogeneous datasets, namely PALM, RFMID, and ODIR, reporting area under curve (AUC) scores of 0.9983, 0.95, and 0.94, respectively. In order to support PM categorization and demonstrate its correlation with the associated lesions, we have segmented different forms of PM lesion atrophy, which gave us an intersection over union (IOU) score of 0.96 and an F-score of 0.97 using the same SA+SE inclusive MobileNetV3 as an encoder. This lesion segmentation can aid ophthalmologists in further analysis and treatment. The optimized and explainable model version is calibrated to develop the smartphone application, which can classify a fundus image as PM or normal vision. This app is appropriate for ophthalmologists seeking second opinions or for rural general practitioners referring PM cases to specialists. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. Implementation of an online spacing flanker task and evaluation of its test–retest reliability using measures of inhibitory control and the distribution of spatial attention.
- Author
-
Lee, Sang Ho and Pitt, Mark A.
- Subjects
-
*RESPONSE inhibition, *STATISTICAL reliability, *TASK analysis, *INTEGRATED software, *PSYCHOPHYSICS
- Abstract
The flanker task (Eriksen & Eriksen, Perception & Psychophysics, 16(1), 143-149, 1974) has been highly influential and widely used in studies of visual attention. Its simplicity has made it popular to include it in experimental software packages and online platforms. The spacing flanker task (SFT), in which the distance between the target and flankers varies, is useful for studying the distribution of attention across space as well as inhibitory control. Use of the SFT requires that the viewing environment (e.g., stimulus size and viewing distance) be controlled, which is a challenge for online delivery. We implement and evaluate an online version of the SFT that includes two calibration pretests to provide the necessary control. Test–retest and split-half reliability of the online version was compared with a laboratory version on measures of inhibitory control and measures of the distribution of attention across space. Analyses show that the online SFT is comparable to laboratory testing on all measures. Results also identify two measures with good test–retest reliability that hold promise for studying performance in the SFT: the mean flanker effect (ICC = 0.745) and RTs on incongruent trials across distances (ICC = 0.65–0.71). [ABSTRACT FROM AUTHOR]
- Published
- 2024
19. Low Lumination Image Enhancement with Transformer based Curve Learning.
- Author
-
Yulin Cao, Chunyu Li, Guoqing Zhang, and Yuhui Zheng
- Subjects
-
CONVOLUTIONAL neural networks, LEARNING curve, TRANSFORMER models, IMAGE intensifiers, ALGORITHMS
- Abstract
Images taken in low lumination conditions suffer from low contrast and loss of information. Low lumination image enhancement algorithms are required to improve the quality and broaden the applications of such images. In this study, we proposed a new low lumination image enhancement architecture consisting of a transformer-based curve learning network and an encoder-decoder-based texture enhancer. Considering the high effectiveness of curve matching, we constructed a transformer-based network to estimate the learnable curve for pixel mapping. Curve estimation requires global relationships that can be extracted through the transformer framework. To further improve the texture detail, we introduced an encoder-decoder network to extract local features and suppress the noise. Experiments on the LOL and SID datasets showed that the proposed method not only performs competitively with state-of-the-art techniques but is also highly efficient. [ABSTRACT FROM AUTHOR]
- Published
- 2024
20. Light field depth estimation based on corrected occlusion awareness.
- Author
-
倪 竞, 邓慧萍, 向 森, and 吴 谨
- Subjects
-
FEATURE extraction, OPTICAL images, ALGORITHMS, ENCODING
- Abstract
Copyright of Chinese Journal of Liquid Crystal & Displays is the property of Chinese Journal of Liquid Crystal & Displays and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
21. DaSAM: Disease and Spatial Attention Module-Based Explainable Model for Brain Tumor Detection.
- Author
-
Tehsin, Sara, Nasir, Inzamam Mashood, Damaševičius, Robertas, and Maskeliūnas, Rytis
- Subjects
-
CANCER diagnosis, BRAIN cancer diagnosis, BRAIN tumors, MAGNETIC resonance imaging, NOSOLOGY
- Abstract
Brain tumors are the result of irregular development of cells. They are a major cause of adult demise worldwide. Several deaths can be avoided with early brain tumor detection. Magnetic resonance imaging (MRI) for earlier brain tumor diagnosis may improve the chance of survival for patients. The most common method of diagnosing brain tumors is MRI. The improved visibility of malignancies in MRI makes therapy easier. The diagnosis and treatment of brain cancers depend on their identification and treatment. Numerous deep learning models have been proposed over the last decade, including AlexNet, VGG, Inception, ResNet, DenseNet, etc. All these models are trained on a huge dataset, ImageNet. These general models have many parameters, which become irrelevant when implementing these models for a specific problem. This study uses a custom deep-learning model for the classification of brain MRIs. The proposed Disease and Spatial Attention Model (DaSAM) has two modules: (a) the Disease Attention Module (DAM), to distinguish between disease and non-disease regions of an image, and (b) the Spatial Attention Module (SAM), to extract important features. The experiments are conducted on two publicly available multi-class datasets, the Figshare and Kaggle datasets, where the proposed model achieves precision values of 99% and 96%, respectively. The proposed model is also tested using cross-dataset validation, where it achieved 85% accuracy when trained on the Figshare dataset and validated on the Kaggle dataset. The incorporation of the DAM and SAM modules enabled feature mapping, which proved useful for highlighting important features during the decision-making process of the model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
22. ALFormer: Attribute Localization Transformer in Pedestrian Attribute Recognition.
- Author
-
Yuxin Liu, Mingzhe Wang, Chao Li, and Shuoyan Liu
- Abstract
Pedestrian attribute recognition is an important task for intelligent video surveillance. However, existing methods struggle to accurately localize discriminative regions for each attribute. We propose Attribute Localization Transformer (ALFormer), a novel framework to improve spatial localization through two key components. First, we introduce Mask Contrast Learning (MCL) to suppress regional feature relevance, forcing the model to focus on intrinsic spatial areas for each attribute. Second, we design an Attribute Spatial Memory (ASM) module to generate reliable attention maps that capture inherent locations for each attribute. Extensive experiments on two benchmark datasets demonstrate state-of-the-art performance of ALFormer. Ablation studies and visualizations verify the effectiveness of the proposed modules in improving attribute localization. Our work provides a simple yet effective approach to exploit spatial consistency for enhanced pedestrian attribute recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. QualityNet: A multi-stream fusion framework with spatial and channel attention for blind image quality assessment
- Author
-
Muhammad Azeem Aslam, Xu Wei, Hassan Khalid, Nisar Ahmed, Zhu Shuangtong, Xin Liu, and Yimei Xu
- Subjects
Image Quality Assessment, Blind Image Quality Assessment, BIQA, Spatial Attention, Channel Attention, Multi-stream Model and objective quality assessment, Medicine, Science
- Abstract
This study introduces a novel Blind Image Quality Assessment (BIQA) approach leveraging a multi-stream spatial and channel attention model. Our method addresses challenges posed by diverse image content and distortions by integrating feature maps from two distinct backbones. Through spatial and channel attention mechanisms, our algorithm prioritizes regions of interest, enhancing its ability to capture crucial image details. Extensive evaluations on four benchmark datasets demonstrate superior performance compared to existing methods, closely aligning with human perceptual assessment. Our approach exhibits exceptional generalization capabilities on both authentic and synthetic distortion databases. Moreover, it demonstrates a distinctive focus on perceptual foreground information, enhancing its practical applicability. Thorough quantitative analyses underscore the algorithm’s superior performance, establishing its dominance over existing methods.
- Published
- 2024
- Full Text
- View/download PDF
24. Attentional templates for target features versus locations
- Author
-
Mikel Jimenez, Ziyi Wang, and Anna Grubert
- Subjects
Visual attention, Attentional templates, Spatial attention, Feature-based attention, ERP, N2pc, Medicine, Science
- Abstract
Visual search is guided by visual working memory representations (i.e., attentional templates) that are activated prior to search and contain target-defining features (e.g., color). In the present study, we tested whether attentional templates can also contain spatial target properties (knowing where to look) and whether attentional selection guided by such location-specific templates is as efficient as selection based on feature-specific templates (knowing what to look for). In every trial, search displays were preceded by either semantic color cues or location cues, indicating the upcoming target color or location, respectively. Qualitative differences between feature- and location-based template guidance were assessed in terms of selection efficiency in low-load (one target color/location) versus high-load trials (two target colors/locations). Behavioral and electrophysiological (N2pc) measures of target selection speed and accuracy were combined for converging evidence. In line with previous studies, we found that color search was highly efficient, even under high-load conditions, when multiple attentional templates were activated to guide attentional selection in a spatially global fashion. Importantly, results in the location task almost perfectly mirrored the findings of the color task, suggesting that multiple templates for different target locations were activated concurrently when two possible target locations were task relevant. Our findings align with accounts that assume a common neuronal network during preparation for location and color search, but regard spatial and feature-based selection mechanisms as independent.
- Published
- 2024
- Full Text
- View/download PDF
25. Dual-branch network object detection algorithm based on dual-modality fusion of visible and infrared images.
- Author
-
Hou, ZhiQiang, Li, Xinyue, Yang, Chen, Ma, Sugang, Yu, Wangsheng, and Wang, Yunchen
- Abstract
To address the limitations of visible images in object detection, this paper proposes a dual-branch network object detection algorithm based on dual-modality fusion of visible and infrared images. Based on YOLOv7-s, the algorithm first introduces a spatial attention module to enhance the model’s ability to capture key features. Second, to resolve the problem of inconsistent object sizes, a visible multi-scale feature fusion module is proposed; meanwhile, the structure of the SimCSPSPPF module (an improved spatial pyramid pooling module) from YOLOv6 is adopted to construct an infrared multi-scale feature fusion module that efficiently extracts multi-scale features from infrared images. Finally, a cross-modal feature fusion module is proposed to fuse corresponding scale features from visible and infrared images. The proposed algorithm is tested on the KAIST, FLIR, and GIR datasets. Compared with the YOLOv7-s algorithm detecting visible and infrared images separately on the KAIST dataset, detection accuracy is improved by 18.0% and 5.1%, respectively, and detection speed is 51.8 FPS; on the FLIR and GIR datasets, the proposed algorithm also demonstrates significant advantages. Furthermore, the proposed algorithm can detect objects on individual visible or infrared images while maintaining high detection accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. The Use of Attention-Enhanced CNN-LSTM Models for Multi-Indicator and Time-Series Predictions of Surface Water Quality.
- Author
-
Zhang, Minhao, Zhang, Zhiyu, Wang, Xuan, Liao, Zhenliang, and Wang, Lijin
- Subjects
CONVOLUTIONAL neural networks, LONG-term memory, WATER quality, DEEP learning, TIME series analysis, WATER quality monitoring
- Abstract
Deep learning (DL) has recently been applied to surface water quality prediction. However, online monitoring data consist of multiple indicators and time series, which are challenging for prediction models due to complex temporal dependencies and inter-indicator mechanisms. Convolutional neural networks (CNN) and long short-term memory (LSTM) can be used for the indicator and temporal domains, respectively, but still lack the ability to represent complex patterns in surface water quality. Since attention mechanisms are designed to focus on the most crucial information, the spatial attention mechanism (SAM) and temporal attention mechanism (TAM) are suitable for dealing with these multi-indicator and time-series issues. This work incorporates SAM and TAM into the CNN-LSTM model to form four DL models for water quality prediction: CNN-LSTM, SAM-enhanced CNN-LSTM, TAM-enhanced CNN-LSTM, and CNN-LSTM enhanced by both attention mechanisms. Four surface water online monitoring sites are used as case studies to examine the models in predicting three water quality indicators: dissolved oxygen (DO), ammonia nitrogen (NH3-N), and total organic carbon (TOC). According to the case results of the four models after training with similar numbers of epochs, the prediction accuracies of the attention-enhanced models are better than that of the CNN-LSTM model, and the model with both attention mechanisms generally achieves the best performance among the four. The NSE values for DO prediction by the four models are 0.817, 0.948, 0.952, and 0.967, respectively, in a representative case, Jiujiang. The results demonstrate that spatial and temporal attention can analyze correlations across multiple indicators and time series of water quality data, respectively, to improve the accuracy of surface water quality prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
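The NSE scores quoted in record 26 can be made concrete with a short sketch. It assumes that NSE there denotes the Nash-Sutcliffe efficiency (the abstract does not expand the acronym), defined as one minus the ratio of the squared prediction error to the variance of the observations around their mean. The function name and the sample readings are hypothetical and are not taken from the paper.

```python
def nash_sutcliffe_efficiency(observed, predicted):
    """Return 1 - sum((o - p)^2) / sum((o - mean(o))^2).

    A value of 1.0 means a perfect fit; 0.0 means the model is no
    better than always predicting the mean of the observations.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total_variation = sum((o - mean_obs) ** 2 for o in observed)
    if total_variation == 0:
        raise ValueError("observed series is constant; NSE is undefined")
    return 1.0 - squared_error / total_variation


# Hypothetical dissolved-oxygen readings (mg/L), for illustration only.
observed_do = [8.0, 7.5, 7.0, 6.5, 7.0]
predicted_do = [7.8, 7.6, 7.1, 6.6, 6.9]
print(round(nash_sutcliffe_efficiency(observed_do, predicted_do), 3))
```

Under this definition, the reported rise from 0.817 to 0.967 would mean that the attention-enhanced models leave a smaller share of the observed variation unexplained.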
27. BAIDet: remote sensing image object detector based on background and angle information.
- Author
-
Yu, Jiangfeng, Sun, Lin, Song, Shuhua, Guo, Guolong, and Chen, Kai
- Abstract
Remote sensing object detection is complicated by large differences in object size, arbitrary orientation, and tight arrangement, which make object recognition and localization difficult. Therefore, a remote sensing image object Detector (BAIDet) based on Background and Angle Information is proposed in this paper. Firstly, a large convolutional kernel global attention module is designed to fully utilize the global information of remote sensing images by expanding the receptive field, and the edge information of ground objects is obtained through deformable convolution. Secondly, an angle-sensitive probabilistic intersection-over-union loss function (AS-ProbIoU Loss) is developed for bounding box regression in oriented object detection. Finally, experimental results on four remote sensing image datasets, DOTA, HRSC 2016, UCAS-AOD, and DIOR-R, demonstrate the effectiveness of this method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Attention spotlight in V1-based cortico-cortical interactions in human visual hierarchy
- Author
-
Yanyu Zhang, Xilin Zhang, Xincheng Lu, and Nihong Chen
- Subjects
Spatial attention, fMRI, Connectivity, DCM, V1, Visual cortex, Medicine, Science
- Abstract
Attention is often viewed as a mental spotlight, which can be scaled like a zoom lens at specific spatial locations and features a center-surround gradient. Here, we demonstrate a neural signature of attention spotlight in signal transmission along the visual hierarchy. fMRI background connectivity analysis was performed between retinotopic V1 and downstream areas to characterize the spatial distribution of inter-areal interaction under two attentional states. We found that, compared to diffused attention, focal attention sharpened the spatial gradient in the strength of the background connectivity. Dynamic causal modeling analysis further revealed the effect of attention in both the feedback and feedforward connectivity between V1 and extrastriate cortex. In a context which induced a strong effect of crowding, the effect of attention in the background connectivity profile diminished. Our findings reveal a context-dependent attention prioritization in information transmission via modulating the recurrent processing across the early stages in human visual cortex.
- Published
- 2024
- Full Text
- View/download PDF
29. Sustained bias of spatial attention in a 3 T MRI scanner
- Author
-
Stefan Smaczny, Leonie Behle, Sara Kuppe, Hans-Otto Karnath, and Axel Lindner
- Subjects
Magnetic vestibular stimulation (MVS), MRI, Spatial attention, Spatial neglect, Nystagmus, Vestibular ocular reflex (VOR), Medicine, Science
- Abstract
When a subject lies inside an MRI scanner, even in the absence of any motion, the scanner's static magnetic field induces a magneto-hydrodynamic stimulation of the subject’s vestibular organ (MVS). MVS thereby not only causes a horizontal vestibular nystagmus but also induces a horizontal bias in spatial attention. In this study, we aimed to determine the time course of MVS-induced biases in both VOR and spatial attention inside a 3 T MRI scanner as well as their respective aftereffects after participants left the scanner. Eye movements and overt spatial attention in a visual search task were assessed in healthy volunteers before, during, and after a one-hour MVS period. All participants exhibited a VOR inside the scanner, which declined over time but never vanished completely. Importantly, there was also an MVS-induced horizontal bias in spatial attention and exploration, which persisted throughout the entire hour within the scanner. Upon exiting the scanner, we observed aftereffects in the opposite direction in both the VOR and spatial attention, which were statistically no longer detectable after 7 min. Sustained MVS effects on spatial attention have important implications for the design and interpretation of fMRI studies and for the development of therapeutic interventions counteracting spatial neglect.
- Published
- 2024
- Full Text
- View/download PDF
30. PH-CBAM: A Parallel Hybrid CBAM Network with Multi-Feature Extraction for Facial Expression Recognition.
- Author
-
Liao, Liefa, Wu, Shouluan, Song, Chao, and Fu, Jianglong
- Subjects
CONVOLUTIONAL neural networks, FACIAL expression, FEATURE extraction, EMOTIONS, ATTENTION
- Abstract
Convolutional neural networks have made significant progress in human Facial Expression Recognition (FER). However, they still face challenges in effectively focusing on and extracting facial features. Recent research has turned to attention mechanisms to address this issue, focusing primarily on local feature details rather than overall facial features. Building upon the classical Convolutional Block Attention Module (CBAM), this paper introduces a novel Parallel Hybrid Attention Model, termed PH-CBAM. This model employs split-channel attention to enhance the extraction of key features while maintaining a minimal parameter count. The proposed model enables the network to emphasize relevant details during expression classification. Heatmap analysis demonstrates that PH-CBAM effectively highlights key facial information. By employing a multimodal extraction approach in the initial image feature extraction phase, the network structure captures various facial features. The algorithm integrates a residual network and the MISH activation function to create a multi-feature extraction network, addressing issues such as gradient vanishing and negative gradient zero point in residual transmission. This enhances the retention of valuable information and facilitates information flow between key image details and target images. Evaluation on benchmark datasets FER2013, CK+, and Bigfer2013 yielded accuracies of 68.82%, 97.13%, and 72.31%, respectively. Comparison with mainstream network models on FER2013 and CK+ datasets demonstrates the efficiency of the PH-CBAM model, with comparable accuracy to current advanced models, showcasing its effectiveness in emotion detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
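Record 30 builds on the Convolutional Block Attention Module (CBAM), whose spatial attention branch pools a feature map across channels, turns the pooled maps into a per-pixel mask, and rescales every channel by that mask. The sketch below illustrates that idea only and is not the PH-CBAM architecture. It assumes a feature map stored as nested lists, and it swaps CBAM's learned 7x7 convolution for a fixed per-pixel weighted sum to stay short. The names `spatial_attention`, `w_avg`, `w_max`, and `bias` are hypothetical.

```python
import math


def spatial_attention(feature_map, w_avg=1.0, w_max=1.0, bias=0.0):
    """Apply a simplified CBAM-style spatial attention mask.

    feature_map is a list of C channels, each an H x W list of floats.
    Returns (attended_feature_map, mask). The weighted sum below stands
    in for CBAM's learned convolution (an assumption of this sketch).
    """
    channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    attended = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    mask = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            values = [feature_map[c][i][j] for c in range(channels)]
            avg_pool = sum(values) / channels  # channel-wise average pooling
            max_pool = max(values)             # channel-wise max pooling
            logit = w_avg * avg_pool + w_max * max_pool + bias
            mask[i][j] = 1.0 / (1.0 + math.exp(-logit))  # sigmoid gate in (0, 1)
            for c in range(channels):
                attended[c][i][j] = feature_map[c][i][j] * mask[i][j]
    return attended, mask


# Two channels of a 1 x 2 feature map; the second pixel has stronger responses.
features = [[[0.0, 2.0]], [[0.0, 4.0]]]
attended, mask = spatial_attention(features)
print(mask)
```

Pixels with strong responses across channels receive a mask value close to 1 and pass through almost unchanged, while weakly responding pixels are scaled down.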
31. Optimization and Application of Improved YOLOv9s-UI for Underwater Object Detection.
- Author
-
Pan, Wei, Chen, Jiabao, Lv, Bangjun, and Peng, Likun
- Subjects
LIGHTING
- Abstract
The You Only Look Once (YOLO) series of object detection models is widely recognized for its efficiency and real-time performance, particularly under the challenging conditions of underwater environments, characterized by insufficient lighting and visual disturbances. By modifying the YOLOv9s model, this study aims to improve the accuracy and real-time capabilities of underwater object detection, resulting in the introduction of the YOLOv9s-UI detection model. The proposed model incorporates the Dual Dynamic Token Mixer (D-Mixer) module from TransXNet to improve feature extraction capabilities. Additionally, it integrates a feature fusion network design from the LocalMamba network, employing channel and spatial attention mechanisms. These attention modules effectively guide the feature fusion process, significantly enhancing detection accuracy while maintaining the model's compact size of only 9.3 M. Experimental evaluation on the UCPR2019 underwater object dataset shows that the YOLOv9s-UI model has higher accuracy and recall than the existing YOLOv9s model, as well as excellent real-time performance. This model significantly improves the ability of underwater target detection by introducing advanced feature extraction and attention mechanisms. The model meets portability requirements and provides a more efficient solution for underwater detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Multiscale Tea Disease Detection with Channel–Spatial Attention.
- Author
-
Sun, Yange, Jiang, Mingyi, Guo, Huaping, Zhang, Li, Yao, Jianfeng, Wu, Fei, and Wu, Gaowei
- Abstract
Tea disease detection is crucial for improving the agricultural circular economy. Deep learning-based methods have been widely applied to this task, and the main idea of these methods is to extract multiscale coarse features of diseases using the backbone network and fuse these features through the neck for accurate disease detection. This paper proposes a novel tea disease detection method that enhances feature expression of the backbone network and the feature fusion capability of the neck: (1) constructing an inverted residual self-attention module as a backbone plugin to capture the long-distance dependencies of disease spots on the leaves; and (2) developing a channel–spatial attention module with residual connection in the neck network to enhance the contextual semantic information of fused features in disease images and eliminate complex background noise. For the second step, the proposed channel–spatial attention module uses Residual Channel Attention (RCA) to enhance inter-channel interactions, facilitating discrimination between disease spots and normal leaf regions, and employs spatial attention (SA) to enhance essential areas of tea diseases. Experimental results demonstrate that the proposed method achieved accuracy and mAP scores of 92.9% and 94.6%, respectively. In particular, this method demonstrated improvements of 6.4% in accuracy and 6.2% in mAP compared to the SSD model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Airport runway foreign object detection and recognition algorithm based on multi-scale feature fusion.
- Author
-
郭晓静 and 邹松林
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
34. Using augmented reality to assess spatial neglect: The Free Exploration Test (FET).
- Author
-
Stammler, Britta, Lambert, Marian, Schuster, Thomas, Flammer, Kathrin, and Karnath, Hans-Otto
- Subjects
*UNILATERAL neglect, *BISECTORS (Geometry), *AUGMENTED reality, *VIRTUAL reality, *CAMERAS
- Abstract
Background: To capture the distortion of exploratory activity typical of patients with spatial neglect, traditional diagnostic methods and new virtual reality applications use confined workspaces that limit patients' exploration behavior to a predefined area. Our aim was to overcome these limitations and enable the recording of patients' biased activity in real, unconfined space. Methods: We developed the Free Exploration Test (FET) based on augmented reality technology. Using a live stream via the back camera on a tablet, patients search for a (non-existent) virtual target in their environment, while their exploration movements are recorded for 30 s. We tested 20 neglect patients and 20 healthy participants and compared the performance of the FET with traditional neglect tests. Results: In contrast to controls, neglect patients exhibited a significant rightward bias in exploratory movements. The FET had a high discriminative power (area under the curve = 0.89) and correlated positively with traditional tests of spatial neglect (Letter Cancellation, Bells Test, Copying Task, Line Bisection). An optimal cut-off point of the averaged bias of exploratory activity was at 9.0° on the right; it distinguished neglect patients from controls with 85% sensitivity. Discussion: FET offers time-efficient (execution time: ∼3 min), easy-to-apply, and gamified assessment of free exploratory activity. It supplements traditional neglect tests, providing unrestricted recording of exploration in the real, unconfined space surrounding the patient. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Attention and transformer complementary fusion network for hyperspectral image spectral reconstruction.
- Author
-
Zou, Changwu, Zhang, Can, and Zou, Changzhong
- Subjects
*IMAGE reconstruction, *SPECTRAL imaging, *DEEP learning, *IMAGE processing, *MULTISPECTRAL imaging, *HYPERSPECTRAL imaging systems, *FEATURE extraction, *PITFALL traps
- Abstract
Hyperspectral images can be widely used in many fields due to their high information richness. However, costly and complex imaging spectrometers limit its growth. Hyperspectral reconstruction aims to obtain the corresponding hyperspectral image from the multispectral image, to reduce the acquisition cost of hyperspectral images. At present, the related works mainly use methods based on deep learning, and they have achieved good results. However, how to fully extract the global spectral and spatial features of hyperspectral images is still not well solved. To address these issues, we propose an Attention and Transformer Complementary Fusion Network (ATCFNet), which is composed of three modules: Multi-angle Input Image Processing (MIIP), Deep Feature Extraction (DFE) and Hyperspectral Reconstruction (HR) modules. Within the DFE module, an improved Transformer module and a novel Multi-scale Spatial Attention (MSA) module are proposed to extract the global spectral relationship and the spatial features of the hyperspectral images, respectively. Moreover, the MIIP module is proposed to extract the features of input multispectral images more effectively and comprehensively. To verify these, we compare the proposed ATCFNet with other excellent reconstruction methods on three hyperspectral image datasets. The results show that our method achieves the best results in these datasets. Code is publicly available at . [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. Investigating mechanisms of the attentional repulsion effect: A diffusion model analysis.
- Author
-
Rushton, Jayce D., Lawrence, Rebecca K., and Sewell, David K.
- Subjects
*SPACE perception, *LATENT variables, *CELL physiology, *DECISION making, *ATTENTION
- Abstract
This article investigates the decisional and attentional drivers of the attentional repulsion effect (ARE) using the diffusion decision model (DDM). The ARE is a phenomenon in which a subjective expansion of space is experienced outside the focus of attention. It is thought to occur due to changes in the functioning of visual cell receptive fields. The DDM is a model of the decision-making process that assumes responses are selected by sequentially sampling an encoded representation of a stimulus until sufficient evidence has been accumulated favoring one response alternative over the other. The model decomposes observed choice and response times into different latent variables corresponding to the rate of evidence accumulation, response caution, response bias, and the time course of stimulus encoding and response execution. In this article, we interpret changes in the rate of evidence accumulation as primarily reflecting perceptually driven changes in stimulus representation. We interpret changes in response bias as primarily reflecting decision-level changes. We utilize the DDM's ability to estimate these variables independently to explore how each is affected by cueing manipulations, clarifying whether the ARE emerges due to attentional or decisional drivers, or some combination of the two. The results of this study could shed light on the mechanisms underlying the ARE and have implications for our understanding of spatial attention. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. LAW-IFF Net: A semantic segmentation method for recognition of marine current turbine blade attachments under blurry edges.
- Author
-
Qi, Fei, Wang, Tianzhen, Wang, Xiaohang, and Chen, Lisu
- Abstract
The power generation efficiency and safety of marine current turbines (MCTs) are challenged by foreign objects that often attach to the MCT blades during underwater operation. Timely and accurate recognition of attachments is essential for the stable operation of an MCT. However, underwater imaging suffers from blurry edges due to light attenuation and scattering, and accurate recognition from underwater images is difficult because blurry edges result in unclear edge features. To alleviate this problem, LAW-IFF Net is proposed in this paper, which mainly contains two parts. Firstly, this paper proposes to transform the local averages of feature maps into weight matrices, namely the locally average weighting (LAW) mechanism. It is intended to alleviate the edge gradient reduction caused by blurry edges. Secondly, the proposed improved feature fusion (IFF) mechanism aims to overcome the deviation caused by the feature fusion of different attention branches based on spatial attention. In addition, lightweight networks are combined with the proposed method to improve computation speed and ensure the timeliness of recognition. Experimental results on the MCT dataset show the superiority of the proposed method in terms of accuracy and speed of attachment recognition in images with blurry edges. The experimental results on publicly available datasets show the applicability of the proposed method to other underwater images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. ELFNet: An Effective Electricity Load Forecasting Model Based on a Deep Convolutional Neural Network with a Double-Attention Mechanism.
- Author
-
Zhao, Pei, Ling, Guang, and Song, Xiangxiang
- Subjects
CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, ELECTRICAL load, ELECTRIC power consumption, ENERGY consumption, DEEP learning
- Abstract
Forecasting energy demand is critical to ensure the steady operation of the power system. However, present approaches to estimating power load are still unsatisfactory in terms of accuracy, precision, and efficiency. In this paper, we propose a novel method, named ELFNet, for estimating short-term electricity consumption, based on the deep convolutional neural network model with a double-attention mechanism. The Gramian Angular Field method is utilized to convert electrical load time series into 2D image data for input into the proposed model. The prediction accuracy is greatly improved through the use of a convolutional neural network to extract the intrinsic characteristics from the input data, along with channel attention and spatial attention modules, to enhance the crucial features and suppress the irrelevant ones. The present ELFNet method is compared to several classic deep learning networks across different prediction horizons using publicly available data on real power demands from the Belgian grid firm Elia. The results show that the suggested approach is competitive and effective for short-term power load forecasting. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
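Record 38 converts load time series into 2D images with the Gramian Angular Field method. The sketch below shows one common form of that encoding, the summation variant; the abstract does not say which variant ELFNet uses, so that choice is an assumption. The series is rescaled to [-1, 1], each value is mapped to an angle by arccos, and image pixel (i, j) is cos(angle_i + angle_j). The function name and the sample load values are hypothetical.

```python
import math


def gramian_angular_summation_field(series):
    """Encode a 1D series as an N x N image with entries cos(phi_i + phi_j)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    # Rescale to [-1, 1] so that arccos is defined for every value.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp to guard against floating-point drift just outside [-1, 1].
    angles = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(a + b) for b in angles] for a in angles]


# Hypothetical hourly load values (MW), for illustration only.
load = [9500.0, 9800.0, 10400.0, 10100.0]
image = gramian_angular_summation_field(load)
for row in image:
    print([round(v, 3) for v in row])
```

The resulting matrix is symmetric and its diagonal encodes the rescaled series itself, which is what allows a 2D convolutional network to read temporal structure from the image.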
39. Enhancing rhythmic temporal expectations: The dominance of auditory modality under spatial uncertainty.
- Author
-
Attout, Lucie, Capizzi, Mariagrazia, and Charras, Pom
- Abstract
To effectively process the most relevant information, the brain anticipates the optimal timing for allocating attentional resources. Behavior can be optimized by automatically aligning attention with external rhythmic structures, whether visual or auditory. Although the auditory modality is known for its efficacy in representing temporal information, the current body of research has not conclusively determined whether visual or auditory rhythmic presentations have a definitive advantage in entraining temporal attention. The present study directly examined the effects of auditory and visual rhythmic cues on the discrimination of visual targets in Experiment 1 and on auditory targets in Experiment 2. Additionally, the role of endogenous spatial attention was also considered. When and where the target was the most likely to occur were cued by unimodal (visual or auditory) and bimodal (audiovisual) signals. A sequence of salient events was employed to elicit rhythm-based temporal expectations and a symbolic predictive cue served to orient spatial attention. The results suggest a superiority of auditory over visual rhythms, irrespective of spatial attention, whether the spatial cue and rhythm converge or not (unimodal or bimodal), and regardless of the target modality (visual or auditory). These findings are discussed in terms of a modality-specific rhythmic orienting, while considering a single, supramodal system operating in a top-down manner for endogenous spatial attention. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Peripheral vision contributes to implicit attentional learning: Findings from the "mouse-eye" paradigm.
- Author
-
Chen, Chen and Lee, Vanessa G.
- Abstract
The central visual field is essential for activities like reading and face recognition. However, the impact of peripheral vision loss on daily activities is profound. While the importance of central vision is well established, the contribution of peripheral vision to spatial attention is less clear. In this study, we introduced a "mouse-eye" method as an alternative to traditional gaze-contingent eye tracking. We found that even in tasks requiring central vision, peripheral vision contributes to implicit attentional learning. Participants searched for a T among Ls, with the T appearing more often in one visual quadrant. Earlier studies showed that participants' awareness of the T location probability was not essential for their ability to learn. When we limited the visible area around the mouse cursor, only participants aware of the target's location probability showed learning; those unaware did not. Adding placeholders in the periphery did not restore implicit attentional learning. A control experiment showed that when participants were allowed to see all items while searching and moving the mouse to reveal the target's color, both aware and unaware participants acquired location probability learning. Our results underscore the importance of peripheral vision in implicitly guided attention. Without peripheral vision, only explicit, but not implicit, attentional learning prevails. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Intermixed levels of visual search difficulty produce asymmetric probability learning.
- Author
-
Won, Bo-Yeong and Leber, Andrew B.
- Abstract
When performing novel tasks, we often apply the rules we have learned from previous, similar tasks. Knowing when to generalize previous knowledge, however, is a complex challenge. In this study, we investigated the properties of learning generalization in a visual search task, focusing on the role of search difficulty. We used a spatial probability learning paradigm in which individuals learn to prioritize their search toward the locations where a target appears more often (i.e., high-probable location) than others (i.e., low-probable location) in a search display. In the first experiment, during a training phase, we intermixed the easy and difficult search trials within blocks, and each was respectively paired with a distinct high-probable location. Then, during a testing phase, we removed the probability manipulation and assessed any generalization of spatial biases to a novel, intermediate difficulty task. Results showed that, as training progressed, the easy search evoked a stronger spatial bias to its high-probable location than the difficult search. Moreover, there was greater generalization of the easy search learning than difficult search learning at test, revealed by a stronger bias toward the former's high-probable location. Two additional experiments ruled out alternatives that learning during difficult search itself is weak and learning during easy search specifically weakens learning of the difficult search. Overall, the results demonstrate that easy search interferes with difficult search learning and generalizability when the two levels of search difficulty are intermixed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. The who and the where: Attention to identities and locations in groups.
- Author
-
Ma, Helen L., Redden, Ralph S., and Hayward, Dana A.
- Abstract
While it is widely accepted that the single gaze of another person elicits shifts of attention, there is limited work on the effects of multiple gazes on attention, despite real-world social cues often occurring in groups. Further, less is known regarding the role of unequal reliability of varying social and nonsocial information on attention. We addressed these gaps by employing a variant of the gaze cueing paradigm, simultaneously presenting participants with three faces. Block-wise, we manipulated whether one face (Identity condition) or one location (Location condition) contained a gaze cue entirely predictive of target location; all other cues were uninformative. Across trials, we manipulated the number of valid cues (number of faces gazing at target). We examined whether these two types of information (Identity vs. Location) were learned at a similar rate by statistically modelling cueing effects by trial count. Preregistered analyses returned no evidence for an interaction between condition, number of valid faces, and presence of the predictive element, indicating type of information did not affect participants' ability to employ the predictive element to alter behaviour. Exploratory analyses demonstrated (i) response times (RT) decreased faster across trials for the Identity compared with Location condition, with greater decreases when the predictive element was present versus absent, (ii) RTs decreased across trials for the Location condition only when it was completed first, and (iii) social competence altered RTs across conditions and trial number. Our work demonstrates a nuanced relationship between cue utility, condition type, and social competence on group cueing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
43. Fourier Ptychographic Microscopy Reconstruction Method Based on Residual Local Mixture Network.
- Author
-
Wang, Yan, Wang, Yongshan, Li, Jie, and Wang, Xiaoli
- Subjects
-
*FOURIER transform optics, *MICROSCOPY, *IMAGE reconstruction
- Abstract
Fourier Ptychographic Microscopy (FPM) is a microscopy imaging technique based on optical principles. It employs Fourier optics to separate and combine different optical information from a sample. However, noise introduced during the imaging process often results in poor resolution of the reconstructed image. This article proposes an approach based on a residual local mixture network to improve the quality of Fourier ptychographic reconstruction images. By incorporating channel attention and spatial attention into the FPM reconstruction process, the network improves reconstruction efficiency and reduces reconstruction time. Additionally, the introduction of the Gaussian diffusion model further reduces coherent artifacts and improves image reconstruction quality. Comparative experimental results indicate that this network achieves better reconstruction quality, outperforming existing methods in both subjective observation and objective quantitative evaluation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
44. Auditory vigilance task performance and cerebral hemodynamics: effects of spatial uncertainty.
- Author
-
Hess, Lucas J. and Greenlee, Eric T.
- Subjects
-
*TASK performance, *CEREBRAL circulation, *TRANSCRANIAL Doppler ultrasonography, *HEMODYNAMICS, *AUDITORY perception, *PRESBYCUSIS
- Abstract
The vigilance decrement, a temporal decline in detection performance, has been observed across multiple sensory modalities. Spatial uncertainty about the location of task-relevant stimuli has been demonstrated to increase the demands of vigilance and increase the severity of the vigilance decrement when attending to visual displays. The current study investigated whether spatial uncertainty also increases the severity of the vigilance decrement and task demands when an auditory display is used. Individuals monitored an auditory display to detect critical signals that were shorter in duration than non-target stimuli. These auditory stimuli were presented in either a consistent, predictable pattern that alternated sound presentation from left to right (spatial certainty) or an inconsistent, unpredictable pattern that randomly presented sounds from the left or right (spatial uncertainty). Cerebral blood flow velocity (CBFV) was measured to assess the neurophysiological demands of the task. A decline in performance and CBFV was observed in both the spatially certain and spatially uncertain conditions, suggesting that spatial auditory vigilance tasks are demanding and can result in a vigilance decrement. Spatial uncertainty resulted in a more severe vigilance decrement in correct detections compared to spatial certainty. Reduced right-hemispheric CBFV was also observed during spatial uncertainty compared to spatial certainty. Together, these results suggest that auditory spatial uncertainty hindered performance and imposed greater attentional demands compared to spatial certainty. These results concur with previous research showing the negative impact of spatial uncertainty in visual vigilance tasks, but they contrast with recent research showing no effect of spatial uncertainty on tactile vigilance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
45. A Side-Lobe Denoise Process for ISAR Imaging Applications: Combined Fast Clean and Spatial Focus Technique.
- Author
-
Xv, Jia-Hua, Zhang, Xiao-Kuan, Zong, Bin-feng, and Zheng, Shu-Yu
- Subjects
-
*INVERSE synthetic aperture radar, *AUTOMATIC target recognition, *DEEP learning, *MULTISPECTRAL imaging
- Abstract
The presence of side-lobe noise degrades the image quality and adversely affects the performance of inverse synthetic aperture radar (ISAR) image understanding applications, such as automatic target recognition (ATR) and target detection. However, methods reliant on data processing, such as windowing, inevitably encounter resolution reduction, and current deep learning approaches under-appreciate the sparsity inherent in ISAR images. Taking the above analysis into consideration, a convolutional neural network-based process for ISAR side-lobe noise training is proposed in this paper. The proposed process, based on an analysis of the sparsity characteristics of ISAR images, introduces enhancements in three core areas: dataset construction, prior network design, and loss function improvement. For dataset construction, we introduce a bin-by-bin fast clean algorithm and accelerate the processing speed significantly on the basis of complete image information. Subsequently, a spatial attention layer, designed to augment the network's focus on the crucial regions of ISAR images, is incorporated into the prior network. In addition, a loss function featuring a weighting factor is devised to ensure the precise recovery of the strong scattering point. Simulation experiments demonstrate that the proposed process achieves significant improvements in both quantitative and qualitative results over classical denoising methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
46. The Visual Systems of Zebrafish.
- Author
-
Baier, Herwig and Scott, Ethan K.
- Subjects
-
*RETINAL ganglion cells, *BEHAVIORAL neuroscience, *MATCHED filters, *OPTICAL flow, *MOTOR vehicle driving
- Abstract
The zebrafish visual system has become a paradigmatic preparation for behavioral and systems neuroscience. Around 40 types of retinal ganglion cells (RGCs) serve as matched filters for stimulus features, including light, optic flow, prey, and objects on a collision course. RGCs distribute their signals via axon collaterals to 12 retinorecipient areas in forebrain and midbrain. The major visuomotor hub, the optic tectum, harbors nine RGC input layers that combine information on multiple features. The retinotopic map in the tectum is locally adapted to visual scene statistics and visual subfield–specific behavioral demands. Tectal projections to premotor centers are topographically organized according to behavioral commands. The known connectivity in more than 20 processing streams allows us to dissect the cellular basis of elementary perceptual and cognitive functions. Visually evoked responses, such as prey capture or loom avoidance, are controlled by dedicated multistation pathways that—at least in the larva—resemble labeled lines. This architecture serves the neuronal code's purpose of driving adaptive behavior. [ABSTRACT FROM AUTHOR]
- Published
- 2024
47. An Enhanced GAN for Image Generation.
- Author
-
Tian, Chunwei, Gao, Haoyang, Wang, Pengwei, and Zhang, Bob
- Subjects
GENERATIVE adversarial networks, PERCEPTUAL learning, INFORMATION networks
- Abstract
Generative adversarial networks (GANs) with gaming abilities have been widely applied in image generation. However, gamistic generators and discriminators may reduce the robustness of the obtained GANs in image generation under varying scenes. Enhancing the relation of hierarchical information in a generation network and enlarging differences of different network architectures can facilitate more structural information to improve the generation effect for image generation. In this paper, we propose an enhanced GAN via improving a generator for image generation (EIGGAN). EIGGAN applies spatial attention to the generator to extract salient information to enhance the truthfulness of the generated images. Taking the relation of the context into account, parallel residual operations are fused into a generation network to extract more structural information from the different layers. Finally, a mixed loss function in a GAN is exploited to make a tradeoff between speed and accuracy to generate more realistic images. Experimental results show that the proposed method is superior to popular methods, such as Wasserstein GAN with gradient penalty (WGAN-GP), in terms of many indexes, including Frechet Inception Distance, Learned Perceptual Image Patch Similarity, Multi-Scale Structural Similarity Index Measure, Kernel Inception Distance, Number of Statistically-Different Bins, and Inception Score, and in some visual images for image generation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
48. An Improved Ningxia Desert Herbaceous Plant Classification Algorithm Based on YOLOv8.
- Author
-
Ma, Hongxing, Sheng, Tielei, Ma, Yun, and Gou, Jianping
- Subjects
-
*PLANT classification, *DESERT plants, *CLASSIFICATION algorithms, *PLANT identification, *PHYTOGEOGRAPHY, *HERBACEOUS plants
- Abstract
Wild desert grasslands are characterized by diverse habitats, uneven plant distribution, similarities among plant classes, and the presence of plant shadows. However, the existing models for detecting plant species in desert grasslands exhibit low precision, require a large number of parameters, and incur high computational cost, rendering them unsuitable for deployment in plant recognition scenarios within these environments. To address these challenges, this paper proposes a lightweight and fast plant species detection system, termed YOLOv8s-KDT, tailored for complex desert grassland environments. Firstly, the model introduces a dynamic convolutional KernelWarehouse method to reduce the dimensionality of convolutional kernels and increase their number, thus achieving a better balance between parameter efficiency and representation ability. Secondly, the model incorporates triplet attention into its feature extraction network, effectively capturing the relationship between channel and spatial position and enhancing the model's feature extraction capabilities. Finally, the introduction of a dynamic detection head tackles the issue related to target detection head and attention non-uniformity, thus improving the representation of the target detection head while reducing computational cost. The experimental results demonstrate that the upgraded YOLOv8s-KDT model can rapidly and effectively identify desert grassland plants. Compared to the original model, FLOPs decreased by 50.8%, accuracy improved by 4.5%, and mAP increased by 5.6%. Currently, the YOLOv8s-KDT model is deployed in the mobile plant identification app for Ningxia desert grassland and in the fixed-point ecological information observation platform. It facilitates the investigation of desert grassland vegetation distribution across the entire Ningxia region as well as long-term observation and tracking of plant ecological information in specific areas, such as Dashuikeng, Huangji Field, and Hongsibu in Ningxia. [ABSTRACT FROM AUTHOR]
- Published
- 2024
49. Attention spotlight in V1-based cortico-cortical interactions in human visual hierarchy.
- Author
-
Zhang, Yanyu, Zhang, Xilin, Lu, Xincheng, and Chen, Nihong
- Subjects
-
*SOCIAL interaction, *ZOOM lenses, *ATTENTION, *CAUSAL models, *MENTAL foramen, *VISUAL cortex, *FUNCTIONAL magnetic resonance imaging
- Abstract
Attention is often viewed as a mental spotlight, which can be scaled like a zoom lens at specific spatial locations and features a center-surround gradient. Here, we demonstrate a neural signature of attention spotlight in signal transmission along the visual hierarchy. fMRI background connectivity analysis was performed between retinotopic V1 and downstream areas to characterize the spatial distribution of inter-areal interaction under two attentional states. We found that, compared to diffused attention, focal attention sharpened the spatial gradient in the strength of the background connectivity. Dynamic causal modeling analysis further revealed the effect of attention in both the feedback and feedforward connectivity between V1 and extrastriate cortex. In a context which induced a strong effect of crowding, the effect of attention in the background connectivity profile diminished. Our findings reveal a context-dependent attention prioritization in information transmission via modulating the recurrent processing across the early stages in human visual cortex. [ABSTRACT FROM AUTHOR]
- Published
- 2024
50. Sustained bias of spatial attention in a 3 T MRI scanner.
- Author
-
Smaczny, Stefan, Behle, Leonie, Kuppe, Sara, Karnath, Hans-Otto, and Lindner, Axel
- Subjects
-
*ATTENTIONAL bias, *SCANNING systems, *UNILATERAL neglect, *MAGNETIC resonance imaging, *VISUAL perception, *EYE movements, *VESTIBULAR stimulation
- Abstract
When a subject lies inside an MRI scanner, even in the absence of any motion, the scanner's static magnetic field induces a magneto-hydrodynamic stimulation of the subject's vestibular organ (MVS). MVS thereby not only causes a horizontal vestibular nystagmus but also induces a horizontal bias in spatial attention. In this study, we aimed to determine the time course of MVS-induced biases in both VOR and spatial attention inside a 3 T MRI scanner as well as their respective aftereffects after participants left the scanner. Eye movements and overt spatial attention in a visual search task were assessed in healthy volunteers before, during, and after a one-hour MVS period. All participants exhibited a VOR inside the scanner, which declined over time but never vanished completely. Importantly, there was also an MVS-induced horizontal bias in spatial attention and exploration, which persisted throughout the entire hour within the scanner. Upon exiting the scanner, we observed aftereffects in the opposite direction in both the VOR and spatial attention, which were statistically no longer detectable after 7 min. Sustained MVS effects on spatial attention have important implications for the design and interpretation of fMRI studies and for the development of therapeutic interventions counteracting spatial neglect. [ABSTRACT FROM AUTHOR]
- Published
- 2024
Discovery Service for Jio Institute Digital Library