21,530 results
Search Results
352. GSA-GAN: Global Spatial Attention Generative Adversarial Networks.
- Author
-
An, Lei, Zhao, Jiajia, and Ma, Bo
- Subjects
- *GENERATIVE adversarial networks, *INFRARED imaging
- Abstract
This paper proposes a solution to translating visible images into infrared images, which is challenging in computer vision. Our solution belongs to unsupervised learning, which has recently become popular in image-to-image translation. However, existing methods do not produce satisfactory results because (1) most existing methods are mainly used in entertainment scenarios with single scenes and low complexity, whereas the problem solved by this article is more diverse and more complicated; and (2) the infrared response of an object depends not only on the object itself but also on the current environment, and existing methods cannot correlate long-range dependent objects. In this paper, we propose Global Spatial Attention (GSA), which enhances dependence between long-range objects and improves the synthesized image quality. Compared with other methods, GSA can save more space and time. Moreover, we introduce the idea of subspace learning into the neural network to make training more stable. Our method takes unpaired visible images and infrared images for training, which are easy to collect. Experimental results show that our method can generate high-quality infrared images from visible images and outperforms state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2021
353. A multilevel fusion network for 3D object detection.
- Author
-
Xia, Chunlong, Wei, Ping, Wei, Wenwen, and Zheng, Nanning
- Subjects
- *ADAPTIVE filters, *SIGNAL convolution, *ROBOTICS
- Abstract
• This paper addresses the problem of 3D object detection in RGB-D images. It is an important and challenging problem in many vision, robotics, and human–machine interaction applications.
• It proposes a multilevel fusion network (MFN) model. It presents a new loss function to describe the geometric attribute difference in 3D object bounding boxes and an adaptive depth image filtering algorithm to restore and correct noisy depth images.
• It tests the proposed model on challenging datasets and the results outperform the comparison approaches. It also carries out ablation studies which prove the effectiveness of the different modules in the proposed model.
3D object detection is an important yet challenging problem in a myriad of vision, robotics, and human–machine interaction applications. Given an RGB-D image, the task is to infer the class labels and the 3D bounding boxes of the objects in the image. While previous studies have made remarkable progress over the past decade, how to effectively exploit feature fusion with neural networks to boost 3D object detection performance remains an open problem. This paper proposes a multilevel fusion network (MFN) model to detect 3D objects in RGB-D images. The MFN model contains two streams of neural networks which respectively extract the RGB and depth features with cascaded convolutional modules. To effectively exploit the information of 3D objects, a multilevel fusion mechanism is adopted to fuse the convolutional RGB and depth features at multiple levels. To train the network, we propose a new weighted loss function by encoding the difference of geometric attributes on 3D bounding box regression. Since the original depth data is full of noisy holes, we also develop an adaptive filtering algorithm to restore and correct the depth images. We test the proposed model on challenging RGB-D datasets. The experimental results on the datasets prove the strength and advantage of the proposed model. [ABSTRACT FROM AUTHOR]
- Published
- 2021
354. Learning to detect anomaly events in crowd scenes from synthetic data.
- Author
-
Lin, Wei, Gao, Junyu, Wang, Qi, and Li, Xuelong
- Subjects
- *ANOMALY detection (Computer security), *PUBLIC safety, *CROWDS, *HUMAN behavior, *BEHAVIORAL assessment
- Abstract
Recently, due to its widespread applications in public safety, anomaly detection in crowd scenes has become a hot topic. Some deep-learning-based methods attain significant achievements in this field. Nevertheless, most of them suffer from over-fitting to some extent because anomaly data are scarce, as such events are usually abrupt and low-frequency in the real world. To remedy this problem, this paper first develops a synthetic anomaly event generating system, which can simulate typical specific abnormal events. Using this system, a large, diverse synthetic anomaly event dataset is built, which contains 2,149 video sequences. With this dataset, a 3D CNN is designed to detect the abnormal types at the video level. However, we find that there are obvious domain differences (also called "domain gap" or "domain shift") between synthetic videos and real-world data, which result in performance degradation when applying the model to the real world. Thus, this paper further proposes a cyclic 3D GAN for domain adaptation to reduce the domain gap, which translates the synthetic data into photorealistic video sequences. The detection model is then trained on the translated data and can perform well on real data. Experimental results illustrate that the proposed method outperforms the baselines for domain adaptation anomaly detection. [ABSTRACT FROM AUTHOR]
- Published
- 2021
355. Multi-label thresholding for cost-sensitive classification.
- Author
-
Alotaibi, Reem and Flach, Peter
- Subjects
- *LABELS, *SCATTER diagrams, *ERROR rates, *CLASSIFICATION algorithms, *CLASSIFICATION, *THRESHOLDING algorithms
- Abstract
Multi-label classification associates each instance with a set of labels which reflects the nature of a wide range of real-world applications. However, existing approaches assume that all labels have the same misclassification cost, whereas in real-world problems different types of misclassification errors have different costs, which are generally unknown in the training context or might change from one context to another. Thus, there is a demand for cost-sensitive classification methods that minimise the average misclassification cost rather than error rates or counts. In this paper, we adopt a simple yet general method, called thresholding, which applies to most classification algorithms to adapt them to cost-sensitive multi-label classification. This paper investigates current threshold choice approaches for multi-label classification. It explores the choice of single and multiple thresholds and extends some of the current techniques to support multi-label problems. Moreover, it proposes cost curves and scatter diagrams for performance evaluation in the multi-label setting. Experimental evaluation on 13 multi-label datasets demonstrates that there is no significant loss by adjusting a global threshold rather than a per-label threshold considering different misclassification costs across labels. Although tuning multiple thresholds is the obvious solution, the global threshold can also be valid. [ABSTRACT FROM AUTHOR]
- Published
- 2021
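The thresholding idea described in the abstract of entry 355 can be illustrated with a small sketch. It is not the authors' implementation: the function names, the candidate threshold grid, and the per-label cost pairs `(c_fp, c_fn)` are assumptions made here for illustration. The sketch compares a threshold tuned per label with a single global threshold shared by all labels, both chosen to minimise average misclassification cost.

```python
from typing import List, Sequence, Tuple


def expected_cost(scores: Sequence[float], labels: Sequence[int],
                  threshold: float, c_fp: float, c_fn: float) -> float:
    """Average misclassification cost for one label at a given threshold."""
    cost = 0.0
    for s, y in zip(scores, labels):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 0:
            cost += c_fp  # false positive
        elif pred == 0 and y == 1:
            cost += c_fn  # false negative
    return cost / len(scores)


def best_threshold(scores: Sequence[float], labels: Sequence[int],
                   c_fp: float, c_fn: float,
                   candidates: Sequence[float]) -> float:
    """Per-label threshold: the candidate with the lowest average cost."""
    return min(candidates,
               key=lambda t: expected_cost(scores, labels, t, c_fp, c_fn))


def best_global_threshold(score_cols: List[Sequence[float]],
                          label_cols: List[Sequence[int]],
                          costs: List[Tuple[float, float]],
                          candidates: Sequence[float]) -> float:
    """Single threshold shared by all labels, minimising the summed cost."""
    def total(t: float) -> float:
        return sum(expected_cost(s, y, t, c_fp, c_fn)
                   for s, y, (c_fp, c_fn) in zip(score_cols, label_cols, costs))
    return min(candidates, key=total)


# Toy data (hypothetical): two labels with different cost ratios.
scores_a, labels_a = [0.1, 0.4, 0.6, 0.9], [0, 0, 1, 1]
scores_b, labels_b = [0.2, 0.3, 0.7, 0.8], [0, 1, 1, 1]
grid = [0.25, 0.5, 0.75]

per_label = [best_threshold(scores_a, labels_a, 1.0, 1.0, grid),
             best_threshold(scores_b, labels_b, 1.0, 5.0, grid)]
shared = best_global_threshold([scores_a, scores_b], [labels_a, labels_b],
                               [(1.0, 1.0), (1.0, 5.0)], grid)
print(per_label, shared)
```

In practice the thresholds would be tuned on held-out data, and the costs may be unknown at training time, which is the situation the abstract highlights.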
356. Adaptive fuzzy output feedback FTC for nonstrict-feedback systems with sensor faults and dead zone input.
- Author
-
Zhang, Jun and Tong, Shaocheng
- Subjects
- *FAULT zones, *PSYCHOLOGICAL feedback, *ADAPTIVE fuzzy control, *UNCERTAIN systems, *DETECTORS, *FUZZY logic, *ARTIFICIAL pancreases
- Abstract
This paper solves the fault-tolerant control (FTC) problem of uncertain nonlinear systems in nonstrict-feedback form. The controlled plants considered in this paper are more complicated than existing ones, as they involve unknown nonlinearities, unmeasured states, sensor faults, and unknown dead zone input nonlinearity. Fuzzy logic systems and a fault-estimation-based state observer are used for identifying the unknown nonlinearities and obtaining the unmeasured states, respectively. By designing the sensor fault compensation and the dead zone inverse compensation methods, an observer-based fault-tolerant compensation control scheme is developed under the backstepping recursive design framework. It is proved that the closed-loop signals are bounded and that the tracking performance is satisfactory even when the controlled systems are subject to sensor faults and unknown dead zone input nonlinearity. An electromechanical system is used to test the effectiveness of the developed control strategy. [ABSTRACT FROM AUTHOR]
- Published
- 2021
357. Android malware detection through machine learning on kernel task structures.
- Author
-
Wang, Xinning and Li, Chong
- Subjects
- *MACHINE learning, *MALWARE, *MALWARE prevention, *DATA structures, *SMARTPHONES, *DATA warehousing
- Abstract
With the advent of smart phones, the popularity of free Android applications has risen rapidly. This has led to malicious Android apps being involuntarily installed, which violate user privacy or conduct attacks. Malware detection on Android platforms is therefore a growing concern because of the undesirable similarity between malicious behavior and benign behavior, which can lead to slow detection and allow compromises to persist for comparatively long periods of time in infected phones. The first contribution of this paper is a multi-dimensional, kernel feature-based framework and feature weight-based detection (WBD) designed to categorize and comprehend the characteristics of Android malware and benign apps. Furthermore, our software agent is orchestrated and implemented for data collection and storage to scan thousands of benign and malicious apps automatically. We examine 112 kernel attributes of the executing task data structure in the Android system and evaluate the detection accuracy with a number of datasets of various dimensions. We find that memory- and signal-related features contribute to more precise classification than schedule-related features and other descriptors of task states listed in our paper. In particular, memory-related features provide fine-grained classification policies that preserve higher classification precision than the signal-related and other features. Furthermore, we study and evaluate 80 newly infected attributes of the Android kernel task structure, prioritizing the 70 features of most significance based on dimensional reduction to optimize the efficiency of high-dimensional classification.
Our second contribution is that our experiments demonstrate that, as compared to existing techniques with a short list of task structure features (16 or 32 features), our method can achieve 94%–98% accuracy and a 2%–7% false positive rate, while detecting malware apps with reduced-dimensional features that adequately abbreviate online malware detections and advance offline malware inspections. [ABSTRACT FROM AUTHOR]
- Published
- 2021
358. Soft sensor based on eXtreme gradient boosting and bidirectional converted gates long short-term memory self-attention network.
- Author
-
Zhu, Xiuli, Hao, Kuangrong, Xie, Ruimin, and Huang, Biao
- Subjects
- *INTRINSIC viscosity, *DETECTORS, *DECISION trees, *COMPUTATIONAL complexity, *INFORMATION processing
- Abstract
• The proposed Xgboost-BiCG-LSTM-SEA algorithm has higher prediction accuracy and a more stable prediction compared with others.
• The Xgboost is first utilized to act as an encoder to weigh the selected input variables based on feature relevance.
• A bidirectional converted gates LSTM (BiCG-LSTMs) algorithm is presented in this paper to reduce the computational complexity of LSTMs and then applied for soft sensor.
In this paper, a new soft sensor that combines eXtreme Gradient Boosting (Xgboost) decision trees and a bidirectional converted gates long short-term memory (BiCG-LSTMs) self-attention (SEA) mechanism network is proposed. Xgboost is first utilized to select relevant input variables according to their importance. It then acts as an encoder to weigh the selected input variables based on their importance scores. The encoded input variables are normalized and then sent to the BiCG-LSTMs to extract dynamic information hidden in the process data. The BiCG-LSTMs is designed to avoid the multiple gate functions that are characteristic of traditional LSTM units in bidirectional LSTM and that consume additional calculation time. Next, a regularization method that smooths dynamic features based on self-attention weights is utilized to denoise and alleviate the overfitting of the regression once new features are added. In addition, self-attention takes into account the internal dependence of input variables regardless of the distance between them. Finally, the effectiveness of the proposed Xgboost-BiCG-LSTM-SEA soft sensor framework is demonstrated by an application to the prediction of melt intrinsic viscosity in the polyester polymerization process. [ABSTRACT FROM AUTHOR]
- Published
- 2021
359. Predicting short-term next-active-object through visual attention and hand position.
- Author
-
Jiang, Jingjing, Nan, Zhixiong, Chen, Hui, Chen, Shitao, and Zheng, Nanning
- Subjects
- *ARTIFICIAL neural networks, *HUMAN-robot interaction, *HAND, *DISTRIBUTION (Probability theory), *MACHINE learning
- Abstract
Human intention prediction is of great significance in many applications, such as human-robot interaction and intelligent rehabilitation robots. This paper studies the problem of short-term next-active-object prediction in egocentric images. The short-term next-active-object refers to the object that a human is going to interact with in the short-term future, which is an embodiment of human intention. Most current methods use object-centered cues, such as the deviation of object appearance change and the unique shape of the egocentric object trajectory, to predict the next-active-object. In this paper, inspired by the fact that human intention is also revealed by human-centered cues, we propose a deep neural network model that integrates the cues from visual attention and hand positions to predict the next-active-object. First, the probability maps of visual attention and hand positions are constructed, and then the probability distribution of the next-active-object is generated. We experimentally compare our method with several baseline methods using two datasets and confirm its effectiveness. In addition, ablation experiments are conducted, and crucial points concerning the next-active-object are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2021
360. Containment control of general linear multi-agent systems by event-triggered control mechanisms.
- Author
-
Zhang, Juan, Zhang, Huaguang, Cai, Yuliang, and Li, Weihua
- Subjects
- *MULTIAGENT systems, *LINEAR systems, *STATE feedback (Feedback control systems)
- Abstract
This paper discusses the containment control (CC) problem of general linear multi-agent systems (MASs) by means of two kinds of distributed event-triggered mechanisms. Two types of event-triggered control protocols, namely, the state feedback control law and the dynamic output feedback control protocol, are designed for each follower. Under the proposed control protocols and triggering mechanisms, the containment control problem can be solved by proving that the containment error converges to zero based on the assumptions and algorithms. Finally, we verify the rationality of the theoretical results. Through two numerical simulations, we can see that the trajectory of each follower converges to the convex hull formed by all leaders. In addition, in order to verify the advantages of the obtained results, we give a simulation example and compare the methods designed in this paper with the methods in other literature. [ABSTRACT FROM AUTHOR]
- Published
- 2021
361. Meta weight learning via model-agnostic meta-learning.
- Author
-
Xu, Zhixiong, Chen, Xiliang, Tang, Wei, Lai, Jun, and Cao, Lei
- Subjects
- *MACHINE learning, *REINFORCEMENT learning, *WEIGHT measurement, *WEIGHTS & measures
- Abstract
While meta learning approaches have achieved remarkable success, obtaining a stable and unbiased meta-learner remains a significant challenge, since the initial model of a meta-learner could be too biased towards existing tasks to adapt to new tasks. In order to avoid a biased meta-learner and improve its generalizability, this paper proposes a generic meta learning method that aims to learn an unbiased meta-learner towards a variety of tasks before its initial model is adapted to unseen tasks. Specifically, this paper presents a meta weight learning method for minimizing the inequality of performance across different training tasks. An end-to-end training approach is introduced for the proposed algorithm that allows for effectively learning weights and initializing the network model. Alternatively, a variety of weight measurement methods are also designed to test the effectiveness of different weight learning methods in improving the model-agnostic meta-learning algorithm. The simulation results show that the proposed meta weight learning method not only outperforms state-of-the-art meta learning algorithms, but also is superior to other manually designed weight measurement methods on discrete and continuous control problems. [ABSTRACT FROM AUTHOR]
- Published
- 2021
362. Passivity and robust passivity of impulsive inertial neural networks with proportional delays under the non-reduced order approach.
- Author
-
Zhang, Jun and Zhu, Song
- Abstract
This paper primarily addresses the problems of passivity and robust passivity of impulsive inertial neural networks (IINNs) with proportional delays. By constructing the Lyapunov functional directly on the original system using the non-reduced order approach, some passivity and robust passivity criteria for IINNs are addressed. In comparison with the order reduction method utilized in the existing articles, the non-reduced order method is more in line with real requirements and can better analyze the dynamic behavior of inertial neural networks (INNs). Meanwhile, the results in this paper are all in algebraic form, which are easy to verify. To demonstrate the effectiveness of the results derived, numerical examples are presented at the end. [ABSTRACT FROM AUTHOR]
- Published
- 2024
363. Domain-control prompt-driven zero-shot relational triplet extraction.
- Author
-
Xu, Liang, Gao, Changxia, and Tian, Xuetao
- Subjects
- *LANGUAGE models
- Abstract
Zero-shot relational triplet extraction is a vital solution to the problem of extracting facts from unstructured text without labeled training data. In this task, the data is divided into seen and unseen relations for training and prediction, respectively. A strategy that first trains a generative model on seen data and then generates training samples for unseen data has been shown to be effective in solving this task. However, this strategy is severely limited by error propagation caused by noisy generated data. To address this issue, prompts may provide a feasible route since they have been widely utilized in cross-domain tasks. In this paper, three preliminary experiments reveal the effectiveness of prompts for the task of triplet extraction and its internal mechanism. Specifically, the method using prompts can control the domain. (The term is quite loosely used in NLP. In this paper, domain is used to represent which relations may be associated with the given sentence. For more detailed information, please refer to Appendix A.) Further, we propose a simple but effective model for zero-shot relational triplet extraction, which leverages zero-shot text classification to first determine the prompts of unseen relations, aiming to optimize both its domain and length, and then extracts triplets via a prompt-driven strategy. Extensive experiments are conducted on two public datasets, demonstrating that the proposed model achieves better performance than the baselines.
• Prompts are able to control the output domain, guiding the extraction of unseen triplets.
• Potential relations can be determined according to semantic matching between sentences and relation description text.
• A two-stage framework is proposed to effectively improve the performance of the zero-shot relational triplet extraction task. [ABSTRACT FROM AUTHOR]
- Published
- 2024
364. Learning group-wise spatial attention and label dependencies for multi-task thoracic disease classification.
- Author
-
Xu, Yujia, Lam, Hak-Keung, Bao, Xinqi, and Wang, Yuhan
- Subjects
- *NOSOLOGY, *X-ray imaging, *PATHOLOGICAL physiology, *DEEP learning, *SOURCE code
- Abstract
This paper considers multi-label thoracic abnormality classification with chest X-ray images. In clinical settings, chest X-ray imaging is a general diagnostic tool applied to visualize numerous thoracic pathological changes. While deep learning techniques have been extensively tested in this field, certain challenges persist. The data in existing thoracic abnormality datasets is insufficient, and some diseases are extremely imbalanced. Meanwhile, the dependencies between different labels are often ignored. To tackle these issues, this paper introduces two modules, the group-wise spatial attention (GWSA) module and the label co-occurrence dependency (LCD) module, integrated with a DenseNet121 backbone. Specifically, GWSA enhances the spatial features within distinct groups while keeping the between-group feature discrimination. LCD models the correlations between different thoracic abnormalities to refine the predicted probabilities. In conjunction with the DenseNet121 backbone, these two modules reach an average AUC score of 0.818 on the Chest X-ray14 dataset, achieving state-of-the-art performance. Source code is available at https://github.com/YujiaKCL/Chest-Xray14-GWSA-LCD. [ABSTRACT FROM AUTHOR]
- Published
- 2024
365. A survey on vulnerability of federated learning: A learning algorithm perspective.
- Author
-
Xie, Xianghua, Hu, Chen, Ren, Hanchi, and Deng, Jingjing
- Subjects
- *FEDERATED learning, *MACHINE learning, *DEEP learning, *LITERATURE reviews, *BIBLIOGRAPHY
- Abstract
Federated Learning (FL) has emerged as a powerful paradigm for training Machine Learning (ML), particularly Deep Learning (DL), models on multiple devices or servers while keeping data localized at owners' sites. Without centralizing data, FL holds promise for scenarios where data integrity, privacy, and security are critical. However, this decentralized training process also opens up new avenues for opponents to launch unique attacks, so there is an urgent need to understand the vulnerabilities and corresponding defense mechanisms from a learning algorithm perspective. This review paper takes a comprehensive look at malicious attacks against FL, categorizing them from new perspectives on attack origins and targets, and providing insights into their methodology and impact. In this survey, we focus on threat models targeting the learning process of FL systems. Based on the source and target of the attack, we categorize existing threat models into four types: Data to Model (D2M), Model to Data (M2D), Model to Model (M2M), and composite attacks. For each attack type, we discuss the defense strategies proposed, highlighting their effectiveness, assumptions, and potential areas for improvement. Defense strategies have evolved from using a singular metric to exclude malicious clients to employing a multifaceted approach that examines client models at various phases. Our research indicates that the to-learn data, the learning gradients, and the learned model at different stages can all be manipulated to initiate malicious attacks that range from undermining model performance and reconstructing private local data to inserting backdoors. We have also seen that these threats are becoming more insidious. While earlier studies typically amplified malicious gradients, recent endeavors subtly alter the least significant weights in local models to bypass defense measures.
This literature review provides a holistic understanding of the current FL threat landscape and highlights the importance of developing robust, efficient, and privacy-preserving defenses to ensure the safe and trusted adoption of FL in real-world applications. The categorized bibliography can be found at: https://github.com/Rand2AI/Awesome-Vulnerability-of-Federated-Learning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
366. Differentially private distributed online optimization via push-sum one-point bandit dual averaging.
- Author
-
Zhao, Zhongyuan, Yang, Ju, Gao, Wang, Wang, Yan, and Wei, Mengli
- Subjects
- *FEDERATED learning, *DATA privacy, *COST functions, *DIRECTED graphs, *MULTIAGENT systems, *ROBBERS, *INTERNET privacy, *DIFFERENTIAL evolution
- Abstract
This paper focuses on the distributed online optimization problem in multi-agent systems considering privacy preservation. Each agent exchanges local information with neighboring agents on strongly connected time-varying directed graphs. Since the process of information transmission is prone to information leakage, a distributed push-sum dual averaging algorithm based on the differential privacy mechanism is proposed to protect the privacy of the data. In addition, to handle situations where the gradient information of the node cost function is unknown, a one-point gradient estimation is designed to estimate the true gradient information and guide the update of the decision variables. With an appropriate choice of the stepsizes and the exploration parameters, the algorithm can effectively protect the privacy of agents while achieving sublinear regret with the convergence rate $O(T^{3/4})$. Furthermore, this paper also explores the effect of one-point estimation parameters on the regret in the online setting and investigates the relation between the convergence effect of individual regret and differential privacy levels. Finally, several federated learning experiments were conducted to verify the efficacy of the algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
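Two ingredients named in the abstract of entry 366, one-point gradient estimation and a differential privacy mechanism, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it uses the standard one-point estimator (d/δ)·f(x + δu)·u with u drawn uniformly from the unit sphere, and Laplace noise as the privacy mechanism. The function names and the parameters `delta` and `scale` are hypothetical, and the push-sum and dual averaging steps are omitted.

```python
import math
import random
from typing import Callable, List


def random_unit_vector(dim: int, rng: random.Random) -> List[float]:
    """Draw a direction uniformly from the unit sphere via normalised Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def one_point_gradient(f: Callable[[List[float]], float], x: List[float],
                       delta: float, rng: random.Random) -> List[float]:
    """Estimate a gradient from a single function evaluation.

    Only the value f(x + delta * u) is used, which suits the bandit
    setting where the cost function's gradient is not available.
    """
    d = len(x)
    u = random_unit_vector(d, rng)
    value = f([xi + delta * ui for xi, ui in zip(x, u)])
    return [(d / delta) * value * ui for ui in u]


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(message: List[float], scale: float,
              rng: random.Random) -> List[float]:
    """Perturb a message before sharing it with neighbouring agents."""
    return [m + laplace_noise(scale, rng) for m in message]


rng = random.Random(0)
grad_estimate = one_point_gradient(lambda z: sum(a * a for a in z),
                                   [1.0, 2.0], delta=0.1, rng=rng)
shared = privatize(grad_estimate, scale=0.5, rng=rng)
print(grad_estimate, shared)
```

A smaller `delta` reduces the bias of the estimate but increases its variance, which is one reason the choice of exploration parameters affects the regret bound mentioned in the abstract.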
367. A quantum-enhanced solution method for multi classification problems.
- Author
-
Zhang, Yijun, Mu, Xiaodong, Zhang, Peng, and Zhao, Dao
- Subjects
- *TIME complexity, *QUANTUM computing, *QUANTUM states, *POLYNOMIAL time algorithms, *KERNEL functions, *CLASSIFICATION algorithms
- Abstract
With the increasing data size of multi classification problems, the running efficiency of classical algorithms is seriously affected. In this paper, in order to improve the implementation efficiency of the algorithm, we propose a quantum-enhanced solution method for multi classification problems. The method mainly introduces quantum-enhanced technology into the classical algorithm. For the two steps of computing the Euclidean distance and the kernel function in the classical algorithm, the paper relates the classical inner product principle to the amplitude evolution of quantum states. On the basis of quantizing the sample data, a general quantum circuit that can calculate the inner product is designed and constructed. The circuit can make full use of the advantages of quantum parallel computing to achieve exponential acceleration of computing efficiency. For solving linear equations in the classical algorithm, a quantum circuit based on quantum singular value estimation is designed and constructed. The circuit makes use of the acceleration advantage of quantum computing in matrix computing to achieve polynomial acceleration of matrix computing. The experimental results show that the method can not only find the optimal solution for multi classification problems, but also greatly improve the operation efficiency of the algorithm. Compared with the classical methods, the method has at least polynomial improvement in time complexity and space complexity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
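The abstract of entry 367 ties the Euclidean distance step to an inner product computation. The classical identity behind that link is ‖x − y‖² = ⟨x, x⟩ + ⟨y, y⟩ − 2⟨x, y⟩, which reduces to 2 − 2⟨x, y⟩ for unit vectors such as amplitude-encoded samples. The sketch below only checks this identity classically; it does not model the paper's quantum circuit, and the function names are illustrative assumptions.

```python
import math
from typing import List


def inner(x: List[float], y: List[float]) -> float:
    """Classical inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def squared_distance_via_inner(x: List[float], y: List[float]) -> float:
    """Squared Euclidean distance expressed purely through inner products."""
    return inner(x, x) + inner(y, y) - 2.0 * inner(x, y)


def normalize(x: List[float]) -> List[float]:
    """Scale a non-zero vector to unit length, as amplitude encoding requires."""
    norm = math.sqrt(inner(x, x))
    return [a / norm for a in x]


x, y = [1.0, 2.0, 2.0], [2.0, 0.0, 1.0]
direct = sum((a - b) ** 2 for a, b in zip(x, y))
print(direct, squared_distance_via_inner(x, y))

# For unit vectors the distance depends on the inner product alone.
xu, yu = normalize(x), normalize(y)
print(squared_distance_via_inner(xu, yu), 2.0 - 2.0 * inner(xu, yu))
```

Because the distance is a function of inner products, any routine that estimates ⟨x, y⟩ can be reused for both the distance and the kernel evaluations.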
368. An incremental feature selection approach for dynamic feature variation.
- Author
-
Wang, Feng, Wang, Xinhao, Wei, Wei, and Liang, Jiye
- Subjects
- *FEATURE selection, *METEOROLOGICAL research, *BIG data, *ALGORITHMS
- Abstract
In numerous domains, such as medical research and meteorological studies, there is ample evidence that a significant portion of real-world data exhibits temporal variations. Particularly in the era of big data, not only the size of data but also its dimensions change dynamically at an unprecedented pace. Consequently, employing traditional methods to handle such dynamic data becomes highly impractical. To address this limitation, this paper proposes an incremental feature selection algorithm tailored for dynamic feature variations. For scenarios involving an increase in features, the novel algorithm efficiently identifies a target subset with effective features. In order to showcase the efficacy of the proposed algorithm, this paper conducts experiments using four commonly used classifiers and five UCI data sets. The experimental results further validate both the feasibility and efficiency of the new approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
369. Distributed Nash equilibrium searching for multi-agent games under false data injection attacks.
- Author
-
Lv, Yixuan, Liu, Yan-Jun, Liu, Lei, Yu, Dengxiu, and Chen, Yang
- Subjects
- *NASH equilibrium, *ADAPTIVE fuzzy control, *MULTIAGENT systems
- Abstract
This paper studies the Nash equilibrium (NE) search problem for non-cooperative games with state constraints in multi-agent systems under false data injection (FDI) attacks. An FDI attack can directly destroy the connectivity between agents, resulting in performance degradation or even failure for most NE search algorithms. We propose a new distributed NE search method to improve the performance of NE algorithms under FDI attacks. The NE algorithm is designed to find the NE solution under network attack and make the system recover after the attack. However, most existing results only study NE algorithms for systems that are not under attack, and they rarely take into account whether the system state exceeds the safety limit, which can cause the system to crash. This paper solves the state constraint problem of the multi-agent game under the FDI attack. Finally, the reliability of the algorithm is verified by a numerical result. [ABSTRACT FROM AUTHOR]
- Published
- 2024
370. Input-to-state stability of stochastic complex networks based on aperiodically intermittent sampled control.
- Author
-
Chen, Tianrui and Chen, Jiacai
- Subjects
- *STABILITY criterion, *GRAPH theory
- Abstract
This paper focuses on the problem of input-to-state stability (ISS) of stochastic complex networks (SCNs). In this paper, an aperiodically intermittent control strategy based on sampled control is designed. By means of graph theory and Lyapunov method, two stability criteria on ISS are derived in this paper. After giving the estimate between $\mathbb{E}\sum_{i=1}^{m} |x_i(t) - x_i(\delta(t))|^2$ and $\mathbb{E}\sum_{i=1}^{m} |x_i(t)|^2$, a stability criterion is proposed on ISS of the SCN under aperiodically intermittent sampled control (AISC). When AISC degenerates into sampled control, another stability criterion on ISS of SCN is acquired. Finally, a numerical example is utilized to illustrate the effectiveness and feasibility of the proposed results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
371. A survey of the recent trends in deep learning for literature based discovery in the biomedical domain.
- Author
-
Cesario, Eugenio, Comito, Carmela, and Zumpano, Ester
- Subjects
- *
DEEP learning , *NATURAL language processing , *LANGUAGE models , *SCIENTIFIC literature , *SCIENTIFIC knowledge , *MACHINE learning - Abstract
Every day, enormous amounts of biomedical text on a wide range of topics are produced. Revealing the strong semantic connections hidden in these unstructured data is essential for many applications, such as knowledge base development for the biomedical domain, drug repurposing, and the discovery of drug–disease associations. Literature based discovery (LBD) is a well-known paradigm for finding new hidden knowledge in scientific literature by connecting pieces of semantically related information belonging to independent documents. This challenging research area has been extensively investigated by the research community, and different proposals adopting natural language processing, text mining, machine learning and, recently, deep learning have been developed. This paper addresses a very focused task: it surveys a collection of research papers published in recent years that have adopted deep learning for literature based discovery as an effective technique to discover new relationships between existing knowledge in the biomedical domain. The study provides an analysis of the key characteristics of each work surveyed, including the literature based discovery application area, the deep learning method used, the type of data analyzed, and the results obtained. Recognizing the significance of Pre-trained Language Models (PLMs), another primary aim of this paper is to offer an extensive overview of the latest developments in pre-trained language models within the field of biomedicine. This overview focuses primarily on how they are applied to downstream tasks associated with literature based discovery in the biomedical domain. Additionally, the survey highlights the key drawbacks of the current state-of-the-art proposals, as well as the challenges that require further study by the research community. [ABSTRACT FROM AUTHOR]
- Published
- 2024
372. Anchor Ball Regression Model for large-scale 3D skull landmark detection.
- Author
-
He, Tao, Xu, Guikun, Cui, Li, Tang, Wei, Long, Jie, and Guo, Jixiang
- Subjects
- *
REGRESSION analysis , *SKULL , *DEEP learning - Abstract
Recent deep learning models have exhibited impressive performance in the area of 3D skull landmark detection, but most of them aim to detect a fixed number of landmarks. This paper focuses on automatically detecting an arbitrary number of landmarks on CT volumes, which meets real clinical needs. To achieve robust performance for detecting arbitrary molar landmarks, we propose a novel 3D landmark detection model named Anchor Ball Regression Model (ABRM), which combines landmark detection and landmark classification losses for network training. For landmark detection, a novel landmark regression loss is proposed that predicts offsets to anchor balls instead of directly predicting landmarks. For landmark classification, an online hard negative mining loss is used to reduce the learning errors of absent landmarks, and a small regularization constraint loss is applied to voxels outside the anchor balls. The network backbone of ABRM is obtained by manually pruning popular 3D-CNNs. We also present a publicly available large-scale benchmark dataset in this paper, which, to the best of our knowledge, is the largest dataset for 3D skull landmark detection. The dataset comprises 658 CT volumes, with 14 landmarks labeled by two junior doctors and one senior doctor. The ABRM shows robust performance and outperforms other models on this dataset. The codes and dataset are accessible at https://github.com/ithet1007/mmld_code. [ABSTRACT FROM AUTHOR]
- Published
- 2024
373. Scene Graph Generation: A comprehensive survey.
- Author
-
Li, Hongsheng, Zhu, Guangming, Zhang, Liang, Jiang, Youliang, Dang, Yixuan, Hou, Haoran, Shen, Peiyi, Zhao, Xia, Shah, Syed Afaq Ali, and Bennamoun, Mohammed
- Subjects
- *
OBJECT recognition (Computer vision) , *DEEP learning - Abstract
Deep learning techniques have led to remarkable breakthroughs in the field of object detection and have spawned many scene-understanding tasks in recent years. The scene graph has been a focus of research because of its powerful semantic representation and its applications to scene understanding. Scene Graph Generation (SGG) refers to the task of automatically mapping an image or a video into a semantic structural scene graph, which requires the correct labeling of detected objects and their relationships. In this paper, a comprehensive survey of recent achievements is provided. This survey attempts to connect and systematize the existing visual relationship detection methods, and to summarize and interpret the mechanisms and strategies of SGG in a comprehensive way. In-depth discussions of current open problems and future research directions are given at the end. This survey will help readers develop a better understanding of current research. • A comprehensive review of 138 papers on scene graph generation is presented. • We analyze 2D scene graph generation, focusing on feature representation. • A review of typical datasets for 2D, spatio-temporal and 3D scene graph generation is presented. [ABSTRACT FROM AUTHOR]
- Published
- 2024
374. A systematic review of image-level camouflaged object detection with deep learning.
- Author
-
Liang, Yanhua, Qin, Guihe, Sun, Minghui, Wang, Xinchao, Yan, Jie, and Zhang, Zhonghan
- Subjects
- *
DEEP learning , *OBJECT recognition (Computer vision) , *RESEARCH personnel - Abstract
Camouflaged object detection (COD) aims to search for and identify disguised objects that are hidden in their surrounding environment and thereby deceive the human visual system. As an interesting and challenging task, COD has received increasing attention from the community in the past few years, especially the image-level camouflaged object segmentation task. So far, some advanced image-level COD models have been proposed, mainly dominated by deep learning-based solutions. To provide an in-depth understanding of existing image-level COD methods in the deep learning era, in this paper we give a comprehensive review of model structure and paradigm classification, public benchmark datasets, evaluation metrics, model performance benchmarks, and potential future development directions. Specifically, we first review 96 existing deep COD algorithms. Subsequently, we summarize and analyze the five widely used COD datasets and evaluation metrics. Furthermore, we benchmark a set of representative models and provide a detailed analysis of the comparison results from both quantitative and qualitative perspectives. Moreover, we discuss the challenges of COD and the corresponding solutions. Finally, based on this understanding of the field, future development trends and potential research directions are outlined. In conclusion, the purpose of this paper is to provide researchers with a review of the latest COD methods, deepen their understanding of COD research, and offer some inspiration. [ABSTRACT FROM AUTHOR]
- Published
- 2024
375. A review of coverless steganography.
- Author
-
Meng, Laijin, Jiang, Xinghao, and Sun, Tanfeng
- Subjects
- *
CRYPTOGRAPHY , *STATISTICS , *ALGORITHMS - Abstract
As public security awareness grows, the secure transmission of secret information has gradually become a common demand. Steganography is a technique for hiding secret information within a carrier, with the aim of transmitting it without arousing suspicion. Most traditional steganographic algorithms hide secret information by modifying the statistical characteristics of the carrier, which leaves traces in it. Although these modifications are too small to be distinguished by the human eye, they can be detected by steganalysis algorithms. In contrast, coverless steganography, also called steganography without embedding, hides information by constructing a relationship between the secret information and the carriers. Because the carriers are not modified, existing steganalysis algorithms are rendered ineffective. This paper reviews more than 90 papers on coverless steganographic algorithms, covering the major developments in coverless image and video steganographic algorithms. The main contributions of the existing methods are summarized. In addition, the current general issues of capacity, robustness, and security are discussed for both image and video algorithms. In particular, the security of coverless steganography is discussed for the first time in this review, from theoretical analysis to practical investigation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
376. Attention Round for post-training quantization.
- Author
-
Diao, Huabin, Li, Gongyan, Xu, Shaoyun, Kong, Chao, and Wang, Wei
- Subjects
- *
CONVOLUTIONAL neural networks , *ARTIFICIAL neural networks , *COMBINATORIAL optimization , *GAUSSIAN function , *KHAT - Abstract
Quantization methods for convolutional neural network models can be broadly categorized into post-training quantization (PTQ) and quantization aware training (QAT). While PTQ offers the advantage of requiring only a small portion of the data for quantization, the resulting quantized model may not be as effective as one obtained with QAT. To address this limitation, this paper proposes a novel quantization function named Attention Round. Unlike traditional quantization functions that map a 32-bit floating-point value w to nearby quantization levels, Attention Round allows w to be mapped to all possible quantization levels in the entire quantization space, expanding the quantization optimization space. The probability of mapping w to a given quantization level is inversely correlated with the distance between w and that level, regulated by a Gaussian decay function. Furthermore, to tackle the challenge of mixed precision quantization, this paper introduces a lossy coding length measure to assign quantization precision to different layers of the model, eliminating the need to solve a combinatorial optimization problem. Experimental evaluations on various models demonstrate the effectiveness of the proposed method. Notably, for ResNet18 and MobileNetV2, the PTQ approach achieves quantization performance comparable to QAT while using only 1024 training samples and 10 min for the quantization process. • Attention Round quantization function expands the quantization optimization space. • Mixed precision allocation method improves mixed precision quantization efficiency. • Enriched lightweight CNNs contribute to applications in resource-limited scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
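The Gaussian-decay mapping described in the Attention Round abstract above (entry 376) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the normalisation into probabilities, and the parameter `sigma` are assumptions made here for illustration only. The sketch shows that every quantization level receives a non-zero probability, with levels closer to `w` being more likely.

```python
import math
import random


def attention_round_probs(w, levels, sigma):
    """Return one probability per quantization level for weight w.

    Assumption: probability is proportional to exp(-(w - q)^2 / (2 sigma^2)),
    i.e. a Gaussian decay in the distance between w and level q.
    """
    exponents = [((w - q) ** 2) / (2.0 * sigma ** 2) for q in levels]
    smallest = min(exponents)  # subtract the minimum for numerical stability
    scores = [math.exp(-(e - smallest)) for e in exponents]
    total = sum(scores)
    return [s / total for s in scores]


def attention_round_sample(w, levels, sigma, rng):
    """Draw one quantization level for w according to the probabilities."""
    probs = attention_round_probs(w, levels, sigma)
    return rng.choices(levels, weights=probs, k=1)[0]


# Hypothetical 5-level quantization grid and a single weight value.
levels = [-1.0, -0.5, 0.0, 0.5, 1.0]
probs = attention_round_probs(0.3, levels, sigma=0.25)
sampled = attention_round_sample(0.3, levels, 0.25, random.Random(0))
```

A smaller `sigma` concentrates the probability on the nearest level, which approaches ordinary round-to-nearest; a larger `sigma` spreads it across the whole quantization space.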
377. A review of IoT applications in healthcare.
- Author
-
Li, Chunyan, Wang, Jiaji, Wang, Shuihua, and Zhang, Yudong
- Subjects
- *
INTERNET of things , *HEALTH care industry , *DATA security , *PATIENT monitoring , *MEDICAL care - Abstract
Integrating Internet of Things (IoT) technologies in the healthcare industry represents a transformative shift with tangible benefits. This paper provides a detailed examination of IoT adoption in healthcare, focusing on specific sensor types and communication methods. It underscores successful real-world applications, including remote patient monitoring, individualized treatment strategies, and streamlined healthcare delivery. Furthermore, it delves into the intricate challenges to realizing the full potential of IoT in healthcare. This includes addressing data security concerns, ensuring seamless interoperability, and optimizing the use of IoT-generated data. The paper seeks to inspire practitioners and researchers by highlighting the practical implications of IoT in healthcare, emphasizing the ways IoT can enhance patient care, resource allocation, and overall healthcare efficiency. • More than 103 references from top journals are analyzed. • A systematic theoretical analysis of IoT applications in healthcare is provided. • General works on IoT applications in healthcare are well summarized. • Future challenges of IoT applications in healthcare are analyzed from several standpoints. [ABSTRACT FROM AUTHOR]
- Published
- 2024
378. A unified view of multi-grade fuzzy-set models in J-CO-QL[formula omitted].
- Author
-
Fosci, Paolo and Psaila, Giuseppe
- Subjects
- *
FUZZY sets , *NONRELATIONAL databases - Abstract
The complexity of reality has driven the evolution of Fuzzy-Set Theory from the initial proposal made by Zadeh in 1965 towards more complex models. Starting from a quick survey of the evolution of Fuzzy-Set Theory, this paper highlights the aspects that are common to many Fuzzy-Set Models, in order to define a meta-model that is capable of providing a unified view of a wide variety of fuzzy-set models. In particular, this work focuses on the family of "Multi-grade Fuzzy Sets", which are fuzzy sets characterized by more than one degree. The lack of tools capable of querying the large amount of data now available in NoSQL databases has pushed us to devise the J-CO Framework: it is a platform-independent tool that is capable of managing, transforming and querying collections of JSON documents; the J-CO Framework relies on J-CO-QL+, which is a high-level, general-purpose language with soft-querying capabilities. The latest advancements of J-CO-QL+ allow for defining and exploiting user-defined Multi-grade Fuzzy-Set Models and Operators. In the paper, a case study demonstrates the effectiveness of the J-CO Framework in performing a non-trivial soft query based on a Multi-grade Fuzzy-Set Model defined by the user. [ABSTRACT FROM AUTHOR]
- Published
- 2024
379. Event-triggered critic learning impedance control of lower limb exoskeleton robots in interactive environments.
- Author
-
Sun, Yaohui, Peng, Zhinan, Hu, Jiangping, and Ghosh, Bijoy Kumar
- Subjects
- *
ROBOTIC exoskeletons , *IMPEDANCE control , *REINFORCEMENT learning , *MOBILE robots , *CRITICS - Abstract
In this paper, we present an event-triggered critic learning impedance control algorithm for a lower limb rehabilitation exoskeleton robot in an interactive environment, where the control objective is specified by a desired impedance model. In comparison to many other traditional impedance controller design algorithms, in this paper, the impedance control problem is transformed into an optimal control problem. Firstly, the interactive environment accounts for the interaction between the exoskeleton, the human, and the environment, and is modeled by a linear time-invariant exogenous system. Secondly, in contrast to time-triggered control design mechanisms, the event-triggered controller is updated only when the system states deviate from prescribed threshold values. To obtain the event-triggered optimal controller, a critic neural network is developed through the framework of reinforcement learning. A modified gradient descent method is introduced to update the weights of the critic network with an additional stable term employed to eliminate the need for an initial admissible control. Meanwhile, with the simultaneous application of historical and transient state data to the critic neural network, the persistent excitation conditions are relaxed. The Lyapunov method is used to rigorously demonstrate the stability of the overall system. Finally, the effectiveness of the proposed algorithm is demonstrated via simulation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
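The event-triggered update rule mentioned in the abstract above (entry 379) can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the paper's critic-learning controller: the scalar plant `x' = a*x + b*u`, the fixed feedback gain `k`, the threshold value, and the Euler integration are all hypothetical choices. The point is only that the control input is recomputed when the state has drifted from its last sampled value by more than a threshold, rather than at every time step.

```python
def simulate_event_triggered(a=1.0, b=1.0, k=3.0, x0=1.0,
                             threshold=0.05, dt=0.001, steps=5000):
    """Simulate x' = a*x + b*u with u = -k * (last sampled state).

    The controller is refreshed only when |x - x_held| exceeds `threshold`.
    Returns the final state and the number of controller updates.
    """
    x = x0
    x_held = x0          # state value used by the controller
    u = -k * x_held      # control input held between events
    updates = 1          # the initial computation counts as one update
    for _ in range(steps):
        if abs(x - x_held) > threshold:   # event condition
            x_held = x
            u = -k * x_held
            updates += 1
        x += dt * (a * x + b * u)         # explicit Euler step
    return x, updates


final_state, n_updates = simulate_event_triggered()
```

In this toy setting the state settles into a small neighbourhood of the origin whose size is governed by the threshold, while the controller is refreshed far fewer times than there are simulation steps.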
380. Stabilization analysis of incommensurate fractional-order memristor-based neural networks via delay-dependent distributed controller.
- Author
-
Xiao, Shasha, Wang, Zhanshan, and Wang, Qiufu
- Subjects
- *
TIME-varying networks , *INFORMATION storage & retrieval systems , *RECURRENT neural networks , *HOPFIELD networks , *COMPUTER simulation - Abstract
This paper studies the stabilization analysis problem of a class of incommensurate fractional-order memristor-based neural networks with multiple time-varying delays (IFOMNNs-MTDs). In previously published studies of IFOMNNs-MTDs, the controller is usually designed as a general delay-independent feedback controller. Because of the simple form of this feedback controller, less system information is used and fewer adjustable parameters are considered, which limits the flexibility and control effect of the controller. In particular, the use of delay information is insufficient in the existing results, which makes the controller insensitive to the influence of delay factors and degrades control performance. Therefore, the accurate analysis of the dynamic characteristics of IFOMNNs-MTDs through the design of an appropriate controller needs further study. This paper aims to propose a new delay-dependent controller to improve the stabilization analysis of IFOMNNs-MTDs. Firstly, a delay-dependent distributed controller with distributed control gain and a time-delay state summation term is proposed, which accords with the transmission law of activation function weights and helps to account for the historical information (i.e., time delay factors) of the system. The flexibility and control effect of the controller are thereby improved by adding control parameters and making better use of system information. Secondly, a new, less conservative stabilization criterion for IFOMNNs-MTDs is established by using the designed controller. Moreover, the controller gain can be searched over a large range by using the established criterion. Finally, a numerical simulation is provided to verify the validity of the obtained results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
381. Unsupervised performance analysis of 3D face alignment with a statistically robust confidence test.
- Author
-
Sadeghi, Mostafa, Alameda-Pineda, Xavier, and Horaud, Radu
- Subjects
- *
MATHEMATICAL mappings , *CONFIDENCE , *INTEGRATED software , *PUBLISHED articles , *EXPECTATION-maximization algorithms - Abstract
This paper addresses the problem of analyzing the performance of 3D face alignment (3DFA), or facial landmark localization. This task is usually supervised, based on annotated datasets. Nevertheless, in the particular case of 3DFA, the annotation process is rarely error-free, which strongly biases the results. Alternatively, unsupervised performance analysis (UPA) is investigated. The core ingredient of the proposed methodology is the robust estimation of the rigid transformation between predicted landmarks and model landmarks. It is shown that the rigid mapping thus computed is affected neither by non-rigid facial deformations, due to variabilities in expression and in identity, nor by landmark localization errors, due to various perturbations. The guiding idea is to apply the estimated rotation, translation and scale to a set of predicted landmarks in order to map them onto a mathematical home for the shape embedded in these landmarks (including possible errors). UPA proceeds as follows: (i) 3D landmarks are extracted from a 2D face using the 3DFA method under investigation; (ii) these landmarks are rigidly mapped onto a canonical (frontal) pose; and (iii) a statistically robust confidence score is computed for each landmark. This makes it possible to assess whether the mapped landmarks lie inside (inliers) or outside (outliers) a confidence volume. An experimental evaluation protocol that uses publicly available datasets and several 3DFA software packages associated with published articles is described in detail. The results show that the proposed analysis is consistent with supervised metrics and that it can be used to measure the accuracy of both predicted landmarks and automatically annotated 3DFA datasets, to detect errors and to eliminate them. Source code and supplemental materials for this paper are publicly available at https://team.inria.fr/robotlearn/upa3dfa/. [ABSTRACT FROM AUTHOR]
- Published
- 2024
382. Source-Free Unsupervised Domain Adaptation: Current research and future directions.
- Author
-
Zhang, Ningyuan, Lu, Jie, Li, Keqiuyin, Fang, Zhen, and Zhang, Guangquan
- Subjects
- *
PHYSIOLOGICAL adaptation , *DEEP learning - Abstract
In the field of Transfer Learning, Source-Free Unsupervised Domain Adaptation (SFUDA) emerges as a practical and novel task that enables a pre-trained model to adapt to a new unlabeled domain without access to the original training data. The advancement of SFUDA has profoundly reshaped the algorithmic design of domain adaptation methods. Given the novelty and limited exploration of SFUDA, conducting a comprehensive survey is imperative to showcase methodological advancements, identify existing gaps, and uncover potential trends in this field. This paper provides an extensive review of SFUDA, encompassing methods and applications. First, based on the learning objectives during adaptation, different SFUDA methods fall into three categories: (i) Self-Tuning, (ii) Feature Alignment, and (iii) Sample Generation, with further sub-categorization within each category. Additionally, the strengths and limitations of each category are discussed, and various application areas where SFUDA can yield significant benefits are summarized. Finally, drawing from extensive observations and insights, potential future directions for SFUDA research are analyzed, with a focus on identifying emerging trends and key areas for further exploration. • A Comprehensive Review of SFUDA: The paper offers a review of SFUDA, to fill the gaps left by previous studies. • Critical Discussions on SFUDA: Based on the review, critical discussions are highlighted to deepen the understandings. • Future Directions for SFUDA: The paper outlines research directions for SFUDA, providing guidance to future advancements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
383. Intrusion detection for Industrial Internet of Things based on deep learning.
- Author
-
Lu, Yaoyao, Chai, Senchun, Suo, Yuhan, Yao, Fenxi, and Zhang, Chen
- Subjects
- *
INTRUSION detection systems (Computer security) , *DEEP learning , *INTERNET of things , *FEATURE selection , *HIERARCHICAL clustering (Cluster analysis) , *GREEDY algorithms , *INTERNET security - Abstract
Intrusion detection technology can actively detect abnormal behaviors in the network and is important to the security of the Industrial Internet of Things (IIOT). However, current intrusion detection technology for IIOT has several issues, such as extreme imbalance in the number of samples of different classes in the dataset, redundant and meaningless features in the samples, and the inability of traditional intrusion detection methods to meet the detection accuracy requirements of the increasingly complex IIOT. To address the extreme class imbalance, this paper applies a hierarchical clustering algorithm to under-sampling, which reduces the number of majority samples while limiting the loss of information from them, and solves the problem of missed detection and false detection of minority samples caused by sample imbalance. To avoid feature redundancy and interference, this paper proposes an optimal feature selection algorithm based on a greedy strategy. This algorithm can obtain the optimal feature subset for each type of data in the dataset, thus eliminating redundant and interfering features. To address the insufficient detection ability of traditional detection methods, this paper proposes a deep neural network intrusion detection model based on the parallel connection of global and local subnetworks. This model obtains the overall benchmark detection of the dataset through the deep neural network, and then strengthens the detection effect of each subclass through the parallel connection of subnetworks, greatly improving the performance of the intrusion detection algorithm. The experimental results show that the method described in this paper can improve intrusion detection for IIOT. [ABSTRACT FROM AUTHOR]
- Published
- 2024
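The greedy feature selection idea in the abstract above (entry 383) can be sketched generically. This is a plain forward-selection loop, not the authors' algorithm: the scoring function, the feature names (`duration`, `bytes`, `bytes_copy`, `noise`) and the integer usefulness values are hypothetical and stand in for whatever per-class validation score a real system would use. The loop adds the single best feature at each step and stops when no candidate improves the score, which is how redundant and interfering features end up excluded.

```python
def greedy_feature_selection(candidates, score_fn):
    """Forward selection: repeatedly add the feature that most improves
    score_fn(selected); stop when no remaining feature improves it."""
    selected = []
    best = score_fn(selected)
    remaining = list(candidates)
    while remaining:
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best:      # no strict improvement: stop
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Hypothetical toy score: each feature has an integer usefulness, and
# keeping both "bytes" and "bytes_copy" is penalised as redundant.
USEFULNESS = {"duration": 5, "bytes": 3, "bytes_copy": 3, "noise": -1}


def toy_score(features):
    score = sum(USEFULNESS[f] for f in features)
    if "bytes" in features and "bytes_copy" in features:
        score -= 3   # redundancy penalty
    return score


selected, best = greedy_feature_selection(list(USEFULNESS), toy_score)
```

Greedy selection of this kind is not guaranteed to find the globally best subset; it trades optimality for a number of score evaluations that grows only quadratically with the number of candidate features.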
384. Action decoupled SAC reinforcement learning with discrete-continuous hybrid action spaces.
- Author
-
Xu, Yahao, Wei, Yiran, Jiang, Keyang, Chen, Li, Wang, Di, and Deng, Hongbin
- Subjects
- *
REINFORCEMENT learning , *BLENDED learning , *DEEP learning , *DRONE aircraft , *SPACE environment - Abstract
Most existing Deep Reinforcement Learning (DRL) algorithms apply solely to either discrete or continuous action spaces. However, an agent often has both continuous and discrete actions, which together form a hybrid action space. This paper proposes an action-decoupled algorithm for hybrid action spaces. Specifically, the hybrid action is decoupled, and the original agent in the hybrid action space is abstracted into two agents, each of which has only a discrete or a continuous action space. The discrete and continuous actions are independent of each other and can be executed simultaneously. We use the Soft Actor-Critic (SAC) algorithm as the optimization method and name our proposed algorithm Action Decoupled SAC (AD-SAC). We handle the resulting multi-agent problem using a Centralized Training Decentralized Execution (CTDE) framework and then reduce the concatenation of partial agent observations to avoid interference from redundant observations. We design a hybrid action space environment for Unmanned Aerial Vehicle (UAV) path planning and gimbal scanning using AirSim. The results show that our algorithm has better convergence and robustness than the discretization, relaxation, and Parametrized Deep Q-Networks Learning (P-DQN) algorithms. Finally, we carried out a Hardware in the Loop (HITL) simulation experiment based on Pixhawk to verify the feasibility of our algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2023
385. A lightweight backdoor defense framework based on image inpainting.
- Author
-
Wei, Yier, Gao, Haichang, Wang, Yufei, Gao, Yipeng, and Liu, Huan
- Subjects
- *
ARTIFICIAL neural networks , *INPAINTING , *PAINT - Abstract
Deep neural networks (DNNs) have been shown to be vulnerable to backdoor attacks during training. Most existing backdoor defense methods are designed for specific types of backdoor attacks, and backdoor detection and backdoor mitigation are mostly handled separately. Currently, few general and complete defense frameworks have been developed. In this paper, we propose a lightweight, general, and complete defense framework against three main types of backdoor attacks. It can efficiently detect poisoned images and remove trigger patterns from them without costly retraining of the backdoored model. First, we use the feature difference between clean samples and poisoned samples in the middle layer of the model to distinguish them. Then, we use an image inpainting algorithm to remove the backdoor trigger pattern from the poisoned samples. We deploy three of the most popular backdoor attacks on three datasets to test the effectiveness of our defenses. Extensive experimental results show that our method can effectively defend against various backdoor attacks at a relatively small cost. In particular, we reduce the attack success rate of the more stealthy clean-label poisoning attack from 94.9% to 0.02% with little impact on the classification accuracy of the inpainted images. [ABSTRACT FROM AUTHOR]
- Published
- 2023
386. Textual tag recommendation with multi-tag topical attention.
- Author
-
Xu, Pengyu, Xia, Mingxuan, Xiao, Lin, Liu, Huafeng, Liu, Bing, Jing, Liping, and Yu, Jian
- Subjects
- *
TAGS (Metadata) , *INFORMATION retrieval , *INFORMATION services , *RECOMMENDER systems , *USER-generated content , *NEUROPROSTHESES - Abstract
Tagging can be regarded as the action of connecting relevant user-defined keywords to an item, indirectly improving the quality of the information retrieval services that rely on tags as data sources. Tag recommendation dramatically enhances the quality of tags by assisting users in tagging. Although many studies exist on tag recommendation for textual content, few of them consider two characteristics of real applications, i.e., the long-tail distribution of tags and the topic-tag correlation. In this paper, we propose a Topic-Guided Tag Recommendation (TGTR) model that recommends tags by jointly incorporating dynamic neural topics. Specifically, TGTR first uses a neural topic generator to produce dynamic neural topics that indicate the tags. Then, a sequence encoder is used to distill indicative features from the post. To effectively leverage the topics and alleviate the data imbalance, we design a multi-tag topical attention mechanism that obtains a tag-specific post representation for each tag with the help of the dynamic neural topics. These three modules are joined together in an end-to-end multi-task learning model, which helps the three parts enhance each other and balances the effects of topics and tags. Extensive experiments conducted on four real-world datasets demonstrate that our model outperforms the state-of-the-art approaches by a large margin, especially on tail tags. The code, data and hyper-parameter settings are publicly released for reproducibility. [ABSTRACT FROM AUTHOR]
- Published
- 2023
387. Unbiased feature position alignment for human pose estimation.
- Author
-
Wang, Chen, Zhou, Yanghong, Zhang, Feng, and Mok, P.Y.
- Subjects
- *
HUMAN beings , *INTERPOLATION algorithms , *INTERPOLATION , *PROBLEM solving , *UNBIASED estimation (Statistics) - Abstract
Multi-scale feature fusion is a commonly used module in existing deep-learning models, and feature misalignment occurs in the process of feature fusion. This spatial misalignment hinders the learning of multi-scale semantic representations, yet it has not received much attention. The misalignment problem is caused by the feature position shift introduced by the convolution and interpolation operations in feature fusion. To solve the misalignment problem, this paper formulates the shift error mathematically and proposes a plug-and-play unbiased feature position alignment strategy to align convolution with interpolation. As a model-agnostic approach, unbiased feature position alignment can boost the performance of different models without introducing extra parameters. Furthermore, the unbiased feature position alignment is applied to build an unbiased human pose estimation method. Experimental results demonstrate the effectiveness of the proposed unbiased pose model in comparison to state-of-the-art methods, especially at low resolution. The codes are shared at https://github.com/WangChen100/Unbiased-Feature-Position-Alignment-for-Human-Pose-Estimation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
388. Data hiding during image processing using capsule networks.
- Author
-
Wang, Zichi, Feng, Guorui, Wu, Hanzhou, and Zhang, Xinpeng
- Subjects
- *
CAPSULE neural networks , *HISTOGRAMS , *IMAGE processing , *DIGITAL images , *ELECTRONIC data processing - Abstract
In daily life, some conventional image processing operations, e.g., histogram equalization, and filtering, are widely used to improve the visual quality of digital images. This paper designs an image processing network based on CapsNets (capsule networks), in which additional data can be carried in the processed image. Given an image to be processed, the proposed network is able to achieve some conventional image processing operations with satisfactory results. Meanwhile, additional data can be embedded into the processed image during the process of training, and the existence of additional data cannot be discovered. In this way, additional data can be transmitted secretly via the processed image, which looks normal. Compared with existing data hiding algorithms that embed data by modifying image content, the proposal to embed data during the process of training is more secure. Experimental results verify the effectiveness of the proposed network, including the quality of the processed image, embedding capacity, and security. [ABSTRACT FROM AUTHOR]
- Published
- 2023
389. Adversarial patch attacks against aerial imagery object detectors.
- Author
- Tang, Guijian, Jiang, Tingsong, Zhou, Weien, Li, Chao, Yao, Wen, and Zhao, Yong
- Subjects
- *ARTIFICIAL neural networks, *OBJECT recognition (Computer vision), *AERIAL bombing, *DETECTORS
- Abstract
Although object detectors based on Deep Neural Networks (DNNs) are widely used in various fields, especially for object detection in aerial imagery, it has been observed that a small, elaborately designed patch attached to the images can mislead DNN-based detectors into producing erroneous output. However, in previous works the target detectors being attacked are quite simple and the attack efficiency is relatively low, making such attacks impractical in real scenarios. To address these limitations, a new adversarial patch attack algorithm is proposed in this paper. Firstly, we designed a novel loss function that uses the intermediate outputs of the models, rather than the model's final outputs interpreted by the detection head, to optimize adversarial patches. The experiments conducted on the DOTA, RSOD, and NWPU VHR-10 datasets demonstrate that our method can significantly degrade the performance of the detectors. Secondly, we conducted intensive experiments to investigate the impact of different outputs of the detection model on generating adversarial patches, demonstrating that the class score is not as effective as the objectness score. Thirdly, we comprehensively analyzed the attack transferability across different aerial imagery datasets, verifying that the patches generated on one dataset are also effective in attacking another. Moreover, we proposed ensemble training to boost the attack's transferability across models. Our work raises concerns about the application of DNN-based object detectors in aerial imagery. [ABSTRACT FROM AUTHOR]
- Published
- 2023
390. RSBNet: One-shot neural architecture search for a backbone network in remote sensing image recognition.
- Author
- Peng, Cheng, Li, Yangyang, Shang, Ronghua, and Jiao, Licheng
- Subjects
- *DEEP learning, *REMOTE sensing, *IMAGE recognition (Computer vision), *EVOLUTIONARY algorithms, *SPINE
- Abstract
Recently, a massive number of deep learning-based approaches have been successfully applied to various remote sensing image (RSI) recognition tasks. However, most existing advances of deep learning methods in the RSI field rely heavily on the features extracted by a manually designed backbone network, which severely limits the potential of deep learning models due to the complexity of RSI and the limitation of prior knowledge. In this paper, we research a new design paradigm for the backbone architecture in RSI recognition tasks, including scene classification, land-cover classification and object detection. A novel one-shot architecture search framework based on a weight-sharing strategy and an evolutionary algorithm is proposed, called RSBNet, which consists of three stages. Firstly, a supernet constructed in a layer-wise search space is pretrained on a self-assembled large-scale RSI dataset based on an ensemble single-path training strategy. Next, the pretrained supernet is equipped with different recognition heads through the switchable recognition module and fine-tuned on each target dataset to obtain a task-specific supernet. Finally, we search for the optimal backbone architecture for different recognition tasks based on the evolutionary algorithm without any network training. Extensive experiments have been conducted on five benchmark datasets for different recognition tasks; the results show the effectiveness of the proposed search paradigm and demonstrate that the searched backbone is able to flexibly adapt to different RSI recognition tasks and achieve impressive performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
391. Adaptive Regularized Warped Gradient Descent Enhances Model Generalization and Meta-learning for Few-shot Learning.
- Author
- Rao, Shuzhen, Huang, Jun, and Tang, Zengming
- Subjects
- *MACHINE learning
- Abstract
Warped Gradient Descent (WarpGrad) is a remarkable meta-learning method that transforms gradients by inserting warp-layers. However, the task-shared initialization provided by WarpGrad is difficult to adapt to each task. Moreover, transforming gradients with meta-learned warp-layers ignores the local geometric features or task-specific knowledge, and may lead to a significant risk of overfitting caused by the increase in parameters. In this paper, we propose ARWarpGrad to guarantee better generalization performance with faster convergence speed by modeling both the cross-task and task-specific knowledge. We introduce Initialization Modulation (IM) to meta-learn a task-specific initialization of the task-learner. Furthermore, the Mixed Gradient Preprocessing (MGP), which includes the Adaptive Learning Rates (ALR) and the Gaussian Momentum Dropout (GMD), is put forward to provide a better adaptive optimization direction and step length for task adaptation based on the features of local geometries. In addition, Memory Regularization (MR) is provided to alleviate the overfitting problem effectively with the use of parameter memory. Ultimately, extensive experiments on three settings demonstrate that ARWarpGrad achieves state-of-the-art performance with convergence acceleration and overfitting prevention characteristics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
392. LLP-AAE: Learning from label proportions with adversarial autoencoder.
- Author
- Wang, Bo, Sun, Yingte, and Tong, Qiang
- Subjects
- *MACHINE learning, *SUPERVISED learning, *DEEP learning
- Abstract
This paper presents an effective weakly supervised learning algorithm, LLP-AAE, which leverages the adversarial autoencoder (AAE) for learning from label proportions (LLP), in which only the bag-level proportional information is available. Our LLP-AAE utilizes an autoencoder backbone and performs adversarial training in latent space to match the aggregated posterior distribution of the hidden coding with the prior distributions. In this way, apart from the reconstruction task, the encoder is also dedicated to producing fake samples in order to deceive discriminators as far as possible. Ultimately, the encoder is employed as a competent label predictor for unseen data. In addition to the LLP classifier, our model can also achieve controllable sample generation by feeding the decoder with gradually changing latent code, which is proven to be useful for better LLP performance. We also provide a panoramic explanation for LLP-AAE by regarding the LLP problem as an alternating learning procedure between proportion-based pseudo label generation and discriminative reconstruction. Experiments on six benchmark image datasets demonstrate the advantage of our method both in style manipulation with the latent feature representation and in multi-class LLP performance comparable with the state-of-the-art models. [ABSTRACT FROM AUTHOR]
- Published
- 2023
393. TransAM: Transformer appending matcher for few-shot knowledge graph completion.
- Author
- Liang, Yi, Zhao, Shuai, Cheng, Bo, and Yang, Hao
- Subjects
- *KNOWLEDGE graphs, *PROBLEM solving
- Abstract
Few-shot knowledge graph completion (FSKGC) refers to predicting new facts for a new relation with only a few observed entity pairs (triples) as the support set. Existing solutions to FSKGC mainly conduct the matching process over entity pair representations. Although effective, a major concern with these models is that the entity interactions are not fully explored, based on the observation that they usually generate the pair representation before the matching stage. Such a design inherently overlooks the fine-grained information from entity interactions, leading to performance decrements in one- or three-shot settings, which require matching models to capture richer semantic meanings for prediction. To remedy this issue, in this paper, we explore the entity interactions within and between different instances, i.e., the co-occurrence of two entities, for FSKGC and propose our model named TransAM, Transformer Appending Matcher. TransAM solves the FSKGC problem by computing the probability of an entity sequence with a well-designed transformer matching network. Specifically, TransAM appends the query entity pair to the serialized reference entity sequence and utilizes a transformer to calculate the probability by capturing intra- and inter-triple entity interactions. To bridge the gap between the transformer and the triple structure, TransAM introduces a rotary operation to preserve the head and tail roles of entities within the triple and distinguishes different triples by a separated triple position encoding. Empirical studies on two public benchmark datasets, NELL-One and Wiki-One, show that TransAM outperforms existing metric-learning solutions in MRR and Hits@1 in both one- and three-shot settings, and achieves comparable results in the five-shot setting. Datasets and code will be publicly available at https://github.com/gawainx/TransAM. [ABSTRACT FROM AUTHOR]
- Published
- 2023
394. An efficient fine-grained vehicle recognition method based on part-level feature optimization.
- Author
- Lu, Lei, Cai, Yancheng, Huang, Hua, and Wang, Ping
- Subjects
- *RECOGNITION (Psychology), *FEATURE extraction, *DEEP learning, *OBJECT recognition (Computer vision), *VEHICLES, *VISUALIZATION, *DESCRIPTOR systems
- Abstract
This paper presents an effective method for strengthening the discriminative ability of high-level deep features by enhancing and aggregating discriminative part-level features for the fine-grained vehicle recognition task. In general, the task of visual recognition concentrates more on the visual differences at the object level. However, for fine-grained object recognition, the visual differences between target objects typically exist in local discriminative areas, so the task is more concerned with extracting fine-grained features from these part regions. In this context, we propose solving this issue with a novel feature extraction method from two perspectives: the generation of more feature descriptors of part regions through the learning process of deep networks and the aggregation of part-level discriminative features. This approach is designed to improve the backbone networks to generate finer-level part features through a part-level feature enhancement module and to investigate the intrinsic part-level features of the backbone networks with the help of a feature aggregation module. The enhancement module efficiently finds the finer features highly correlated to the part regions. Then the feature aggregation module builds correlations of similar part features through feature grouping and fusion. Moreover, our proposed method does not require additional part annotations and achieves comparable performance on two widely used benchmarks for recognizing fine-grained vehicle types. Experimental results and explainable visualizations demonstrate the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
395. Contrastive structure and texture fusion for image inpainting.
- Author
- Chen, Long, Yuan, Changan, Qin, Xiao, Sun, Wei, and Zhu, Xiaofeng
- Subjects
- *IMAGE fusion, *INPAINTING, *SEMANTICS
- Abstract
Most recent U-Net based models have shown promising results for the challenging tasks in the image inpainting field. However, they often generate content with blurred textures and distorted structures due to the lack of semantic consistency and texture continuity in the missing regions. In this paper, we propose to restore the missing areas at both structural and textural levels. Our method is built upon a U-Net structure, which repairs images by extracting semantic information from high to low resolution and then decoding it back to the original image. Specifically, we utilize the high-level semantic features learned in the encoder to guide the inpainting of structure-aware features of the adjacent low-level feature map. Meanwhile, low-level feature maps have clearer texture compared with high-level ones, which can be used as a prior for textural repair of high-level feature maps. Subsequently, a module is used to fuse the two repaired feature maps (i.e., structure-aware and texture-aware features) and obtain a feature map with reasonable semantics. Moreover, in order to learn more representative high-level semantic features, we design the model as a siamese network for contrastive learning. Experiments on practical data show that our method outperforms other state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
396. Neural network model based on global and local features for multi-view mammogram classification.
- Author
- Xia, Lili, An, Jianpeng, Ma, Chao, Hou, Hongjun, Hou, Yanpeng, Cui, Linyang, Jiang, Xuheng, Li, Wanqing, and Gao, Zhongke
- Subjects
- *COMPUTER-aided diagnosis, *MAMMOGRAMS, *MEDICAL screening, *BREAST, *DEEP learning, *CLASSIFICATION
- Abstract
Mammography is an important screening criterion for breast cancer, one of the major diseases causing numerous deaths among female patients. Meanwhile, manual diagnosis of mammography is a time-consuming and labor-consuming job. Mammogram classification based on deep learning plays a vital role in computer-aided diagnosis (CAD) systems to mitigate the pressure on physicians. This paper proposes a learning-based multi-view mammogram classification model that captures long-distance dependence and extracts features of multiple receptive fields. Our model considers global and local features of mammography images using Transformer for global features and the proposed multiplex convolutions module for local features. We evaluate our proposed method on a dataset of mammography images obtained from a hospital in China. The proposed method achieves 90.57% accuracy and 94.86% AUC in benign or malignant classification tasks and outperforms other advanced methods for mammogram classification. It is worth noting that the proposed method only requires image-level labels and acts on the whole raw mammogram, which has clinical significance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
397. Distributed finite-time optimization algorithms with a modified Newton–Raphson method.
- Author
- Wang, Dong and Gao, Zhenzhen
- Subjects
- *OPTIMIZATION algorithms, *NEWTON-Raphson method, *MATRIX inversion, *HESSIAN matrices, *MULTIAGENT systems
- Abstract
In this paper, we propose a distributed finite-time optimization protocol for single-integrator continuous-time multi-agent systems, which utilizes a modified Newton–Raphson method. The inverse of the Hessian matrix, the sign function and the gradient are adopted for the design of the algorithms. The proposed algorithms make agents converge to the network optimizer under any initial state and finite time, respectively. The Lyapunov method and the properties of the sign function are employed to verify the convergence of the proposed algorithms. Besides, an adaptive method is also considered to avoid complex parameter conditions and centralized parameters. Finally, numerical simulations are provided to verify the presented results. [ABSTRACT FROM AUTHOR]
- Published
- 2023
398. Hessian regularization of deep neural networks: A novel approach based on stochastic estimators of Hessian trace.
- Author
- Liu, Yucong, Yu, Shixing, and Lin, Tong
- Subjects
- *ARTIFICIAL neural networks, *DYNAMICAL systems, *DATA augmentation, *LYAPUNOV stability
- Abstract
• Connecting Hessian trace with a generalization error bound.
• Flat minima of loss landscape and stability analysis in dynamical systems.
• Efficient Hessian trace regularization algorithm with Dropout.
• Performance comparison on vision and language tasks.
In this paper, we develop a novel regularization method for deep neural networks by penalizing the trace of Hessian. This regularizer is motivated by a recent guarantee bound of the generalization error. We explain its benefits in finding flat minima and avoiding Lyapunov stability in dynamical systems. We adopt the Hutchinson method as a classical unbiased estimator for the trace of a matrix and further accelerate its calculation using a Dropout scheme. Experiments demonstrate that our method outperforms existing regularizers and data augmentation methods, such as Jacobian, Confidence Penalty, Label Smoothing, Cutout, and Mixup. The code is available at https://github.com/Dean-lyc/Hessian-Regularization. [ABSTRACT FROM AUTHOR]
- Published
- 2023
399. Expert-guided contrastive learning for video-text retrieval.
- Author
- Lee, Jewook, Lee, Pilhyeon, Park, Sungho, and Byun, Hyeran
- Subjects
- *COMPUTER vision
- Abstract
• We propose Expert-guided COntrastive learning (ECO) to learn cross-modal alignment.
• Through ablation studies, we validate the effectiveness of each proposed component.
• Ours shows improvement with the expert-based framework and compatibility with CLIP.
Transformers with collaborative experts have become a powerful framework for video-text retrieval. Specifically, experts understand the specialized property of each domain (e.g., appearance, motion, and audio) from videos and the video encoder aggregates those expertise features. However, previous works only implicitly guide the video transformer by solving auxiliary video-text tasks with expertise features, since concatenation for the video transformer is the only effort to exploit the knowledge of experts. In this paper, we propose an expert-guided contrastive loss in order to fully exploit expert knowledge from videos. In detail, we sample a positive bag using an expert-wise similarity matrix to learn the text encoder and decompose the text representation into dynamic and static factors from given videos. Through extensive experiments, we verify the effectiveness of the proposed methods. Notably, we also demonstrate that our method brings significant improvements under the expert-based framework and that it can collaborate with CLIP-based architectures for further performance boosts. [ABSTRACT FROM AUTHOR]
- Published
- 2023
400. Land use and land cover classification with hyperspectral data: A comprehensive review of methods, challenges and future directions.
- Author
- Moharram, Mohammed Abdulmajeed and Sundaram, Divya Meena
- Subjects
- *ZONING, *GENERATIVE adversarial networks, *CONVOLUTIONAL neural networks, *LAND use, *RECURRENT neural networks, *DROUGHT management
- Abstract
Recently, many efforts have been concentrated on land use land cover (LULC) classification due to rapid urbanization, environmental pollution, agricultural drought, frequent floods, and climate change. Hyperspectral imaging, in turn, has attracted attention because it provides informative, discriminative features, such as spectral-spatial features. To this end, this paper is a comprehensive and systematic review of LULC classification using hyperspectral images, organized around four significant research investigations. The four investigations address the following points: (1) the main components of hyperspectral imaging, the modes of hyperspectral imaging with data acquisition methods, and the intrinsic differences between hyperspectral and multispectral images, (2) the role of machine learning in LULC classification, and the standard deep learning methods: Convolutional Neural Network (CNN), Stacked Autoencoder (SAE), Deep Belief Network (DBN), Recurrent Neural Network (RNN), and Generative Adversarial Network (GAN), (3) the standard benchmark hyperspectral datasets and the evaluation criteria, (4) the main challenges of LULC classification with the possible solutions for the limited training samples issue, the promising future directions, and finally the recent applications for LULC classification. [ABSTRACT FROM AUTHOR]
- Published
- 2023