Journal: neurocomputing / Language: undetermined / Publication Year Range: Last 3 years / Search Limiters: References Available / Topic: computer - Searchworks@Jio Institute Digital Library Search Results

Showing total 104 results

1. Generating training images with different angles by GAN for improving grocery product image recognition

Author: Byeong Ho Kang, Yuchen Wei, Sabera Hoque, and Shuxiang Xu
Subjects: Training set, Computer science, business.industry, Cognitive Neuroscience, Deep learning, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Training (meteorology), Mutual information, Machine learning, computer.software_genre, Computer Science Applications, Artificial Intelligence, Product image, Product (category theory), Artificial intelligence, business, computer
Abstract: Image recognition based on deep learning has achieved remarkable results when fed abundant training data. Unfortunately, collecting a large number of annotated images is time-consuming and expensive, especially in grocery product recognition tasks. It is challenging to recognise grocery products accurately when the deep learning model is trained with insufficient data. This paper proposes multi-angle Generative Adversarial Networks (MAGAN), which can generate realistic training images with different angles for data augmentation. Mutual information is employed in the novel GAN to learn angles in an unsupervised manner. This paper aims to create training images containing grocery products from different angles, thus improving grocery product recognition accuracy. We first enlarge the fruit dataset by using MAGAN and the state-of-the-art GAN variants. Then, we compare the top-1 accuracy results from CNN classifiers trained with different data augmentation methods. Finally, our experiments demonstrate that MAGAN outperforms the existing GANs for grocery product recognition tasks, obtaining a significant increase in accuracy.
Published: 2022

2. What-Where-When Attention Network for video-based person re-identification

Author: Yangxu Wu, Ping Chen, Hongying Meng, Tao Lei, and Chenrui Zhang
Subjects: Focus (computing), Exploit, business.industry, Computer science, Cognitive Neuroscience, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Machine learning, computer.software_genre, Computer Science Applications, Discriminative model, Artificial Intelligence, Feature (computer vision), Identity (object-oriented programming), Graph (abstract data type), Artificial intelligence, business, Spatial analysis, computer, Feature learning
Abstract: Video-based person re-identification plays a critical role in intelligent video surveillance by learning temporal correlations from consecutive video frames. Most existing methods aim to handle challenging variations in pose, occlusion, background and so on by using attention mechanisms. Almost all of them focus on occlusion and learn occlusion-invariant video representations by discarding the occluded areas or frames, even though the other areas in these frames contain sufficient spatial information and temporal cues. To overcome these drawbacks, this paper proposes a comprehensive attention mechanism covering what, where, and when to pay attention in discriminative spatial-temporal feature learning, namely the What-Where-When Attention Network (W3AN). Concretely, W3AN designs a spatial attention module that focuses on pedestrian identity and obvious attributes through an importance estimating layer (What and Where), and a temporal attention module that calculates frame-level importance (When), which is embedded into a graph attention network to exploit temporal attention features rather than computing a weighted average feature over video frames as existing methods do. Moreover, experiments on three widely recognized datasets demonstrate the effectiveness of the proposed W3AN model, and the discussion of its major modules elaborates the contributions of this paper.
Published: 2022

3. Smart surgical control under RCM constraint using bio-inspired network

Author: Ameer Tamoor Khan and Shuai Li
Subjects: Test bench, ComputingMethodologies_SIMULATIONANDMODELING, Computer science, Cognitive Neuroscience, Control (management), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Computer Science Applications, Task (project management), Constraint (information theory), Artificial Intelligence, Nonlinear model, Point (geometry), MATLAB, computer, Surgical robot, Simulation, ComputingMethodologies_COMPUTERGRAPHICS, computer.programming_language
Abstract: In this paper, we propose a control framework for intelligent surgical robots under the Remote Center of Motion (RCM) constraint. The goal of a surgical robot is to assist surgeons in performing complex surgeries. The RCM constraint implies that the surgical tip attached to the end-effector of the surgical robot does not slide away from the point of incision while performing surgery. Implementing a control algorithm that complies with RCM constraints is a complicated task because of the nonlinear model of surgical robots and the stringent accuracy conditions imposed by the patient’s safety. This paper proposes an optimization-driven approach to perform the surgical maneuver under RCM constraints. We then applied a bio-inspired optimization algorithm to solve the problem efficiently. For testing the performance of ZNNBAS, we used MATLAB to simulate a surgical procedure. A 7-DOF surgical robot (KUKA LBR IIWA 7) was used as a test bench for running the simulations. The simulation results show that ZNNBAS is comparable with BAS, PSO, and GA, and that it performed the commanded maneuvers efficiently and robustly while enforcing the RCM constraints.
Published: 2022

4. Towards embedding information diffusion data for understanding big dynamic networks

Author: Hong Yang, Qingfeng Tan, Peng Zhang, Li Gao, Haishuai Wang, Chuan Zhou, and Zhao Li
Subjects: Class (computer programming), Diffusion (acoustics), Computer science, Cognitive Neuroscience, Node (networking), Regular polygon, Link (geometry), computer.software_genre, Regularization (mathematics), Computer Science Applications, Global optimal, Artificial Intelligence, Embedding, Data mining, computer
Abstract: Dynamic networks are widely used to describe networks that change with time. Although there has been a large body of research on understanding dynamic networks using link prediction, node classification and community detection, little work is specifically designed to address the challenge posed by the large size of dynamic networks. To this end, we study in this paper an emerging and challenging problem of network coarsening in dynamic networks. Network coarsening refers to a class of network “zoom-out” operations where node pairs and edges are grouped together for efficient analysis on big networks. However, existing network coarsening approaches can only handle static networks where network structure weights have been predefined before the coarsening calculation. Observing that big networks are highly dynamic and naturally change over time, we propose in this paper to embed information diffusion data, which reflect the dynamics of networks, for network coarsening. Specifically, we present a new Semi-NetCoarsen approach that jointly maximizes the likelihood of observing the information diffusion data and minimizes the network regularization with respect to the predefined network structural data. The learning function is convex and we use the accelerated proximal gradient algorithm to obtain the global optimal solution. We conduct experiments on two synthetic and five real-world data sets to validate the performance of the proposed method.
Published: 2021
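
The accelerated proximal gradient algorithm mentioned in the abstract above can be illustrated with a minimal sketch. This is not the Semi-NetCoarsen objective: it assumes a toy convex problem, 0.5*||x - b||^2 + lam*||x||_1, whose exact minimizer is the soft-thresholded vector, so the result is easy to check. The function names `soft_threshold` and `accelerated_proximal_gradient` are hypothetical.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * ||x||_1, applied elementwise."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def accelerated_proximal_gradient(grad, prox, x0, step, iters=200):
    """FISTA-style iteration: a proximal gradient step plus momentum.

    grad(y) returns the gradient of the smooth part at y;
    prox(v, step) applies the proximal operator of the non-smooth part.
    """
    x = list(x0)
    y = list(x0)
    t = 1.0
    for _ in range(iters):
        g = grad(y)
        x_new = prox([yi - step * gi for yi, gi in zip(y, g)], step)
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        momentum = (t - 1.0) / t_new
        y = [xn + momentum * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x


# Toy problem (an assumption for illustration only):
# minimize 0.5 * ||x - b||^2 + lam * ||x||_1
b = [3.0, -0.5, 1.5]
lam = 1.0
solution = accelerated_proximal_gradient(
    grad=lambda y: [yi - bi for yi, bi in zip(y, b)],
    prox=lambda v, step: soft_threshold(v, step * lam),
    x0=[0.0, 0.0, 0.0],
    step=0.5,
)
```

For this toy objective the iterates should approach `soft_threshold(b, lam)`, which gives a simple sanity check on the implementation.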

5. Verification mechanism to obtain an elaborate answer span in machine reading comprehension

Author: Weizhong Qian, Yu Peng, Shijie Hu, Yu Luo, Jingkuan Song, and Xiaoyu Li
Subjects: Computer science, business.industry, Cognitive Neuroscience, media_common.quotation_subject, computer.software_genre, Abstract machine, Computer Science Applications, Task (project management), Focus (linguistics), Comprehension, Reading comprehension, Artificial Intelligence, Reading (process), Artificial intelligence, business, Encoder, computer, Natural language processing, Block (data storage), media_common
Abstract: Machine reading comprehension (MRC) is a challenging task in natural language processing (NLP), which requires a machine to determine the answer corresponding to a given passage and question. However, unanswerable questions always exist in the real world, which poses a new challenge to MRC tasks. Abundant research has been carried out on verification mechanisms for the answerability of passage-question pairs. However, this research focuses only on the design and implementation of such mechanisms, which has limitations in real-world scenarios. Thus, the method proposed in this paper not only verifies answerability, but also validates and adjusts the predicted answer to obtain an elaborate answer span. Using a powerful pre-trained model as the encoder block, this paper explores a more comprehensive verification mechanism. Similar to how humans read passages and give answers, we propose a three-stage mechanism called “Verification for an Elaborate Span” (V4ES): 1) sketchy reading, in which the model briefly browses the overall information of the passage and question and then generates an initial answer; 2) intensive reading, in which it reads the passage and question again, judges the answerability of the question and gives an answer at this stage; 3) verification, in which it verifies the two answers produced at the previous two stages and then gives the final prediction. Moreover, the proposed model is evaluated on two MRC challenge datasets, SQuAD2.0 and CMRC2018, and the experimental results show that our model achieves a great improvement over the ALBERT and BERT baselines. In conclusion, our proposed verification mechanism has demonstrated its effectiveness through a series of experiments and analysis.
Published: 2021

6. JTSG: A joint term-sentiment generator for aspect-based sentiment analysis

Author: Zuocheng Li, Lishuang Li, Anqiao Zhou, and Hongbin Lu
Subjects: Computer science, business.industry, Cognitive Neuroscience, Sentiment analysis, computer.software_genre, Computer Science Applications, Generative model, Artificial Intelligence, Benchmark (computing), Relevance (information retrieval), Artificial intelligence, business, computer, Encoder, Generative grammar, Natural language processing, Sentence, Generator (mathematics)
Abstract: This paper focuses on two related sub-tasks of aspect-based sentiment analysis, namely aspect-term extraction and aspect sentiment classification. The former aims to extract aspect-terms from given sentences and the latter aims to identify the sentiment polarity expressed on the extracted terms. With practical applications in mind, researchers favour joint methods over pipeline methods. However, existing joint methods cannot model the interaction between aspect-terms and the sentence they belong to, or consider the relevance among the sentiments of different aspect-terms. In this paper, a novel end-to-end generative model based on an encoder-decoder, namely the Joint Term-Sentiment Generator (JTSG), is presented to generate all aspect term-polarity pairs. Specifically, an encoder based on a pre-trained model is used to encode the sentences, and the decoder generates the start and end positions that determine an aspect-term, rather than generating the aspect-terms themselves. This new generative method helps avoid generating incomplete aspect-terms. Experimental results demonstrate that the proposed approach yields competitive performance on three benchmark datasets.
Published: 2021

7. Probabilistic faster R-CNN with stochastic region proposing: Towards object detection and recognition in remote sensing imagery

Author: Dewei Yi, Wen-Hua Chen, and Jinya Su
Subjects: Contextual image classification, Computer science, Cognitive Neuroscience, Probabilistic logic, Region proposal, Algorithm robustness, Mixture model, computer.software_genre, Object detection, Computer Science Applications, End-to-end principle, Artificial Intelligence, Data mining, False positive rate, computer
Abstract: Object detection is one of the most important tasks involved in intelligent agriculture systems, especially in pest detection. This paper focuses on one of the most devastating agricultural disasters: grasshopper plagues. Grasshopper detection and monitoring is of paramount importance in preventing grasshopper plagues. This paper proposes a probabilistic faster R-CNN algorithm with stochastic region proposing, where a probabilistic region proposal network, an image classification network, and an object detection network are integrated to detect and locate grasshoppers. More specifically, in the proposed framework, the probabilistic region proposal network considers attributes (e.g. size, shape) of region proposals and the image classification network identifies the existence of grasshoppers, while the object detection network scores recognition confidence for a region proposal. By integrating these three networks, the uncertainty can be passed from end to end, and the final confidence obtained for each region proposal can be explicitly quantified. To enhance algorithm robustness, a stochastic region proposing algorithm is developed to screen region proposals rather than using a predetermined threshold. The proposed algorithm is validated on recently collected grasshopper datasets. The experimental results demonstrate that the proposed algorithm not only outperforms competing algorithms in terms of average precision (0.91), average missed rate (0.36), and maximum F1-score (0.9263), but also reduces the false positive rate of recognising the existence of grasshoppers in an open field.
Published: 2021

8. STC-NAS: Fast neural architecture search with source-target consistency

Author: Xiaowei Li, Yinhe Han, Zihao Sun, Jilin Mei, Shun Lu, Yu Hu, and Longxing Yang
Subjects: Computer science, Cognitive Neuroscience, Process (computing), Sample (statistics), computer.software_genre, Computer Science Applications, Consistency (database systems), Task (computing), Artificial Intelligence, Leverage (statistics), Data mining, Architecture, Divergence (statistics), computer, Network model
Abstract: Neural architecture search (NAS) has shown very promising results for automatically designing network models. Most existing cell-based NAS approaches generate the target network model from a source super-network, which usually suffers from inconsistency issues. In this paper, we propose a new NAS method named STC-NAS, a fast neural architecture search with source-target consistency, so that not only is the performance of the searched target model improved but the search process is also accelerated. Specifically, during the search phase, we sample the source super-network so that the samples are consistent with the target model. Moreover, we leverage the Jensen-Shannon divergence to ensure the samples are optimized in the direction of being more similar to the target model. Experimental results demonstrate that our method needs only 0.059 GPU-days to search on CIFAR-10. Benefiting from its efficiency, STC-NAS can directly search the target super-network on the target task datasets, achieving 2.42% test error on CIFAR-10, 16.45% test error on CIFAR-100, and 24.2% test error on ImageNet datasets.
Published: 2022
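
For readers unfamiliar with the Jensen-Shannon divergence named in the abstract above, the following is a minimal sketch of the standard definition for two discrete probability distributions. How STC-NAS applies it to sampled architectures is not described in this listing, so the code only shows the divergence itself; the function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: the average KL divergence to the mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Identical distributions give 0; disjoint ones give the maximum, ln 2.
    print(js_divergence([0.5, 0.5], [0.5, 0.5]))
    print(js_divergence([1.0, 0.0], [0.0, 1.0]))
```

Unlike the KL divergence, this quantity is symmetric and bounded (by ln 2 when natural logarithms are used), which is one common reason for choosing it as a similarity objective.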

9. Off-policy algorithm based Hierarchical optimal control for completely unknown dynamic systems

Author: Jiayu Chen, Xiaohong Cui, Binrui Wang, and Suan Xu
Subjects: Kronecker product, Scheme (programming language), Computer science, Cognitive Neuroscience, Computation, Optimal control, Computer Science Applications, symbols.namesake, Artificial Intelligence, Vectorization (mathematics), Stackelberg competition, symbols, Key (cryptography), Reinforcement learning, Algorithm, computer, computer.programming_language
Abstract: This paper proposes an online reinforcement learning (RL) method for solving Stackelberg games with completely unknown dynamic systems. To deal with the hierarchical optimal control problem, the key is to find the solution of a two-level optimal control problem in which the leader’s optimal problem is constrained. Firstly, the leader-follower coupled Hamiltonian-Jacobi (HJ) equations subject to the follower’s costate equation are derived. Secondly, an off-policy scheme is designed following the model-based policy iteration (PI) algorithm. The improved off-policy algorithm is given for obtaining the solution to the leader-follower coupled HJ equations. The algorithm is built without requiring any knowledge of the dynamic system, and the leader-follower optimal control policies are directly constructed from the extracted data. Meanwhile, the existence of the Stackelberg equilibrium is demonstrated. Lastly, NNs for each player are built, and NN learning is accomplished online with the Kronecker product technique and vectorization, which simplifies the NN form and could decrease the computational burden. Simulation examples are presented to demonstrate the proposed learning algorithm.
Published: 2022
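
The abstract above mentions the Kronecker product technique and vectorization. A minimal sketch of the standard identity behind that pairing, vec(AXB) = (Bᵀ ⊗ A) vec(X) with column-stacking vec, is given below in plain Python. Whether the paper uses exactly this form is an assumption; the helper names and the small matrices are hypothetical and chosen only so the result can be checked by hand.

```python
def matmul(A, B):
    """Matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(M):
    return [list(col) for col in zip(*M)]


def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def vec(M):
    """Stack the columns of M into a single flat list."""
    return [M[i][j] for j in range(len(M[0])) for i in range(len(M))]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Small illustrative matrices (assumed values, not from the paper).
A = [[1, 2], [3, 4]]
X = [[0, 1], [1, 0]]
B = [[2, 0], [1, 3]]

lhs = vec(matmul(matmul(A, X), B))          # vec(A X B)
rhs = matvec(kron(transpose(B), A), vec(X))  # (B^T kron A) vec(X)
```

The identity turns an equation that is linear in the unknown matrix X into an ordinary linear system in vec(X), which is the usual motivation for combining the two operations.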

10. A spatiotemporal multi-feature extraction framework for opinion mining

Author: Tiankuo Li, Juan Li, Xiaojie Sun, Zheng Dong, Hongji Xu, Shidi Fan, Zhi Liu, and Qiang Liu
Subjects: Artificial neural network, Relation (database), Semantic feature, Computer science, business.industry, Cognitive Neuroscience, Deep learning, Feature extraction, Sentiment analysis, computer.software_genre, Field (computer science), Computer Science Applications, Artificial Intelligence, Data mining, Artificial intelligence, business, computer, Natural language
Abstract: With the rapid development of Internet technology and the explosive growth of digital text, opinion mining has become one of the important research hotspots in the field of natural language processing (NLP). In recent years, neural network based deep learning algorithms have been applied in the field of opinion mining. Given the relation between the temporal and spatial dimensions of text data and the characteristics of natural language itself, traditional deep learning algorithms cannot extract features comprehensively. In this paper, we propose a new deep learning framework for opinion mining, which includes a temporal feature extraction layer that consists of two layers of bidirectional simple recurrent unit (Bi-SRU) networks extracting features at the word and grammar levels; a semantic feature extraction layer that mainly contains a multi-head attention module; and a spatial feature extraction layer with dilated convolution that is used to extract opinion preference features. The Internet movie database (IMDb) is used to verify the performance of the proposed framework. The experimental results show that the proposed framework can effectively improve classification accuracy and performs better than the compared algorithms.
Published: 2022

11. Modeling long-term video semantic distribution for temporal action proposal generation

Author: Sicheng Zhao, Xiaoshuai Sun, Tingting Han, and Jun Yu
Subjects: Computer science, business.industry, Cognitive Neuroscience, Context (language use), ENCODE, Machine learning, computer.software_genre, Semantics, Computer Science Applications, Term (time), Action (philosophy), Artificial Intelligence, Benchmark (computing), Embedding, Segmentation, Artificial intelligence, business, computer
Abstract: Video temporal segmentation plays a vital role in video analysis since many higher-level computer vision tasks rely on it. Some recent efforts have been dedicated to generating temporal action proposals for long and untrimmed videos, which requires methods to generate accurate boundaries for video semantics. In this paper, we propose a novel and efficient Temporal Distribution Network (TDN), to model the long-term distribution of video semantic units (video dictionary). Firstly, we encode the semantics and context relations of video segments with a boundary-specified video embedding method. Then based on temporal convolutional layers, we design a Temporal Distribution Network (TDN) enumerating all the possible temporal locations in one pass and generating proposals that have high action confidence scores by capturing the long-term distributions of video semantics. We validate our method on temporal action proposal generation tasks and action detection tasks. Experimental results on two benchmark datasets, THUMOS14 and ActivityNet-1.3, show that the proposed method can significantly outperform the state-of-the-art approaches. Our model could obtain high-quality action proposals with a much faster speed.
Published: 2022

12. Classifying cybergrooming for child online protection using hybrid machine learning model

Author: Gustavo Isaza, Fabián Muñoz, Felipe Buitrago, and Luis Fernando Castillo
Subjects: Hybrid machine, Artificial neural network, business.industry, Computer science, Cognitive Neuroscience, Semantic analysis (machine learning), Context (language use), Machine learning, computer.software_genre, Convolutional neural network, Computer Science Applications, Artificial Intelligence, Classifier (linguistics), False positive paradox, Artificial intelligence, Representation (mathematics), business, computer
Abstract: This paper presents a computational model that classifies cybergrooming attacks in the context of COP (child online protection) using Natural Language Processing (NLP) and Convolutional Neural Networks (CNN). The model predicts a high number of false positives, and therefore has low precision and F-score, but a high accuracy. For this problem, where the number of grooming messages is so low compared to the number of conversations and messages from other contexts, it can be concluded that this is a very consistent and useful result, as it captures a high number of true positives, considering that the classifier works on messages. Training machine learning algorithms with neural networks, semantic analysis and NLP allows an approximate representation of knowledge, contributing to the discovery of pseudo-intelligent information in these environments, reducing human intervention in characterizing underlying abnormal behavior, and detecting messages that potentially represent these attacks.
Published: 2022

13. A predictive and user-centric approach to Machine Learning in data streaming scenarios

Author: Paulo Novais, Fábio Silva, Davide Carneiro, and Miguel Guimarães
Subjects: Interface (Java), Computer science, business.industry, Cognitive Neuroscience, Retraining, 02 engineering and technology, Decision problem, Machine learning, computer.software_genre, Abstract machine, Computer Science Applications, Artificial Intelligence, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, User-centered design
Abstract: Machine Learning has emerged in recent years as the main solution to many of today’s data-based decision problems. However, while new and more powerful algorithms and the increasing availability of computational resources have contributed to the widespread use of Machine Learning, significant challenges still remain. Two of the most significant today are the need to explain a model’s predictions, and the significant costs of training and re-training models, especially with large datasets or in streaming scenarios. In this paper we address both issues by proposing an approach we deem predictive and user-centric. It is predictive in the sense that it estimates the benefit of re-training a model with new data, and it is user-centric in the sense that it implements an explainable interface that produces interpretable explanations that accompany predictions. The former reduces the resources (e.g. time, costs) spent on retraining models when no improvements are expected, while the latter gives human users additional information to support decision-making. We validate the proposed approach with a group of public datasets and present a real application scenario.
Published: 2022

14. A novel parameters correction and multivariable decision tree method for edge computing enabled HGR system

Author: Wei He, Mu Zhou, Yong Wang, and Bang Wang
Subjects: Computational complexity theory, Artificial neural network, Computer science, business.industry, Cognitive Neuroscience, Decision tree, Cloud computing, Machine learning, computer.software_genre, Computer Science Applications, Transplantation, Artificial Intelligence, Gesture recognition, Server, Artificial intelligence, business, computer, Edge computing
Abstract: With the rapid development of cloud computing, the Internet of Things and artificial intelligence, human–computer interaction (HCI) is playing an increasingly important role in daily life. As an important component of HCI, a hand gesture recognition (HGR) system is usually combined with an edge computing server, utilizing machine learning, including neural networks, decision trees, and integrated learning, to achieve a low-latency and high-reliability service. High-precision HGR with low computational complexity is a prerequisite for the commercialization of gesture recognition. Therefore, this paper proposes a high-precision parameter correction algorithm based on the established scattered-point model and the outlier detection scheme, and then presents a recognition algorithm with a multivariable decision tree for dynamic hand gestures. The experimental results show that the proposed algorithms can improve the recognition accuracy and effectively reduce the running time, which is conducive to algorithm transplantation and model deployment on edge servers.
Published: 2022

15. Adaptive multi-task learning for cross domain and modal person re-identification

Author: Lin Xu, Shiyang Yan, and Jianan Zhao
Subjects: Scheme (programming language), Matching (statistics), Source code, Computer science, business.industry, Cognitive Neuroscience, media_common.quotation_subject, Multi-task learning, Viewpoints, Machine learning, computer.software_genre, Visual appearance, Computer Science Applications, Domain (software engineering), Modal, Artificial Intelligence, Artificial intelligence, business, computer, media_common, computer.programming_language
Abstract: Person re-identification (re-ID) aims at matching a person-of-interest across various non-overlapping cameras with distinct visual appearance variations. Existing research methods mainly employ deep neural models trained on large-scale person re-ID datasets, achieving good performance. However, these methods are primarily deployed only on visual data, which can be easily influenced by environmental variations (e.g., viewpoints, poses, and illumination). In this paper, we propose an adaptive multi-task learning (MTL) scheme for cross domain and modal person re-ID. It can effectively utilize the visual and language information from multiple datasets to improve learning performance. Comprehensive experiments are also conducted on the widely used person re-ID datasets, i.e., Market-1501 and DukeMTMC-reID, validating the effectiveness of the proposed method. It can model the domain difference and the relationship between the vision and language modalities and achieve state-of-the-art performance. The source code of our proposed method will be available at https://github.com/emdata-ailab/Multitask_Learning_ReID.
Published: 2022

16. Co-attention dictionary network for weakly-supervised semantic segmentation

Author: Weitao Wan, Ming-Hsuan Yang, Jiansheng Chen, and Huimin Ma
Subjects: Class (computer programming), Artificial neural network, Basis (linear algebra), business.industry, Computer science, Cognitive Neuroscience, Pattern recognition, Pascal (programming language), Semantics, Computer Science Applications, Image (mathematics), Artificial Intelligence, Benchmark (computing), Segmentation, Artificial intelligence, business, computer, computer.programming_language
Abstract: In this paper, we propose the co-attention dictionary network (CODNet) for weakly-supervised semantic segmentation using only image-level class labels. The CODNet model exploits extra semantic information by jointly leveraging a pair of samples with common semantics through co-attention rather than processing them independently. The inter-sample similarities of spatially distributed deep features are computed to merge reference features through non-local connections. To discover similar patterns regardless of appearance variations, we propose to extract image representations by equipping the neural networks with dictionary learning which provides the universal basis elements for different images. Based on the CODNet model, we propose a multi-reference class activation map (MR-CAM) algorithm which generates semantic segmentation masks for a target image by jointly merging semantic cues from multiple reference images. Experimental results on the PASCAL VOC 2012 and MSCOCO benchmark datasets for weakly-supervised semantic segmentation show that the proposed algorithm performs favorably against the state-of-the-art methods.
Published: 2022

17. Towards hour-level crime prediction: A neural attentive framework with spatial–temporal-categorical fusion

Author: Weichao Liang, Haicheng Tao, Youquan Wang, and Jie Cao
Subjects: Dependency (UML), Property (programming), business.industry, Computer science, Mechanism (biology), Cognitive Neuroscience, Machine learning, computer.software_genre, Computer Science Applications, Artificial Intelligence, Spatial Dependency, Artificial intelligence, business, computer, Categorical variable
Abstract: As one of the most complex social problems around the world, crime may expose the public to the risk of death or loss of property if not handled properly. Crime prediction, which aims at predicting crime incidents before they happen, is of great importance in the fight against crime. Previous studies are concerned primarily with day-level crime prediction and have certain limitations in modeling the complex spatial–temporal-categorical dependency contained in criminal activities as well as in utilizing external factors to facilitate the forecast. In this paper, we develop a novel Neural Attentive framework for Hour-level Crime prediction (NAHC) to cope with these challenges. Specifically, we first adopt a prior knowledge-based data enhancement strategy to alleviate the zero-inflation issue that arises in hour-level settings. Then, multi-graph convolutional networks are applied to capture spatial dependency from different aspects. After that, we integrate gated recurrent units with a temporal attention mechanism to jointly address temporal dependency and capture time-sensitive external factors. A categorical attention mechanism is proposed for dealing with categorical dependency, and finally a fully connected network is utilized to generate the final prediction results. Extensive experiments on two real-world crime datasets demonstrate the effectiveness of our framework over state-of-the-art comparison methods.
Published: 2022

18. Spatial-wise and channel-wise feature uncertainty for occluded person re-identification

Author: Hefei Ling, Ping Li, Yuxuan Shi, Weiyi Tian, and Zongyi Li
Subjects: Parsing, Channel (digital image), Computer science, business.industry, Cognitive Neuroscience, Pattern recognition, computer.software_genre, Facial recognition system, Computer Science Applications, Domain (software engineering), Image (mathematics), Artificial Intelligence, Feature (computer vision), Noise (video), Artificial intelligence, business, computer, Pose
Abstract: Occluded person re-identification is a challenging task since the available data often suffers from information incompleteness and spatial misalignment. Most state-of-the-art occluded models rely on external models to provide additional semantic information. However, for the time being, external models, such as the human parsing model and the pose estimation model, cannot provide accurate semantic information under a complex occlusion environment and may introduce errors to the Re-ID model instead. In this paper, we propose an occluded person Re-ID model that mines the latent recognizable information of the person image itself, without the help of external models. Feature/data uncertainty can reduce the influence of noisy samples in datasets and has been discussed in person Re-ID and face recognition. We extend the uncertainty to the micro feature level and propose spatial-wise and channel-wise feature uncertainty to constantly refine the features in the spatial domain and the channel domain, respectively, during feature construction by weakening the influence of noisy features. Extensive experiments on occluded and holistic datasets demonstrate the effectiveness of our proposed methods.
Published: 2022

19. An ensemble of random decision trees with local differential privacy in edge computing

Author: Xiaotong Wu, Xiaolong Xu, Lianyong Qi, Jiaquan Gao, and Genlin Ji
Subjects: Information privacy, Computer science, Cognitive Neuroscience, Decision tree, Computer security, computer.software_genre, Computer Science Applications, Task (project management), Key factors, Artificial Intelligence, Differential privacy, Enhanced Data Rates for GSM Evolution, computer, Implementation, Edge computing
Abstract: Edge computing is an emerging computing paradigm, which offers a great opportunity to implement data mining-based services and applications for a large number of devices and sensors in the Internet of Things. However, the new paradigm faces security and privacy challenges due to the diversity and the limited capability of edge components. In particular, data privacy is one of the greatest concerns for all the participants. In this paper, we propose a framework for privacy-preserving data mining based on private random decision trees in edge computing, which not only gives a strong privacy guarantee, but also provides a certain amount of data utility. Firstly, we design a preservation framework to implement private random decision trees satisfying local differential privacy. Secondly, we present the concrete implementations of the algorithms and the corresponding task that each participant needs to undertake. Thirdly, we analyze the key factors that influence privacy and utility, including the allocation of data and privacy budget. Fourthly, we give improved algorithms to further increase the utility with strong privacy preservation. Finally, extensive experiments demonstrate the good performance of our designed framework.
Published: 2022
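
Note: The local differential privacy setting named in the abstract above can be illustrated with binary randomized response, the textbook local mechanism. This is a minimal sketch, not the paper's random decision tree algorithm; the function names and the single-bit setting are assumptions made for illustration.

```python
import math
import random


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability e^eps / (e^eps + 1), otherwise flip it.

    Each device runs this locally, so the collector never sees the raw bit.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def estimate_frequency(reports: list[int], epsilon: float) -> float:
    """Debias the noisy reports to estimate the true fraction of 1s.

    E[observed] = f * (2p - 1) + (1 - p), solved here for f.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

A smaller epsilon flips more bits (stronger privacy) and makes the debiased estimate noisier, which is the privacy and utility trade-off the abstract refers to.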

20. A collective neurodynamic approach for solving distributed system optimum dynamic traffic assignment problems

Author: Xinli Shi, Jinde Cao, and Xiangping Xu
Subjects: Scheme (programming language), Optimization problem, Computer science, Cognitive Neuroscience, Distributed computing, Control (management), Perspective (graphical), Telecommunications network, Computer Science Applications, Artificial Intelligence, Convergence (routing), computer, Protocol (object-oriented programming), Cell Transmission Model, computer.programming_language
Abstract: In this paper, the traditional system optimum dynamic traffic assignment (SO-DTA) problem is solved by distributed multi-agent dynamics. The goal of SO-DTA is to optimally control the route choice of each user to minimize the total travel time by all users over the assignment time period. It is beneficial for reducing the congestion of the traffic network with time-varying demand. Different from the traditional SO-DTA which is formulated and solved in a centralized scheme, we aim at solving it from a multi-agent perspective in a communication network. Based on the cell transmission model, two connector-based relaxed SO-DTA models are provided in a multi-agent system framework for the traffic network with a single destination and multiple destinations, respectively. In the provided models, each cell connector is treated as an agent, which could exchange information with its adjacent connectors to accomplish the system optimization objective. Then, a collective neurodynamic system equipped with the proposed distributed protocol is used to solve general network optimization problems including the above SO-DTA models as special cases. The convergence analysis is further given for the proposed algorithm. Numerical studies over various traffic networks are presented to show the effectiveness of the proposed method.
Published: 2022

21. Energy-efficient VM opening algorithms for real-time workflows in heterogeneous clouds

Author: Young-June Choi, Xin Dai, Tingrui Pei, Hiroo Sekiya, Saiqin Long, and Jiasheng Cao
Subjects: Computational complexity theory, Computer science, Cognitive Neuroscience, Energy consumption, computer.software_genre, Computer Science Applications, Scheduling (computing), Task (computing), Artificial Intelligence, Virtual machine, Frequency scaling, computer, Algorithm, Energy (signal processing), Efficient energy use
Abstract: Minimizing energy consumption is a critical challenge for real-time workflows, particularly in heterogeneous cloud computing systems. State-of-the-art algorithms aim to minimize the energy consumed for processing such applications by choosing virtual machines (VMs) to shut down from all opened VMs (i.e., VM merging). However, such VM merging through an “on-to-close” approach usually incurs high computational complexity. This paper proposes an energy-efficient VM opening (EEVO) algorithm that is capable of choosing VMs to turn on from all closed VMs while satisfying the real-time constraint of applications. Considering that there are slacks that can be eliminated or reduced between adjacently scheduled tasks after using the EEVO algorithm, a dynamic scaling down EEVO algorithm (DEEVO) is further proposed. DEEVO is implemented by scaling down the frequency of VMs executing each task based on the dynamic voltage and frequency scaling (DVFS) technique. Experimental results demonstrate that, with the above-mentioned improvements, DEEVO achieves lower energy consumption for real-time workflows than state-of-the-art algorithms do. In addition, DEEVO outperforms state-of-the-art algorithms in the computational efficiency of accomplishing task scheduling.
Published: 2022
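
Note: The slack-reclamation idea behind DVFS mentioned in the abstract above can be sketched with a common textbook model in which dynamic power grows as k·f³, so the energy for a fixed number of cycles grows as k·f². This is an assumed simplification, not the DEEVO algorithm; the function names, the parameter `k`, and the units are hypothetical.

```python
def slowest_feasible_frequency(cycles: float, slack: float,
                               f_max: float, f_min: float) -> float:
    """Lowest frequency that still finishes the task inside its time window.

    The window is the execution time at f_max plus the slack before the
    next scheduled task; the result is clamped to [f_min, f_max].
    """
    base_time = cycles / f_max
    window = base_time + slack
    return min(f_max, max(f_min, cycles / window))


def dynamic_energy(cycles: float, frequency: float, k: float = 1.0) -> float:
    """Energy under the assumed model: P = k * f^3 and t = cycles / f."""
    return k * frequency ** 2 * cycles
```

Under this model, a task of 2 giga-cycles on a 2 GHz VM with 1 s of slack can run at 1 GHz and uses a quarter of the dynamic energy.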

22. FedSim: Similarity guided model aggregation for Federated Learning

Author: Anjana Wijekoon, Chamath Palihawadana, Harsha Kumara Kalutarage, and Nirmalie Wiratunga
Subjects: Basis (linear algebra), Client data, Computer science, business.industry, Cognitive Neuroscience, Variance (accounting), Machine learning, computer.software_genre, Federated learning, Computer Science Applications, Model aggregation, Constraint (information theory), Study heterogeneity, Similarity (network science), Artificial Intelligence, Artificial intelligence, business, computer
Abstract: Federated Learning (FL) is a distributed machine learning approach in which clients contribute to learning a global model in a privacy preserved manner. Effective aggregation of client models is essential to create a generalised global model. To what extent a client is generalisable and contributing to this aggregation can be ascertained by analysing inter-client relationships. We use similarity between clients to model such relationships. We explore how similarity knowledge can be inferred from comparing client gradients, instead of inferring similarity on the basis of client data which violates the privacy-preserving constraint in FL. The similarity-guided FedSim algorithm, introduced in this paper, decomposes FL aggregation into local and global steps. Clients with similar gradients are clustered to provide local aggregations, which thereafter can be globally aggregated to ensure better coverage whilst reducing variance. Our comparative study also investigates the applicability of FedSim in both real-world datasets and on synthetic datasets where statistical heterogeneity can be controlled and studied systematically. A comparative study of FedSim with state-of-the-art FL baselines, FedAvg and FedProx, clearly shows significant performance gains. Our findings confirm that by exploiting latent inter-client similarities, FedSim’s performance is significantly better and more stable compared to both these baselines.
Published: 2022
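
Note: The two-step aggregation described in the abstract above (cluster clients with similar gradients, aggregate within clusters, then aggregate across clusters) can be sketched as follows. The abstract does not state which clustering method or weighting FedSim uses, so the greedy cosine-threshold clustering, the unweighted means, and all names here are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two gradient vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cluster_by_similarity(gradients, threshold=0.9):
    """Greedy clustering (assumed): join the first cluster whose first member is similar enough."""
    clusters = []
    for idx, grad in enumerate(gradients):
        for members in clusters:
            if cosine(grad, gradients[members[0]]) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append([idx])  # no similar cluster found, start a new one
    return clusters


def mean(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_step_aggregate(gradients, threshold=0.9):
    """Local step: average within each cluster. Global step: average the cluster means."""
    clusters = cluster_by_similarity(gradients, threshold)
    local = [mean([gradients[i] for i in members]) for members in clusters]
    return mean(local), clusters
```

Averaging cluster means rather than all clients directly keeps a large group of near-identical clients from dominating the global update, which is one way to read the "better coverage whilst reducing variance" claim.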

23. Data-based decentralized learning scheme for nonlinear systems with mismatched interconnections

Author: Chaoxu Mu, Jiangwen Peng, Hao Luo, and Ke Wang
Subjects: Scheme (programming language), Mathematical optimization, Artificial neural network, Computer science, Cognitive Neuroscience, Optimal control, Decentralised system, Computer Science Applications, System dynamics, Electric power system, Nonlinear system, Artificial Intelligence, Reinforcement learning, computer, computer.programming_language
Abstract: In this paper, the decentralized learning scheme for nonlinear systems with mismatched interconnections is developed by using the off-policy integral reinforcement learning algorithm. First, the decentralized control of the overall system is transformed into the optimal control of each subsystem by introducing an auxiliary control. In order to relax the knowledge of system dynamics, a model-free policy iteration algorithm is derived based on the off-policy integral reinforcement learning. Then, the model-free policy iteration algorithm is used to solve the related Hamilton-Jacobi-Bellman equations, where only the collected system data is required. For implementation purposes, neural networks are employed to approximate the optimal cost functions and the optimal control policies, respectively. Moreover, the least squares method and the experience replay technique are combined to learn neural network weights. Finally, a mismatched interconnected system and a photovoltaic power system are presented to verify the effectiveness of the proposed algorithm.
Published: 2022

24. SA-CGAN: An oversampling method based on single attribute guided conditional GAN for multi-class imbalanced learning

Author: Yao Dong, Huaxin Xiao, and Yongfeng Dong
Subjects: Class (computer programming), Computer science, business.industry, Test data generation, Cognitive Neuroscience, Binary number, Construct (python library), Machine learning, computer.software_genre, Minimax, Filter (higher-order function), Computer Science Applications, Artificial Intelligence, Oversampling, Artificial intelligence, business, computer, Generative grammar
Abstract: Imbalanced data can always be observed in our daily life and various practical tasks. Many well-constructed machine learning methods perform poorly when applied to this kind of data. This stems from the high training bias towards the majority class instances. Among all the solutions to this problem, data generation for the minority class is always considered the most effective approach. However, in all the previous works, data are always processed sample-wise and the distribution of each single data attribute is never considered. So, in this paper, to estimate how each attribute contributes to its label, we explore the potential connection between the two by Conditional Generative Adversarial Networks (CGAN) separately and individually. Then, the constructed new instances are purified by a designed attribute-based minimax filter and the survivors are concatenated to form the final generated data. In other words, unlike the CGAN-based data generation approach, the proposed approach improves on it by additionally considering all the single-attribute patterns of the data to construct new instances. In addition, we extend the binary class imbalanced learning framework to a multi-class one. In the experimental part, the improved model is compared against GAN, CGAN and some other standard multi-class oversampling algorithms on several widely used datasets. Results, in terms of four common measurements, have shown that the proposed approach can produce comparable and always superior performance when compared with the competitors.
Published: 2022

25. OPTDP: Towards optimal personalized trajectory differential privacy for trajectory data publishing

Author: Wang Miao, Chen Wang, Ruxue Wen, Haojun Huang, and Wenqing Cheng
Subjects: Mobility model, Matching (statistics), Computer science, Cognitive Neuroscience, Probabilistic logic, Data publishing, computer.software_genre, Computer Science Applications, Information sensitivity, Semantic similarity, Artificial Intelligence, Trajectory, Differential privacy, Data mining, computer
Abstract: With the development of location-based applications, more and more trajectory data are collected. Trajectory data often contains users’ sensitive information, and releasing it directly may pose a threat to users’ privacy. Differential privacy, as a privacy-preserving method with a solid mathematical foundation, has been widely used in trajectory data publishing. However, current trajectory data publishing methods based on differential privacy cannot fully realize personalized privacy protection. In this paper, an optimal personalized trajectory differential privacy mechanism is proposed. Firstly, by establishing the probabilistic mobility model of trajectories, we cluster the locations to achieve semantic location matching between different trajectories. Based on the semantic similarity, we identify the template trajectory, and propose a privacy level allocation method based on stay-points and frequent sub-trajectories. Then, according to the location matching results, we can automatically identify the privacy level of all locations. Combined with the optimal location differential privacy mechanism, we perturb the location points on the user’s trajectory before publishing, where different location privacy levels correspond to different privacy budgets. Experimental results on real-world datasets show that our mechanism provides a better tradeoff between privacy protection and data utility compared with traditional differential privacy methods.
Published: 2022
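
Note: The idea that "different location privacy levels correspond to different privacy budgets" can be sketched by perturbing each point with noise whose scale depends on its level. The sketch swaps in the standard Laplace mechanism, not the paper's optimal location mechanism, and is illustrative only; the level names, the epsilon values in `LEVEL_TO_EPSILON`, and the function names are hypothetical.

```python
import random

# Hypothetical mapping: a more sensitive location gets a smaller budget (more noise).
LEVEL_TO_EPSILON = {"low": 1.0, "medium": 0.5, "high": 0.1}


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_trajectory(points, levels, rng, sensitivity=1.0):
    """Add per-coordinate Laplace noise with scale sensitivity / epsilon(level)."""
    noisy = []
    for (x, y), level in zip(points, levels):
        scale = sensitivity / LEVEL_TO_EPSILON[level]
        noisy.append((x + laplace_noise(scale, rng), y + laplace_noise(scale, rng)))
    return noisy
```

Points tagged "high" receive roughly ten times the average displacement of points tagged "low" under these example budgets, which shows how a per-location budget trades utility for protection.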

26. A survey of crowd counting and density estimation based on convolutional neural network

Author: Yaowei Wang, Yudong Zhang, Hong Zhang, Zizhu Fan, Zheng Zhang, and Guangming Lu
Subjects: Computer science, business.industry, Cognitive Neuroscience, Frame (networking), Density estimation, Machine learning, computer.software_genre, Convolutional neural network, Field (computer science), Computer Science Applications, Artificial Intelligence, Data_GENERAL, Benchmark (computing), Artificial intelligence, Estimation methods, business, Focus (optics), computer, Crowd counting
Abstract: Crowd counting and crowd density estimation methods are of great significance in the field of public security. Estimating crowd density and counts from a single image or video frame has become an essential part of a computer vision system in various scenarios. In this paper, we comprehensively review the recent research advances in crowd counting and density estimation. First of all, we introduce the background of crowd counting and crowd density estimation. Second, the traditional crowd counting methods are summarized. Third, we focus on reviewing the crowd counting and crowd density methods based on Convolutional Neural Network (CNN) models. Next, we report and discuss the experimental results of a number of typical methods on benchmark datasets. Finally, we present the promising future directions of crowd counting and crowd density estimation.
Published: 2022

27. Chinese named entity recognition: The state of the art

Author: Fenglei Wang, Yanming Guo, Guohui Li, and Pan Liu
Subjects: Computer science, business.industry, Cognitive Neuroscience, Deep learning, Representation (systemics), Context (language use), computer.software_genre, Computer Science Applications, Focus (linguistics), Named-entity recognition, Artificial Intelligence, Artificial intelligence, Architecture, business, Encoder, computer, Natural language, Natural language processing
Abstract: Named Entity Recognition (NER), one of the most fundamental problems in natural language processing, seeks to identify the boundaries and types of entities with specific meanings in natural language text. As an important international language, Chinese is unique in many aspects, and Chinese NER (CNER) is receiving increasing attention. In this paper, we give a comprehensive survey of recent advances in CNER. We first introduce some preliminary knowledge, including the common datasets, tag schemes, evaluation metrics and difficulties of CNER. Then, we separately describe recent advances in traditional research and deep learning research of CNER, in which the CNER with deep learning is our focus. We summarize related works in a basic three-layer architecture, including character representation, context encoder, and tag decoder. Meanwhile, the attention mechanism and adversarial-transfer learning methods based on this architecture are introduced. Finally, we present the future research trends and challenges of CNER.
Published: 2022

28. Heterogeneous graph driven unsupervised domain adaptation of person re-identification

Author: Wei-Shi Zheng, Jianming Lv, Qing Li, Shaochuan Lin, and Zhenguo Yang
Subjects: Scheme (programming language), Similarity (geometry), Computer science, business.industry, Cognitive Neuroscience, Pattern recognition, Computer Science Applications, Domain (software engineering), Artificial Intelligence, Core (graph theory), Graph (abstract data type), Unsupervised learning, Affinity propagation, Artificial intelligence, business, Classifier (UML), computer, computer.programming_language
Abstract: How to incrementally optimize a pre-trained classifier in an unlabeled target domain is a core challenging problem of domain adaptation (DA) for many visual tasks, such as Person Re-identification (re-ID). Most of the existing methods optimize the model based on pseudo labels or similarity of instance pairs, but ignore the diverse manifold structures of unlabeled instances in the whole dataset. In this paper, we address the importance of such structural information in domain adaptation, and propose a Heterogeneous Graph driven Optimization scheme, namely H-GO, for structure-based unsupervised learning. In particular, H-GO builds a heterogeneous graph of unlabeled images to consider the heterogeneous properties of images from various cameras with varied visual styles. A heterogeneous affinity propagation method is further applied to explore the graph-based affinity between the instances which share similar manifold structures. Finally, a heterogeneous affinity learning procedure is used to optimize the visual models by using the graph-based affinity of instances. Comprehensive experiments are conducted on three large-scale re-ID datasets, and the results demonstrate the flexibility and the superior performance of H-GO over state-of-the-art unsupervised domain adaptation algorithms.
Published: 2022

29. A deep neural network ensemble of multimodal signals for classifying excavator operations

Author: Jin Young Kim and Sung-Bae Cho
Subjects: 0209 industrial biotechnology, Artificial neural network, Computer science, Cognitive Neuroscience, Feature vector, 02 engineering and technology, computer.software_genre, Computer Science Applications, Weighting, Excavator, 020901 industrial engineering & automation, Artificial Intelligence, Feature (computer vision), 0202 electrical engineering, electronic engineering, information engineering, Prognostics, 020201 artificial intelligence & image processing, Data mining, computer
Abstract: Prognostics and health management (PHM) aims to provide a comprehensive solution for equipment health care. Classifying the operation mode of an excavator, one of the tasks in PHM, is important for evaluating the remaining useful lifetime. Several studies have been conducted to classify the operations with either video or sensor data, but they are limited by using only one type of data. A model trained with sensor data cannot distinguish similar operations such as “digging” and “ditch digging”, whereas a model with video data is vulnerable to surrounding conditions such as weather. In this paper, to overcome these shortcomings, we propose a deep neural network ensemble called FusionNet that classifies the operations of an excavator. Two models are trained with sensor data and video frames respectively, where the feature extractors are transferred to the FusionNet. The proposed network ensemble performs a flexible and well-optimized classification by automatically calculating weights according to the extracted feature vectors and combining them. To verify the proposed model, several experiments are conducted with real-world data. The proposed model achieves an accuracy of 99.17%, which outperforms the conventional methods. We also confirm that the proposed model can address the shortcomings of using only one type of data and maximize the benefits through the automatic weighting of extracted features.
Published: 2022

30. TensorClus: A python library for tensor (Co)-clustering

Author: Rafika Boutalbi, Mohamed Nadif, Lazhar Labiod, University of Stuttgart, CB - Centre Borelli - UMR 9010 (CB), Service de Santé des Armées-Institut National de la Santé et de la Recherche Médicale (INSERM)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Ecole Normale Supérieure Paris-Saclay (ENS Paris Saclay)-Université Paris Cité (UPCité), and Boutalbi, Rafika
Subjects: [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], Computer science, Interface (Java), Cognitive Neuroscience, Tensors, Computational science, Biclustering, [STAT.ML] Statistics [stat]/Machine Learning [stat.ML], [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG], Artificial Intelligence, Tensor (intrinsic definition), Multiple Graphs, (Co)-clustering, Cluster analysis, computer.programming_language, Block (data storage), NumPy, Python (programming language), Computer Science Applications, Tensor Decomposition, Computer Science::Mathematical Software, Unsupervised learning, computer
Abstract: Tensor data analysis is the evolutionary step of data analysis to more than two dimensions. Dealing with tensor data is often based on tensor decomposition methods. The present paper focuses on unsupervised learning and provides a Python package referred to as TensorClus including novel co-clustering algorithms of three-way data. All proposed algorithms are based on the latent block models and suitable for different types of data, sparse or not. They are successfully evaluated on challenges in text mining, recommender systems, and hyperspectral image clustering. TensorClus is an open-source Python package that allows easy interaction with other Python packages such as NumPy and TensorFlow; it also offers an interface with some tensor decomposition packages, namely Tensorly and TensorD on the one hand, and on the other, the co-clustering package Coclust. Finally, it provides CPU and GPU compatibility. The TensorClus library is available at https://pypi.org/project/TensorClus/.
Published: 2022

31. Hierarchical multimodal transformer to summarize videos

Author: Maoguo Gong, Xuelong Li, and Bin Zhao
Subjects: Scheme (programming language), Closed captioning, Machine translation, Computer science, Cognitive Neuroscience, Frame (networking), computer.software_genre, Automatic summarization, Computer Science Applications, Task (project management), Recurrent neural network, Artificial Intelligence, Data mining, computer, computer.programming_language, Transformer (machine learning model)
Abstract: Although video summarization has achieved tremendous success benefiting from Recurrent Neural Networks (RNN), RNN-based methods neglect the global dependencies and multi-hop relationships among video frames, which limits the performance. The transformer is an effective model to deal with this problem, and surpasses RNN-based methods in several sequence modeling tasks, such as machine translation and video captioning. Motivated by the great success of the transformer and the natural structure of video (frame-shot-video), a hierarchical transformer is developed for video summarization, which can capture the dependencies among frames and shots, and summarize the video by exploiting the scene information formed by shots. Furthermore, we argue that both the audio and visual information are essential for the video summarization task. To integrate the two kinds of information, they are encoded in a two-stream scheme, and a multimodal fusion mechanism is developed based on the hierarchical transformer. In this paper, the proposed method is denoted as Hierarchical Multimodal Transformer (HMT). Extensive experiments show that HMT achieves (F-measure: 0.441, Kendall’s τ: 0.079, Spearman’s ρ: 0.080) and (F-measure: 0.601, Kendall’s τ: 0.096, Spearman’s ρ: 0.107) on SumMe and TVsum, respectively. It surpasses most of the traditional, RNN-based and attention-based video summarization methods.
Published: 2022

32. Co-attention network with label embedding for text classification

Author: Minqian Liu, Qing Du, Junyi Cao, and Lizhao Liu
Subjects: Focus (computing), Exploit, business.industry, Computer science, Cognitive Neuroscience, Construct (python library), Space (commercial competition), Machine learning, computer.software_genre, Computer Science Applications, ComputingMethodologies_PATTERNRECOGNITION, Discriminative model, Artificial Intelligence, Embedding, Artificial intelligence, State (computer science), Representation (mathematics), business, computer
Abstract: Most existing methods for text classification focus on extracting a highly discriminative text representation, which, however, is typically computationally inefficient. To alleviate this issue, label embedding frameworks are proposed to adopt the label-to-text attention that directly uses label information to construct the text representation for more efficient text classification. Although these label embedding methods have achieved promising results, there is still much room for exploring how to use the label information more effectively. In this paper, we seek to exploit the label information by further constructing the text-attended label representation with text-to-label attention. To this end, we propose a Co-attention Network with Label Embedding (CNLE) that jointly encodes the text and labels into their mutually attended representations. In this way, the model is able to attend to the relevant parts of both. Experiments show that our approach achieves competitive results compared with previous state-of-the-art methods on 7 multi-class classification benchmarks and 2 multi-label classification benchmarks.
Published: 2022

33. A graph convolutional topic model for short and noisy text streams

Author: Ngo Van Linh, Tran Xuan Bach, and Khoat Than
Subjects: FOS: Computer and information sciences, Topic model, Computer Science - Machine Learning, Concept drift, Computer science, Data stream mining, business.industry, Cognitive Neuroscience, Probabilistic logic, WordNet, Machine Learning (stat.ML), Noisy text, Machine learning, computer.software_genre, Machine Learning (cs.LG), Computer Science Applications, Statistics - Machine Learning, Artificial Intelligence, Graph (abstract data type), Word2vec, Artificial intelligence, business, computer
Abstract: Learning hidden topics from data streams has become absolutely necessary but poses challenging problems such as concept drift as well as short and noisy data. Using prior knowledge to enrich a topic model is one of the potential solutions to cope with these challenges. Prior knowledge that is derived from human knowledge (e.g. Wordnet) or a pre-trained model (e.g. Word2vec) is very valuable and useful to help topic models work better. However, in a streaming environment where data arrives continually and infinitely, existing studies are limited in exploiting these resources effectively. In particular, a knowledge graph, which contains meaningful word relations, is ignored. In this paper, aiming to exploit a knowledge graph effectively, we propose a novel graph convolutional topic model (GCTM) which integrates graph convolutional networks (GCN) into a topic model, and a learning method which learns the networks and the topic model simultaneously for data streams. In each minibatch, our method not only can exploit an external knowledge graph but also can balance the external and old knowledge to perform well on new data. We conduct extensive experiments to evaluate our method with both a human knowledge graph (Wordnet) and a graph built from pre-trained word embeddings (Word2vec). The experimental results show that our method achieves significantly better performance than state-of-the-art baselines in terms of probabilistic predictive measure and topic coherence. In particular, our method can work well when dealing with short texts as well as concept drift. The implementation of GCTM is available at https://github.com/bachtranxuan/GCTM.git.
Published: 2022

34. Enhancing structure modeling for relation extraction with fine-grained gating and co-attention

Author: Yongfeng Huang, Yubo Chen, and Chuhan Wu
Subjects: Structure (mathematical logic), Sequence, Dependency (UML), Artificial neural network, Exploit, Computer science, business.industry, Cognitive Neuroscience, Gating, computer.software_genre, Relationship extraction, Computer Science Applications, Artificial Intelligence, Benchmark (computing), Artificial intelligence, business, computer, Natural language processing
Abstract: Relation extraction is a critical natural language processing task. Existing dependency-based models captured long-range syntactic relations, but they usually cannot fully exploit information from sentences. They often used hand-crafted rules to prune redundant edges from dependency trees, but suffer from the imbalance of including and removing contents. When incorporating sequence models, they usually ignored the semantic and syntactic interactions between words. In this paper, we propose to automatically learn relational dependency structures with a fine-grained gating strategy. We decompose the dependency tree into differently informative parts and apply different gating methods to each part. To further capture the word-level interactions, we propose to apply the co-attention mechanism to combine structure and sequence models. We apply a neural network to learn the affinity matrix and derive mutual attention weights between semantic and syntactic representations. We conduct experiments on two benchmark datasets and the results indicate the effectiveness of our method.
Published: 2022
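
Note: The co-attention step in the abstract above (learn an affinity matrix, then derive mutual attention weights between the two representations) can be sketched as follows. The abstract says a neural network learns the affinity matrix; this sketch substitutes a fixed bilinear form `seq · W · structᵀ` as a stand-in, and the function names and dictionary keys are assumptions.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def co_attention(seq, struct, w):
    """Mutual attention between sequence (n x d) and structure (m x d) features."""
    affinity = matmul(matmul(seq, w), transpose(struct))   # n x m
    seq_to_struct = softmax_rows(affinity)                 # each word attends over structure
    struct_to_seq = softmax_rows(transpose(affinity))      # each structure node attends over words
    return {
        "affinity": affinity,
        "seq_to_struct": seq_to_struct,
        "struct_to_seq": struct_to_seq,
        "attended_struct": matmul(seq_to_struct, struct),  # n x d
        "attended_seq": matmul(struct_to_seq, seq),        # m x d
    }
```

Normalizing the same affinity matrix along rows and along columns is what makes the attention mutual: one pass yields weights in both directions.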

35. Multi-perspective social recommendation method with graph representation learning

Author: Hai Liu, Duantengchuan Li, Ke Lin, Jiazhang Wang, Zhaoli Zhang, Neal N. Xiong, Chao Zheng, and Xiaoxuan Shen
Subjects: Information retrieval, Computer science, Cognitive Neuroscience, Rationality, Recommender system, Python (programming language), Social relation, Computer Science Applications, Artificial Intelligence, Graph (abstract data type), Construct (philosophy), computer, Feature learning, computer.programming_language, Social influence
Abstract: Social recommender systems (SRS) aim to study how social relations influence users’ choices and how to use these relations to learn better user embeddings. However, the diversity of social relationships, which is instructive for the propagation of social influence, has rarely been explored. In this paper, we propose a graph convolutional network-based representation learning method, namely multi-perspective social recommendation (MPSR), to construct hierarchical user preferences and assign friends’ influences different levels of trust from varying perspectives. We further utilize the attributes of items to partition and excavate users’ explicit preferences and employ complementary perspective modeling to learn implicit preferences of users. To measure the trust degree of friends from different perspectives, the statistical information of users’ historical behavior is utilized to construct multi-perspective social networks. Experimental results on two public datasets, Yelp and Ciao, demonstrate that MPSR significantly outperforms the state-of-the-art methods. Further detailed analysis verifies the importance of mining explicit characteristics of users and the necessity of diverse social relationships, which shows the rationality and effectiveness of the proposed model. The Python source code will be available upon request.
Published: 2022

36. STCM-Net: A symmetrical one-stage network for temporal language localization in videos

Author: Minglin Dong, Jingyu Ru, Sikai Yang, Chunbo Li, Lele Xue, and Zixi Jia
Subjects: Computer science, business.industry, Cognitive Neuroscience, Process (computing), Concept mining, Semantics, computer.software_genre, Field (computer science), Computer Science Applications, Task (project management), Artificial Intelligence, Language localisation, Artificial intelligence, business, computer, Sentence, Natural language, Natural language processing
Abstract: The task of temporal language localization in videos is to locate a segment of an untrimmed video from a natural language description. Compared with the general video localization task, it is more flexible and complex: it can accurately locate various scenes described by any natural language without video labels being made in advance. It can be widely used in fields such as video retrieval and robot intelligent cognition. The main challenges of this task are the extraction of sentence semantics and the integration of contextual information in videos. Of these, contextual video integration can be optimized through the two-dimensional temporal adjacent network. Therefore, complete extraction of the potential information in the query sentence is necessary to solve the task at a finer granularity. At the same time, we found a large amount of time-related information in the query sentence, which helps improve the localization accuracy. Thus, in this paper, we first define the time concept in a sentence and then propose a Sentence Time Concept Mining Network (STCM-Net), a symmetrical one-stage network. By effectively extracting the time concept contained in the query sentence, it can optimize the process of target localization and improve the localization performance. We also evaluate the proposed STCM-Net on three challenging public benchmarks: Charades-STA, ActivityNet Captions, and TACoS. Our STCM-Net achieves encouraging improvements compared with the state-of-the-art approaches.
Published: 2022

37. Wavelet extreme learning machine and deep learning for data classification

Author: Salwa Said, Mourad Zaied, and Siwar Yahia
Subjects: 0209 industrial biotechnology, business.industry, Generalization, Computer science, Cognitive Neuroscience, Deep learning, Activation function, Data classification, PID controller, 02 engineering and technology, Machine learning, computer.software_genre, Computer Science Applications, 020901 industrial engineering & automation, Wavelet, Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, Benchmark (computing), 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, MNIST database, Extreme learning machine
Abstract: Recently, the Extreme Learning Machine (ELM) algorithm has been applied to various fields due to its rapidity and significant generalization performance. Traditionally, deep learning (DL) and wavelet neural networks (WNN) methods reach a high classification accuracy in machine learning applications. As a result, a new structure based on WNN, deep architecture and ELM is proposed in this paper. The proposed method is based on Extreme Learning Machine Auto-Encoder with DL structure and a composite wavelet activation function used in the hidden nodes. To evaluate the performance of our approach, we used standard benchmark data-sets, namely COIL-20, Pima Indian Diabetes (PID), MNIST and EMNIST. Experimental results show that our method offers satisfactory results and performance compared to other approaches.
Published: 2022

38. A generic framework for deep incremental cancelable template generation

Author: Aditya Nigam, Chirag Vashist, Avantika Singh, and Pratyush Gaurav
Subjects: Biometrics, business.industry, Computer science, Cognitive Neuroscience, Deep learning, Data_MISCELLANEOUS, Hash function, Usability, computer.software_genre, Convolutional neural network, Computer Science Applications, Discriminative model, Artificial Intelligence, Key (cryptography), Data mining, Artificial intelligence, business, computer, Subspace topology
Abstract: In a post-COVID-19 world, extensive study of deep learning-based biometric authentication techniques prompts the need to secure them. Further, biometric data is assumed to be largely immutable; thus, if it is compromised, it is lost forever. Hence, reliable and secure biometric authentication is of utmost importance. In this paper, we address the security and privacy concerns of biometric templates generated via deep networks. We propose a cancelable biometric authentication approach. The framework consists of a lightweight Convolutional Neural Network (CNN) with few-shot enrollment for generating biometric templates. Further, to enhance the biometric templates’ discriminative power and to provide revocability, the templates are projected onto a random subspace (based on the user-specific key). The projected biometric templates are then mapped onto robust n-bit unique codes (using a KNN classifier) and protected via a SHA-3 hash digest. Moreover, a real-world biometric authentication system is always dynamic (users keep on changing). Thus, we have also integrated phase-wise incremental learning within a deep learning-based cancelable biometric authentication framework. To the best of our knowledge, this is the first work in which deep cancelable templates are generated incrementally. We analyze the proposed scheme for its performance and privacy preservation on three benchmark constrained iris data-sets, one unconstrained iris data-set, and one publicly available knuckle data-set. Furthermore, it has been demonstrated that the proposed cancelable incremental framework strictly follows the four fundamental properties of cancelability, viz. non-invertibility, unlinkability, revocability, and usability.
Published: 2022
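
The pipeline described in record 38 (key-dependent random projection, mapping to an n-bit code, SHA-3 protection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cancelable_template`, the 16-bit code length, and the toy feature vector are hypothetical, and the n-bit code is produced here by simple sign binarization rather than the KNN classifier named in the abstract. Only Python's standard `random` and `hashlib` modules are used.

```python
import hashlib
import random


def cancelable_template(features, user_key, n_bits=16):
    """Project features onto a key-seeded random subspace, binarize, and hash.

    Assumptions (not from the abstract): Gaussian projection weights,
    sign binarization instead of a KNN classifier, and n_bits=16.
    """
    rng = random.Random(user_key)  # the user-specific key seeds the projection
    bits = []
    for _ in range(n_bits):
        # One random direction per output bit, same length as the feature vector.
        row = [rng.gauss(0.0, 1.0) for _ in features]
        projection = sum(w * x for w, x in zip(row, features))
        bits.append("1" if projection >= 0.0 else "0")
    code = "".join(bits)
    # Only the SHA-3 digest of the code would be stored, never the code itself.
    return hashlib.sha3_256(code.encode("ascii")).hexdigest()


# Toy feature vector standing in for a CNN embedding (hypothetical values).
embedding = [0.12, -0.87, 0.45, 0.33, -0.05, 0.91]

enrolled = cancelable_template(embedding, user_key="key-A")
probe = cancelable_template(embedding, user_key="key-A")
print(enrolled == probe)  # the same features and key reproduce the same digest
```

Because the projection depends on the key, issuing a new key is expected to yield a different template for the same biometric, which is the revocability property the abstract refers to.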

39. Hierarchical gate network for fine-grained visual recognition

Author: Mingli Song, Ying Chen, and Jie Song
Subjects: Structure (mathematical logic), Interconnection, Network architecture, Hierarchy (mathematics), Exploit, Computer science, Cognitive Neuroscience, Context (language use), computer.software_genre, Computer Science Applications, Artificial Intelligence, Benchmark (computing), Data mining, Architecture, computer
Abstract: The visual classification has achieved unprecedented progress in the last decade, and miscellaneous network architectures have emerged. However, these models yield inferior performance when deployed in fine-grained classification problems, as they are usually devised by enlarging the model capacity or facilitating the optimization, and few concentrate on the problem itself. In this paper, we argue that in most fine-grained classification problems, concepts are intrinsically hierarchically structured rather than evenly distributed, and thus classifying all concepts within a single layer simultaneously deteriorates the discrimination among different categories. Furthermore, the category hierarchy is usually not provided, which fails some existing methods where the human-defined hierarchy is required. In order to tackle these challenges, we propose a new architecture, referred to as Hierarchical Gate Network (HGNet), to exploit the interconnection among hierarchical categories. HGNet adopts an LSTM-like mechanism to transmit dependencies among classes of different levels in the hierarchy. In such a way, the context information in the hierarchical structure is utilized to boost the recognition performance. Experiments conducted on various benchmark datasets, including CUB-200-2011, Stanford Dogs, NABirds, Aircraft, iNaturalist, DeepFashion and DeepFashion2, demonstrate the superiority of the proposed method to the state-of-the-art algorithms.
Published: 2022

40. Distributed Bayesian optimisation framework for deep neuroevolution

Author: Animesh Tiwari and Rohitash Chandra
Subjects: Neuroevolution, Artificial neural network, Computer science, business.industry, Cognitive Neuroscience, Deep learning, Computer Science::Neural and Evolutionary Computation, Bayesian probability, Machine learning, computer.software_genre, Convolutional neural network, Computer Science Applications, Surrogate model, Artificial Intelligence, Feature (machine learning), Reinforcement learning, Artificial intelligence, business, computer
Abstract: Neuroevolution is a machine learning method for evolving neural network parameters and topology with a high degree of flexibility that makes it applicable to a wide range of architectures. Neuroevolution has been popular in reinforcement learning and has also been shown to be promising for deep learning. The major feature of Bayesian optimisation is in reducing computational load by approximating the actual model with an acquisition function (surrogate model) that is computationally cheaper. A major limitation of neuroevolution is the high computational time required for convergence since learning (evolution) typically does not utilize gradient information. Bayesian optimisation, which is also known as surrogate-assisted optimisation, has been popular for expensive engineering optimisation problems and hyper-parameter tuning in machine learning. It has potential for training deep learning models via neuroevolution given large datasets and complex models. Recent advances in parallel and distributed computing have enabled efficient implementation of neuroevolution for complex and computationally expensive neural models. In this paper, we present a Bayesian optimisation framework for deep neuroevolution using a distributed architecture to provide computational efficiency in training. Our results are promising for models ranging from simple to deep neural networks, such as convolutional neural networks, which motivates further applications.
Published: 2022

41. Hybrid interpretable predictive machine learning model for air pollution prediction

Author: Yuanlin Gu, Baihua Li, and Qinggang Meng
Subjects: Structure (mathematical logic), Artificial neural network, business.industry, Computer science, Cognitive Neuroscience, Air pollution, Feature selection, Machine learning, computer.software_genre, medicine.disease_cause, Nonlinear auto regressive moving average, Computer Science Applications, Correlation, Artificial Intelligence, medicine, Artificial intelligence, Time series, business, computer, Interpretability
Abstract: Air pollution prediction is a burning issue, as pollutants can harm human health. Traditional machine learning models usually aim to improve the overall prediction accuracy but neglect the accuracy for peak values. Moreover, these models are not interpretable. They fail to explain the interactions between various determining factors and their impacts on air pollution. In this paper, we propose a new Hybrid Interpretable Predictive Machine Learning model for Particulate Matter 2.5 (PM2.5) prediction, which carries two novelties. First, a hybrid model structure is constructed with a deep neural network and a Nonlinear Auto Regressive Moving Average with Exogenous Input model. Second, automatic feature generation and feature selection procedures are integrated into this hybrid model. The experimental results demonstrate the superiority of our model over other models in prediction accuracy for peak values and model interpretability. The proposed model reveals how the PM2.5 prediction is estimated from historical PM2.5, weather, and season. The accuracies (measured by correlation coefficients) of 1, 3 and 6-hour-ahead prediction are 0.9870, 0.9332 and 0.8587, respectively. More importantly, the proposed approach presents a new interpretable machine learning framework for time series data, making it possible to explain the complex dependence of multimode inputs and to build reliable predictive models.
Published: 2022

42. An introduction to Deep Learning in Natural Language Processing: Models, techniques, and tools

Author: Ivano Lauriola, Fabio Aiolli, and Alberto Lavelli
Subjects: Transformer, Focus (computing), business.industry, Computer science, Cognitive Neuroscience, Deep learning, Human language, computer.software_genre, ComputingMethodologies_ARTIFICIALINTELLIGENCE, Computer Science Applications, Deep Learning, ComputingMethodologies_PATTERNRECOGNITION, Software, Artificial Intelligence, Language Models, Natural Language Processing, Artificial intelligence, business, computer, Natural language processing
Abstract: Natural Language Processing (NLP) is a branch of artificial intelligence that involves the design and implementation of systems and algorithms able to interact through human language. Thanks to the recent advances of deep learning, NLP applications have received an unprecedented boost in performance. In this paper, we present a survey of the application of deep learning techniques in NLP, with a focus on the various tasks where deep learning is demonstrating stronger impact. Additionally, we explore, describe, and revise the main resources in NLP research, including software, hardware, and popular corpora. Finally, we emphasize the main limits of deep learning in NLP and current research directions.
Published: 2022

43. KAICD: A knowledge attention-based deep learning framework for automatic ICD coding

Author: Min Zeng, Ying Yu, Yifan Wu, Zhihui Fei, Min Li, and Fang-Xiang Wu
Subjects: 0209 industrial biotechnology, Computer science, business.industry, Cognitive Neuroscience, Deep learning, Feature extraction, 02 engineering and technology, computer.software_genre, Convolutional neural network, Computer Science Applications, 020901 industrial engineering & automation, Knowledge base, Artificial Intelligence, Intensive care, Health care, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, Natural language processing, Coding (social sciences)
Abstract: Automatic International Classification of Diseases (ICD) coding is an important task in the future of artificial intelligence healthcare. In recent years, a lot of traditional machine learning-based methods have been proposed, and they achieved good results on this task. However, these traditional machine learning-based methods for automatic ICD coding only focus on the semantic features of clinical notes and ignore the feature extraction of ICD titles that are the descriptions of ICD codes. In this paper, we propose a knowledge attention-based deep learning framework called KAICD for automatic ICD coding. KAICD makes full use of the clinic notes and the ICD titles. The semantic features of clinic notes are extracted by a multi-scale convolutional neural network. For ICD titles, we use attention-based Bidirectional Gated Recurrent Unit (Bi-GRU) to build a knowledge database, which can offer additional information. Depending on input clinic notes, we can use the attention mechanism to obtain different knowledge vectors from the knowledge database where some ICD titles are more relevant to the input clinic notes. Last, we concatenate the knowledge vectors and the semantic features of clinic notes, and use them for the final prediction. KAICD is tested on a public dataset Medical Information Mart for Intensive Care III (MIMIC III); it achieves micro-precision of 0.502, micro-recall of 0.428, and micro-f1 of 0.462, which outperforms other competing methods. Furthermore, the results of the ablation study show that the knowledge database of ICD titles learned by the attention-based Bi-GRU enhances the feature expression and improves the prediction performance.
Published: 2022
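
As a quick consistency check on the metrics quoted in record 43, micro-F1 is the harmonic mean of micro-precision and micro-recall. The sketch below recomputes it from the two values given in the abstract; the function name `micro_f1` is an illustrative choice and does not come from the paper.

```python
def micro_f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the KAICD abstract.
reported_precision = 0.502
reported_recall = 0.428

f1 = micro_f1(reported_precision, reported_recall)
print(round(f1, 3))  # 0.462, matching the reported micro-F1
```

The recomputed value agrees with the reported micro-F1 of 0.462 to three decimal places.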

44. Joint network embedding of network structure and node attributes via deep autoencoder

Author: Zhisong Pan, Junyang Qiu, Shuaihui Wang, Junhua Zou, Guyu Hu, and Yu Pan
Subjects: Structure (mathematical logic), Similarity (geometry), Computer science, Cognitive Neuroscience, Node (networking), Topology (electrical circuits), Construct (python library), computer.software_genre, Autoencoder, Computer Science Applications, Artificial Intelligence, Pairwise comparison, Data mining, Representation (mathematics), computer
Abstract: Network embedding aims to learn a low-dimensional vector for each node in a network, which is effective in a variety of applications such as network reconstruction and community detection. However, the majority of the existing network embedding methods merely exploit the network structure and ignore the rich node attributes, which tends to generate sub-optimal network representations. To learn better network representations, diverse information of networks should be exploited. In this paper, we develop a novel deep autoencoder framework, named FSADA, to fuse topological structure and node attributes. We first design a multi-layer autoencoder which consists of multiple non-linear functions to capture and preserve the highly non-linear network structure and node attribute information. In particular, we adopt a pre-processing procedure on the original information, which makes it easier to extract the intrinsic correlations between topological structure and node attributes. In addition, we design an enhancement module that combines topology and node attribute similarity to construct pairwise constraints on nodes, and then a graph regularization is introduced into the framework to enhance the representation in the latent space. Our extensive experimental evaluations demonstrate the superior performance of the proposed method.
Published: 2022

45. Multimodal sentiment analysis with unidirectional modality translation

Author: Bo Yang, Xiaola Lin, Bo Shao, and Lijun Wu
Subjects: Focus (computing), Modalities, Modality (human–computer interaction), Computer science, business.industry, Cognitive Neuroscience, media_common.quotation_subject, Sentiment analysis, computer.software_genre, Computer Science Applications, Artificial Intelligence, Benchmark (computing), Quality (business), Artificial intelligence, business, computer, Encoder, Natural language processing, media_common, Transformer (machine learning model)
Abstract: Multimodal Sentiment Analysis (MSA) is a challenging research area that investigates sentiment expressed from multiple heterogeneous sources of information. To integrate multimodal information including text, visual and audio modalities, state-of-the-art models focus on developing various fusion strategies, such as attention and outer product. However, the inferior quality of visual and audio features that is commonly observed in this area has not aroused much attention. We argue that this issue will obstruct the performance of the fusion strategies to a considerable extent. Therefore, in this paper, we propose Multimodal Translation for Sentiment Analysis (MTSA), a multimodal framework that improves the quality of visual and audio features by translating them to text features extracted by Bidirectional Encoder Representations from Transformers (BERT). Experiments on two benchmark datasets CMU-MOSI and CMU-MOSEI show that our model performs better than the state-of-the-art methods on both datasets across all the metrics, which illustrates the effectiveness of our method.
Published: 2022

46. Cooperative density-aware representation learning for few-shot visual recognition

Author: Huiqun Yu, Mengqi Gao, Xiang Feng, and Zijun Zheng
Subjects: Computer science, business.industry, Cognitive Neuroscience, Deep learning, Mutual information, Machine learning, computer.software_genre, Convolutional neural network, Computer Science Applications, Discriminative model, Artificial Intelligence, Feature (machine learning), Artificial intelligence, Representation (mathematics), Set (psychology), business, computer, Feature learning
Abstract: Few-shot visual recognition has achieved remarkable advances along with the rise of deep learning. Its goal is to learn model parameters from the base categories and transfer them to novel categories with limited annotations. However, most of the existing few-shot visual recognition approaches mainly focus on extracting a global feature representation of the sample, which fails to encode the semantic information. To alleviate this issue, this paper presents a novel cooperative density-aware representation learning approach for few-shot visual recognition. Specifically, we first obtain the high-level semantic features of the query set and the support set by leveraging a shared convolutional neural network. A cooperative density loss module is then designed to optimize the model to form discriminative features by incorporating the density global classification loss and the density few-shot loss. The density few-shot loss conducts the semantic alignment with regional features through mutual information, while the density global classification loss supervises each regional feature, leading to more precise classification. Comprehensive experiments on few-shot visual recognition benchmarks validate the effectiveness and superiority of our proposed approach, and elaborate ablations explain the utility of different modules.
Published: 2022

47. Fuzzy entity alignment via knowledge embedding with awareness of uncertainty measure

Author: Xinyang Deng, Liu Yuanna, and Wen Jiang
Subjects: Structure (mathematical logic), Similarity (geometry), Computer science, Cognitive Neuroscience, media_common.quotation_subject, Continuous embedding, Ambiguity, computer.software_genre, Fuzzy logic, Computer Science Applications, Set (abstract data type), Artificial Intelligence, Metric (mathematics), Embedding, Data mining, computer, media_common
Abstract: Entity alignment refers to associating entities in different knowledge graphs if they are semantically identical. Embedding-based entity alignment approaches encode entities in a continuous embedding space where entities are aligned based on the similarity of learned embeddings. However, relying on a single alignment metric causes ambiguity and uncertainty in entity alignment. In this paper, a fuzzy entity alignment method, FuzzyEA, is proposed to model the uncertainty in the alignment process based on the intuitionistic fuzzy set (IFS). An iterative TransE model is designed to learn the relational structure of knowledge graphs, where a mutual selection and error correction mechanism is proposed to enhance the effect of iteration. The alignment results obtained by name/description embedding and structure embedding are fused based on Dempster’s combination rule. Experiments on three benchmark datasets demonstrate that the proposed FuzzyEA consistently outperforms other entity alignment methods and contributes to promising improvement in alignment accuracy and discrimination ability.
Published: 2022
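
Record 47 fuses two sources of alignment evidence with Dempster’s combination rule. The sketch below shows the textbook form of that rule on a two-hypothesis frame; it is not the FuzzyEA implementation, and the hypothesis labels, the mass values, and the function name `dempster_combine` are hypothetical choices made for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        intersection = b & c
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c  # mass falling on contradictory evidence
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    # Renormalize by the non-conflicting mass.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


MATCH = frozenset({"match"})
NO_MATCH = frozenset({"no_match"})
EITHER = MATCH | NO_MATCH  # ignorance: mass not committed to either hypothesis

# Hypothetical evidence from a name/description view and a structure view.
name_view = {MATCH: 0.6, NO_MATCH: 0.1, EITHER: 0.3}
structure_view = {MATCH: 0.5, NO_MATCH: 0.2, EITHER: 0.3}

fused = dempster_combine(name_view, structure_view)
for focal, mass in fused.items():
    print(sorted(focal), round(mass, 4))
```

With these toy masses the two views agree more often than they conflict, so the fused belief in a match ends up higher than either view assigns on its own.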

48. Domain adaptive twin support vector machine learning using privileged information

Author: Yanmeng Li, Wenzhu Yan, and Huaijiang Sun
Subjects: Computational complexity theory, Computer science, business.industry, Cognitive Neuroscience, Quadratic function, Machine learning, computer.software_genre, Computer Science Applications, Domain (software engineering), Support vector machine, Hyperplane, Discriminant, Artificial Intelligence, Classifier (linguistics), Quadratic programming, Artificial intelligence, business, computer
Abstract: In the fields of computer vision and machine learning, domain adaptation has been extensively studied. The main challenge is how to transform the existing classifier(s) into an effective adaptive classifier that exploits the latent information in a new data source, which typically has a different distribution from the original data source. The Adaptive Support Vector Machines (A-SVM) approach has been proposed as an effective strategy for the domain adaptation problem. However, the resulting optimization task, which minimizes a convex quadratic function in A-SVM, cannot effectively minimize the distance between the source and target domains and typically has high computational complexity. To handle these problems, in this paper we extend A-SVM by determining a pair of nonparallel up- and down-bound functions solved by two smaller-sized quadratic programming problems (QPPs) to achieve a faster learning speed. Notably, our method yields two nonparallel separating hyperplanes to exploit the latent discriminant information based on the SVM classification mechanism, which can naturally enhance the classification performance. This method is named Adaptive Twin Support Vector Machine Learning (A-TSVM). Moreover, we consider a high-level learning paradigm, learning using privileged information (LUPI), to learn an induced model that further constrains the solution in the target space. The learned model is named domain Adaptive Twin Support Vector Machine Learning Using Privileged Information (A-TSVM+). Finally, a series of comparative experiments with many other methods are performed on three datasets. The experimental results indicate that the proposed method can not only greatly improve the accuracy of classification, but also save computing time.
Published: 2022

49. QoS prediction for smart service management and recommendation based on the location of mobile users

Author: Lei-Lei Shi, John Panneerselvam, Liang Jiang, Rongbo Zhu, and Lu Liu
Subjects: Service (business), Computer science, Cognitive Neuroscience, Quality of service, media_common.quotation_subject, Service management, Recommender system, computer.software_genre, Computer Science Applications, World Wide Web, Artificial Intelligence, Quality (business), Web service, computer, 5G, Mobile service, media_common
Abstract: Quality of Service (QoS) directly reflects the degree to which services offered by providers satisfy the non-functional requirements of users. QoS information is not usually available a priori to providers when recommending services for user queries, which creates uncertainty in offering the right services to the right queries. Recent research in service recommendation and management mainly addresses the issues of sparse data prediction and user personalized recommendation. Recommendation systems require smart strategies for recommending and managing services in accordance with the user queries. Predicting the QoS requirements of user queries before recommending the services can potentially aid in offering the most suitable services to users. This paper proposes a hybrid mobile service recommendation and management model based on semantic recommendation along with location-based quality preference analysis for emerging 5G mobile networks. The proposed model can effectively predict the QoS by exploiting previously invoked services to identify the best matching mobile services based on the similarity between users and services. Performance evaluation based on a published web services dataset demonstrates an enhanced prediction accuracy with an effective reduction in time overheads when compared to other related methods.
Published: 2022

50. Dynamic video mix-up for cross-domain action recognition

Author: Chunfeng Song, Han Wu, Yanyang Liu, Shaolong Yue, Jun Xiao, and Zhenyu Wang
Subjects: Similarity (geometry), Computer science, Generalization, business.industry, Cognitive Neuroscience, Machine learning, computer.software_genre, Sensor fusion, Class (biology), Computer Science Applications, Domain (software engineering), Action (philosophy), Artificial Intelligence, Action recognition, Learning methods, Artificial intelligence, business, computer
Abstract: In recent years, action recognition has been extensively studied. For some general action datasets, such as UCF101 [1], the recognition accuracy in a specific domain can reach 95%. However, due to the existence of the domain-wise discrepancy, the performance of the model will be significantly reduced when deployed to realistic scenes. Therefore, to support the generalization of the action recognition model in practical scenes, the cross-domain problem should be addressed urgently. In this paper, we propose a cross-domain video data fusion mechanism to reduce the difference between domains. Our method differs from existing methods in two respects: (1) Instead of performing mix-up at the feature level, we propose to execute the mix-up directly at the input level, which introduces more original information beyond the middle features. In addition, a progressive learning method is introduced for adaptive cross-domain fusion. (2) To make full use of the action class knowledge from the source domain, we also propose pseudo-label guided mix-up data learning. Note that only top-ranking confident pseudo labels are selected to ensure stable similarity between the source and target domains. We evaluate the proposed method on two widely used cross-domain datasets, UCF101-HMDB51full and UCF-Olympic. Extensive experimental results show that the proposed method is effective and achieves state-of-the-art performance. In the HMDB51 (source domain) → UCF101 (target domain) direction, the accuracy of our method reaches 98.60%, which is a 9.54% improvement over the existing state-of-the-art method.
Published: 2022
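
Record 50 performs mix-up at the input level rather than the feature level. The sketch below shows the generic mix-up operation on raw clip values, a convex combination of a source clip and a target clip. It is not the paper's dynamic or progressive scheme, whose mixing schedule the abstract does not specify; the function name `mix_up`, the fixed ratio `lam`, and the toy clips are assumptions for illustration.

```python
def mix_up(source_clip, target_clip, lam):
    """Blend two clips frame by frame: lam * source + (1 - lam) * target."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(source_clip) != len(target_clip):
        raise ValueError("clips must have the same number of frames")
    return [
        [lam * s + (1.0 - lam) * t for s, t in zip(src_frame, tgt_frame)]
        for src_frame, tgt_frame in zip(source_clip, target_clip)
    ]


# Two toy "clips": 2 frames each, every frame flattened to 3 pixel intensities.
source = [[0.0, 0.5, 1.0], [1.0, 1.0, 1.0]]
target = [[1.0, 0.5, 0.0], [0.0, 0.0, 0.0]]

mixed = mix_up(source, target, lam=0.25)
print(mixed)  # [[0.75, 0.5, 0.25], [0.25, 0.25, 0.25]]
```

Mixing at the input level keeps the raw pixel content of both domains in the blended sample, which is the motivation the abstract gives for preferring it over feature-level mixing.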
