Search Results
2. Advances in computational intelligence: Selected and improved papers of the 12th International Work-Conference on Artificial Neural Networks (IWANN 2013).
- Author
-
Atencia, Miguel, Sandoval, Francisco, and Prieto, Alberto
- Subjects
-
*COMPUTATIONAL intelligence, *CONFERENCES & conventions, *ARTIFICIAL neural networks, *NEURAL computers, *COMPUTER software
- Published
- 2015
3. A Comprehensive survey on ear recognition: Databases, approaches, comparative analysis, and open challenges.
- Author
-
Benzaoui, Amir, Khaldi, Yacine, Bouaouina, Rafik, Amrouni, Nadia, Alshazly, Hammam, and Ouahabi, Abdeldjalil
- Subjects
-
*ARTIFICIAL neural networks, *EAR, *FEATURE extraction, *DEEP learning, *COMPARATIVE studies
- Abstract
Automatic identity recognition from ear images is an active research topic in the biometric community. The ability to secretly acquire images of the ear remotely and the stability of the ear shape over time make this technology a promising alternative for surveillance, authentication, and forensic applications. In recent years, significant research has been conducted in this area. Nevertheless, challenges remain that limit the commercial use of this technology. Several phases of the ear recognition system have been studied in the literature, from ear detection, normalization, and feature extraction to classification. This paper reviews the most recent methods used to describe and classify biometric features of the ear. We propose a first taxonomy to group existing approaches to ear recognition, including 2D, 3D, and combined 2D and 3D methods, as well as an overview of historical advances in this field. It is well known that data and algorithms are the essential components in biometrics, particularly in ear recognition. However, early ear recognition datasets were very limited and collected in laboratories under controlled conditions. With the wider use of deep neural networks, a considerable amount of training data has become necessary if acceptable ear recognition performance is to be achieved. As a consequence, current ear recognition datasets have increased significantly in size. This paper gives an overview of the chronological evolution of ear recognition datasets and compares the performance of conventional vs. deep learning methods on several datasets. We propose a second taxonomy to classify the existing databases, including 2D, 3D, and video ear datasets. Finally, some open challenges and trends are discussed for future research. [ABSTRACT FROM AUTHOR]
- Published
- 2023
4. Improving proximal policy optimization with alpha divergence.
- Author
-
Xu, Haotian, Yan, Zheng, Xuan, Junyu, Zhang, Guangquan, and Lu, Jie
- Subjects
-
*ARTIFICIAL neural networks, *REINFORCEMENT learning
- Abstract
• A linearly combined form of the objective is reformulated to control the trade-off between the return and the divergence more effectively.
• An improved proximal policy optimization method (i.e., alphaPPO) is proposed, with a more elaborate alpha divergence for two adjacent policies.
• The effectiveness of our alphaPPO is validated using detailed experimental comparison and analysis for six benchmark environments.
Proximal policy optimization (PPO) is a recent advancement in reinforcement learning, which is formulated as an unconstrained optimization problem including two terms: cumulative discounted return and Kullback–Leibler (KL) divergence. Currently, there are three PPO versions: primary, adaptive, and clipping. The most widely used PPO algorithm is the clipping version, in which the KL divergence is replaced by a clipping function to measure the difference between two policies indirectly. In this paper, we revisit the primary PPO and improve it in two aspects. One is to reformulate it as a linearly combined form to control the trade-off between the two terms. The other is to substitute a parametric alpha divergence for the KL divergence to measure the difference between two policies more effectively. This novel PPO variant is referred to as alphaPPO in this paper. Experiments on six benchmark environments verify the effectiveness of our alphaPPO, compared with clipping and combined PPOs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
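Note on result 4: the abstract contrasts a KL-penalized ("primary") PPO objective with the clipping version. The sketch below is only a minimal illustration of those two standard forms, not the authors' alphaPPO; the function names, the penalty weight `beta`, and the clip range `eps` are assumptions chosen for the example, and the alpha divergence itself is not implemented because the abstract does not give its definition.

```python
import math


def clipped_surrogate(ratios, advantages, eps=0.2):
    """Mean clipped surrogate objective (to be maximized).

    ratios: probability ratios pi_new(a|s) / pi_old(a|s), one per sample.
    advantages: advantage estimates, one per sample.
    eps: clip range (hypothetical default).
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped_r = max(1.0 - eps, min(r, 1.0 + eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same actions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def kl_penalized_objective(ratios, advantages, p_old, p_new, beta=1.0):
    """Surrogate return minus a weighted KL penalty (to be maximized)."""
    surrogate = sum(r * a for r, a in zip(ratios, advantages)) / len(ratios)
    return surrogate - beta * kl_divergence(p_old, p_new)
```

In this sketch, swapping `kl_divergence` for another divergence measure is the only change needed to experiment with a different penalty term.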
5. Formal convergence analysis on deterministic [formula omitted]-regularization based mini-batch learning for RBF networks.
- Author
-
Liu, Zhaofeng, Leung, Chi-Sing, and So, Hing Cheung
- Subjects
-
*ITERATIVE learning control, *ARTIFICIAL neural networks, *RADIAL basis functions, *NONLINEAR regression, *SMOOTHNESS of functions, *DETERMINISTIC algorithms
- Abstract
Conventional convergence analysis on mini-batch learning is usually based on the stochastic gradient concept, in which we assume that the training data are presented in a random order. Also, some convergence results require that the learning rate should decrease with the number of training cycles, and that the objective function is a smooth function. Practically speaking, a deterministic presentation scheme with a fixed learning rate is preferable. Hence, there is a gap between theoretical results and actual implementation. This paper aims at filling the gap. We use the radial basis function (RBF) model for nonlinear regression problems as an example to analyze the convergence properties of mini-batch learning. This paper considers a nonsmooth objective function, which consists of three terms. The coexistence of these three terms is able to handle a number of situations. The first term is a conventional training set error. The second term is a quadratic term which is used to suppress the effect of imperfections in the implementation. The last term is an ℓ1-norm term which is used to select important RBF nodes for the resultant network. Note that the ℓ1-norm term is a nonsmooth function. Although a nonsmooth ℓ1-norm is included and the mini-batch algorithm is deterministic, we are still able to derive the convergence properties, including the sufficient conditions for convergence and the range of the learning rate. With our results, we have a better theoretical understanding of the behaviour of mini-batch learning and obtain some guidelines on choosing the learning rate. The analysis results can be extended to other flat structural neural network models and other objective functions that include quadratic and ℓ1-norm terms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
6. FuzzyGAN: Fuzzy generative adversarial networks for regression tasks.
- Author
-
Nguyen, Ryan, Singh, Shubhendu Kumar, and Rai, Rahul
- Subjects
-
*GENERATIVE adversarial networks, *ARTIFICIAL neural networks, *CONVOLUTIONAL neural networks, *DIFFERENTIABLE dynamical systems, *FUZZY logic, *FUZZY systems
- Abstract
Generative Adversarial Networks (GANs) are well-known tools for data generation and semi-supervised classification. GANs, with less labeled data, outperform Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) in classification tasks. The success of GANs in classification tasks motivates the development of GAN-based techniques for semi-supervised regression tasks. However, developing GANs for regression introduces two major challenges: (1) inherent instability in the GAN formulation and (2) performing regression and achieving stability simultaneously. This paper introduces techniques that show improvement in the GANs' regression capability. We embed a differentiable fuzzy logic system at multiple locations in a GAN. The fuzzy logic takes the output of either the generator or the discriminator to predict the output, y, and evaluate the generator's performance. We outline the results of applying the fuzzy logic system across multiple GANs and summarize each approach's efficacy. This paper shows that adding a fuzzy logic layer can enhance a GAN's ability to perform regression; the most desirable injection location is problem-specific, and we show this through experiments over various datasets. In addition, we demonstrate empirically that the fuzzy-infused GANs are competitive with the DNNs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
7. TJU-DNN: A trajectory-unified framework for training deep neural networks and its applications.
- Author
-
Lv, Xian-Long, Chiang, Hsiao-Dong, Wang, Bin, and Zhang, Yong-Feng
- Subjects
-
*ARTIFICIAL neural networks, *ELECTRIC lines
- Abstract
The training method for deep neural networks mainly adopts the gradient descent (GD) method. These methods, however, are very sensitive to initialization and hyperparameters. In this paper, an enhanced gradient descent method guided by the trajectory-based method for training deep neural networks, termed the Trajectory Unified Framework (TJU) method, is presented. From a theoretical viewpoint, the robustness of the TJU-based method is supported by an analytical basis presented in the paper. From a computational viewpoint, a TJU methodology consisting of a Block-Diagonal-Pseudo-Transient-Continuation method and a gradient descent method, termed the TJU-GD method, for training deep neural networks is added to obtain high-quality results. Furthermore, to resolve the issue of imbalanced classification, a TJU-Focal-GD method is developed and evaluated. Experimental numerical evaluation of the proposed TJU-GD on various public datasets reveals that the proposed method can achieve great improvements over baseline methods. Specifically, the proposed TJU-Focal-GD also possesses several advantages over other methods for a class of imbalanced datasets from the homemade power line inspection dataset (PLID). [ABSTRACT FROM AUTHOR]
- Published
- 2023
8. Classification of natural images inspired by the human visual system.
- Author
-
Davoodi, Paria, Ezoji, Mehdi, and Sadeghnejad, Naser
- Subjects
-
*ARTIFICIAL neural networks, *VISUAL perception, *FILTER banks, *RETINA, *VISUAL cortex, *CONVOLUTIONAL neural networks, *INFORMATION modeling
- Abstract
In this paper, a three-step model based on the integration of Deep Neural Networks (DNN) and Decision Models is introduced for image classification which is inspired by the human visual system. To make a decision about an object, many actions should be done in a hierarchical process in the brain. First, the retina receives visual stimuli and transfers them to the visual cortex in the brain. The information extracted in the visual cortex is accumulated over time to select an appropriate response. Many of the current decision-making models do not show how each image is converted into useful information for the decision model. Some models have used neural networks to convert each image into the information needed in the decision-making model; however, the role of the retina is ignored among these models. In this paper, a combination of retina-inspired filters, CNN-based description and accumulator-based decision model is used to classify images. This model's structure resembles the human brain due to the usage of the DoG filter bank as a retina-inspired filter in its first stage. This model shows a significant improvement in accuracy in comparison to other models; furthermore, its performance is acceptable even with a small training set. [ABSTRACT FROM AUTHOR]
- Published
- 2023
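Note on result 8: the abstract names a DoG (difference-of-Gaussians) filter bank as the retina-inspired first stage. As a rough sketch of what such a bank can look like, the code below builds center-surround kernels by subtracting a wide normalized Gaussian from a narrow one. The kernel size, the sigma pairs, and all function names are illustrative assumptions; the paper's actual filter parameters are not given in the abstract.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1 (size must be odd)."""
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


def dog_kernel(size, sigma_center, sigma_surround):
    """Difference of Gaussians: narrow 'center' minus wide 'surround'."""
    center = gaussian_kernel(size, sigma_center)
    surround = gaussian_kernel(size, sigma_surround)
    return [
        [c - s for c, s in zip(center_row, surround_row)]
        for center_row, surround_row in zip(center, surround)
    ]


def dog_bank(size, sigma_pairs):
    """Build one DoG kernel per (sigma_center, sigma_surround) pair."""
    return [dog_kernel(size, sc, ss) for sc, ss in sigma_pairs]


# Hypothetical bank with three scales.
bank = dog_bank(7, [(0.5, 1.0), (1.0, 2.0), (1.5, 3.0)])
```

Because both Gaussians are normalized, each kernel sums to approximately zero, so it responds to local contrast rather than to uniform brightness.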
9. A new hybrid optimizer for stochastic optimization acceleration of deep neural networks: Dynamical system perspective.
- Author
-
Xie, Wenjing, Tang, Weishan, and Kuang, Yujia
- Subjects
-
*ARTIFICIAL neural networks, *DYNAMICAL systems, *HYBRID systems, *SYSTEMS theory
- Abstract
Stochastic optimization acceleration is extremely significant and challenging for deep neural networks (DNNs). In recent years, several novel proportional-integral–differential-based (PID-based) optimizers have been proposed to speed up the optimization by alleviating the oscillation behavior of stochastic gradient descent with momentum (SGD-M), yet lacked theoretical analysis. Along this line of research, this paper adopts dynamical system theory to design a new hybrid optimizer and present theoretical analysis. Firstly, it is found that DNN optimization is equivalent to a discrete time dynamical system. Building upon the equivalence, high order augmented dynamical system viewpoint is utilized to design a PI-like optimizer for ensuring high accuracy, which is more stable than SGD-M. Then, hybrid dynamical system viewpoint is employed to improve the PI-like optimizer as a new hybrid form for suppressing oscillation and accelerating optimization. Lyapunov method, Taylor series, matrix theory and equilibrium are combined to theoretically investigate the convergence and the oscillation of loss function, showing that the proposed hybrid optimizer can alleviate oscillation, boost optimization speed, and maintain high accuracy. In theoretical analyses, explicit conditions of hyper-parameters that guarantee training stability are calculated and presented, practically guiding the adjustment of hyper-parameters and promoting the application of hybrid optimizer. Experiments are presented on three commonly used benchmark datasets, i.e., MNIST, CIFAR10 and CIFAR100, demonstrating that the hybrid optimizer obtains up to 42% acceleration with competitive accuracy relative to state-of-the-art optimizers. In short, this paper not only presents a new hybrid optimizer for accelerating optimization, but also provides a novel, theoretical and systematic perspective to find and analyze new optimizer for DNNs. [ABSTRACT FROM AUTHOR]
- Published
- 2022
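Note on result 9: the abstract treats the optimizer as a discrete-time dynamical system and refers to the oscillation of stochastic gradient descent with momentum (SGD-M). The sketch below shows only the standard heavy-ball momentum update on a one-dimensional quadratic, where the overshoot is easy to see; it is not the hybrid or PI-like optimizer from the paper, and the learning rate, momentum value, and function names are assumptions for the example.

```python
def sgd_momentum(grad, w0, lr=0.1, momentum=0.9, steps=200):
    """Run heavy-ball momentum on a scalar parameter and record the trajectory.

    grad: function returning the gradient at w (deterministic here for clarity).
    """
    w, v = w0, 0.0
    history = [w]
    for _ in range(steps):
        v = momentum * v - lr * grad(w)  # velocity accumulates past gradients
        w = w + v                        # parameter moves along the velocity
        history.append(w)
    return w, history


# Minimize f(w) = 0.5 * w**2, whose gradient is w, starting from w = 1.
final_w, trajectory = sgd_momentum(lambda w: w, w0=1.0)
overshoots = any(value < 0.0 for value in trajectory)  # crosses the minimum at 0
```

With `momentum=0.0` the same loop reduces to plain gradient descent, which on this problem approaches zero from one side without overshooting.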
10. Person identification from fingernails and knuckles images using deep learning features and the Bray-Curtis similarity measure.
- Author
-
Alghamdi, Mona, Angelov, Plamen, and Alvaro, Lopez Pellicer
- Subjects
-
*FINGERNAILS, *ARTIFICIAL neural networks, *FINGERS, *IMAGE registration, *FEATURE extraction, *AUTOMATIC identification
- Abstract
In this paper, an approach that makes use of knuckle creases and fingernails for person identification is presented. It introduces a framework for automatic person identification that includes localisation of the region of interest (ROI) of many components within hand images, recognition and segmentation of the detected components using bounding boxes, and similarity matching between two different sets of segmented images. The following hand components are considered: i) the metacarpophalangeal (MCP) joint, commonly known as the base knuckle; ii) the proximal interphalangeal (PIP) joint, commonly known as the major knuckle; iii) the distal interphalangeal (DIP) joint, commonly known as the minor knuckle; iv) the interphalangeal (IP) joint, commonly known as the thumb knuckle, and v) the fingernails. Crucial elements of the proposed framework are the feature extraction and similarity matching. This paper exploits different deep learning neural networks (DLNNs), which are essential in extracting discriminative high-level abstract features. We further use various similarity measures for the matching process. We validate the proposed approach on well-known benchmarks, including the 11k Hands dataset and the Hong Kong Polytechnic University Contactless Hand Dorsal Images known as PolyU. The results indicate that knuckle patterns and fingernails play a significant role in the person identification framework. The 11K Hands dataset results indicate that the left-hand results are better than the right-hand results and the fingernails produce consistently higher identification results than other hand components, with a rank-1 score of 100 %. In addition, the PolyU dataset attains 100 % in the fingernail of the thumb finger. [ABSTRACT FROM AUTHOR]
- Published
- 2022
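Note on result 10: the title names the Bray-Curtis similarity measure for matching feature vectors. A minimal sketch, assuming the usual definition of Bray-Curtis dissimilarity (sum of absolute differences divided by sum of absolute sums) and taking similarity as one minus that value. The helper names, the toy gallery, and the choice to return 0.0 for two all-zero vectors are assumptions for illustration; the paper's feature extraction pipeline is not reproduced.

```python
def bray_curtis_dissimilarity(u, v):
    """Bray-Curtis dissimilarity between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(abs(a + b) for a, b in zip(u, v))
    if denominator == 0:
        return 0.0  # assumption: two all-zero vectors count as identical
    return numerator / denominator


def bray_curtis_similarity(u, v):
    """Similarity score: 1 for identical vectors, lower for dissimilar ones."""
    return 1.0 - bray_curtis_dissimilarity(u, v)


def rank1_match(query, gallery):
    """Return the gallery label whose features are most similar to the query."""
    return max(gallery, key=lambda label: bray_curtis_similarity(query, gallery[label]))


# Hypothetical gallery of enrolled identities and one query vector.
gallery = {"subject_a": [1.0, 2.0, 3.0], "subject_b": [3.0, 2.0, 1.0]}
best = rank1_match([1.0, 2.0, 3.1], gallery)
```

A rank-1 identification score, as reported in the abstract, would then be the fraction of queries for which `rank1_match` returns the correct identity.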
11. A review on the use of deep learning for medical images segmentation.
- Author
-
Aljabri, Manar and AlGhamdi, Manal
- Subjects
-
*COMPUTER-assisted image analysis (Medicine), *DEEP learning, *ARTIFICIAL neural networks, *IMAGE segmentation, *DIAGNOSTIC imaging, *CONVOLUTIONAL neural networks
- Abstract
• An overview of deep learning algorithms used in medical image segmentation is presented.
• More than 150 papers applying deep learning to different medical applications are summarised.
• Challenges and future directions in medical image segmentation are discussed.
Deep learning (DL) algorithms have rapidly become a robust tool for analyzing medical images. They have been used extensively for medical image segmentation as a first and significant component of the diagnosis and treatment pipeline. Medical image segmentation is efficiently addressed by many types of deep neural networks, such as convolutional neural networks, fully convolutional networks, recurrent networks, adversarial networks, and U-shaped networks. This paper reviews the major DL models and applications pertinent to medical image segmentation and summarizes over 150 contributions to the field. Brief overviews of articles are provided by application area: anatomical structures such as organs, bones, and vessels, and abnormalities such as lesions and calcification. Moreover, we discuss current challenges and suggest directions for future research. [ABSTRACT FROM AUTHOR]
- Published
- 2022
12. Needle in a Haystack: Spotting and recognising micro-expressions "in the wild".
- Author
-
Gan, Y.S., See, John, Khor, Huai-Qian, Liu, Kun-Hong, and Liong, Sze-Teng
- Subjects
-
*FACIAL expression, *ARTIFICIAL neural networks, *EMOTION recognition, *OPTICAL flow, *POKER, *HUMAN fingerprints
- Abstract
Computational research on facial micro-expressions has long focused on videos captured under constrained laboratory conditions due to the challenging elicitation process and limited samples that are publicly available. Moreover, processing micro-expressions is extremely challenging under unconstrained scenarios. This paper introduces, for the first time, a completely automatic micro-expression "spot-and-recognize" framework that is performed on in-the-wild videos, such as in poker games and political interviews. The proposed method first spots the apex frame from a video by handling head movements and unconscious actions which are typically larger in motion intensity, with alignment employed to enforce a canonical face pose. Optical flow guided features play a central role in our method: they can robustly identify the location of the apex frame, and are used to learn a shallow neural network model for emotion classification. Experimental results demonstrate the feasibility of the proposed methodology, establishing good baselines for both spotting and recognition tasks – ASR of 0.33 and F1-score of 0.6758 respectively on the MEVIEW micro-expression database. In addition, we present comprehensive qualitative and quantitative analyses to further show the effectiveness of the proposed framework, with new suggestion for an appropriate evaluation protocol. In a nutshell, this paper provides a new benchmark for apex spotting and emotion recognition in an in-the-wild setting. [ABSTRACT FROM AUTHOR]
- Published
- 2022
13. Predicting vehicle fuel consumption based on multi-view deep neural network.
- Author
-
Li, Yawen, Zeng, Isabella Yunfei, Niu, Ziheng, Shi, Jiahao, Wang, Ziyang, and Guan, Zeli
- Subjects
-
*ARTIFICIAL neural networks, *AUTOMOTIVE fuel consumption, *ENERGY consumption, *STANDARD deviations
- Abstract
The problem of global warming is getting more serious, and vehicle emission is the main cause. In recent years, the number of locomotives in China has been increasing at a rate of more than 20% per year, and the problem of automobile pollution is becoming more serious. The transportation industry is the main source of fossil fuel combustion and environmental pollution. Therefore, in this paper, we propose a multi-view deep neural network (MVDNN) to analyze the key factors affecting the fuel consumption of automobiles. The experiments show that the introduction of human input improves the prediction accuracy and the root mean square error (RMSE) achieves 0.993. In addition, this paper also finds that for drivers, driving habits, driving frequency, and safety awareness are the most important factors affecting the fuel consumption of vehicles by combining Lasso regression with MVDNN. Finally, by comparing the prediction accuracy of different experiments, relevant policy suggestions are made. [ABSTRACT FROM AUTHOR]
- Published
- 2022
14. Hierarchical graph attention network with pseudo-metapath for skeleton-based action recognition.
- Author
-
Wang, Mingdao, Li, XueMing, Zhang, Xianlin, and Zhang, Yue
- Subjects
-
*COMPUTER vision, *ARTIFICIAL neural networks, *JOINTS (Anatomy), *GLOBAL method of teaching, *SKELETON
- Abstract
Skeleton-based action recognition has gained significant attention in computer vision. Most state-of-the-art (SOTA) approaches view the skeleton as a homogeneous graph. Unlike those approaches, this paper shows that methods in the heterogeneous graph manner can also achieve competitive performance. In this paper, a logical heterogeneous skeleton graph is built under the assumption of the heterogeneity of joints and bones at different positions, and the action recognition task is formulated as message aggregation and prediction on this heterogeneous graph. Specifically, a novel semantic concept named pseudo-metapath is introduced to represent dependencies between joints, based on which a hierarchical graph attention network with the joint-level attention and the semantic-level attention modules is proposed to capture richer skeleton features. The joint-level attention module intends to get the local difference among the joints within each pseudo-metapath, while the semantic-level attention module is capable of learning the global importance of different pseudo-metapaths. Extensive experiments on the NTU-RGB+D 60, NTU-RGB+D 120 and SYSU datasets validate that our model can attain comparable performance to the SOTA methods with 15x fewer input frames, 26.3x fewer FLOPs and 2.8x fewer parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2022
15. Elegans-AI: How the connectome of a living organism could model artificial neural networks.
- Author
-
Bardozzo, Francesco, Terlizzi, Andrea, Simoncini, Claudio, Lió, Pietro, and Tagliaferri, Roberto
- Subjects
-
*ARTIFICIAL neural networks, *DEEP learning, *CAENORHABDITIS elegans, *LONG-term memory, *ORGANISMS, *STRUCTURAL optimization, *EDUCATIONAL outcomes
- Abstract
This paper introduces Elegans-AI models, a class of neural networks that leverage the connectome topology of the Caenorhabditis elegans to design deep and reservoir architectures. Utilizing deep learning models inspired by the connectome, this paper leverages the evolutionary selection process to consolidate the functional arrangement of biological neurons within their networks. The initial goal involves the conversion of natural connectomes into artificial representations. The second objective centers on embedding the complex circuitry topology of artificial connectomes into both deep learning and deep reservoir networks, highlighting their neural-dynamic short-term and long-term memory and learning capabilities. Lastly, our third objective aims to establish structural explainability by examining the heterophilic/homophilic properties within the connectome and their impact on learning capabilities. In our study, the Elegans-AI models demonstrate superior performance compared to similar models that utilize either randomly rewired artificial connectomes or simulated bio-plausible ones. Notably, these Elegans-AI models achieve a top-1 accuracy of 99.99% on both Cifar10 and Cifar100, and 99.84% on MNIST Unsup. They do this with significantly fewer learning parameters, particularly when reservoir configurations of the connectome are used. Our findings indicate a clear connection between bio-plausible network patterns, the small-world characteristic, and learning outcomes, emphasizing the significant role of evolutionary optimization in shaping the topology of artificial neural networks for improved learning performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Voice-based age, gender, and language recognition based on ResNet deep model and transfer learning in spectro-temporal domain.
- Author
-
Mavaddati, Samira
- Subjects
-
*DEEP learning, *ARTIFICIAL neural networks, *SOCIAL science research, *RECURRENT neural networks, *CONVOLUTIONAL neural networks, *SIGNAL processing
- Abstract
In personal identity recognition systems, detecting a person's age, gender, and language using voice signal characteristics is a crucial issue, especially because of the importance of security considerations. Age, gender, and language classification problems are important in signal processing because they are used to analyze and understand human behavior, interactions, and preferences. This can be especially useful in the fields of human-computer interaction, psychology, and social science research. In this paper, a new system for detecting a speaker's age, gender, and language based on deep learning models is presented. Deep learning models have shown great efficacy in various fields of signal processing. For this paper, a range of deep models were tested, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and a fine-tuned ResNet34 architecture. Additionally, techniques such as transfer learning were applied to improve the efficiency of the proposed system. The input voice signals are preprocessed by applying the spectro-temporal transform to obtain additional features that can be fed to the ResNet34 model, which is designed specifically for the task of voice signal processing. The dataset used in this paper was sourced from the Mozilla Common Voice initiative, which is dedicated to advancing speech recognition and language identification technologies. The performance of the proposed algorithm was evaluated in the presence of Gaussian noise to determine its robustness. The experimental results demonstrated that the proposed algorithm significantly outperformed basic algorithms and other deep neural networks in terms of age and gender recognition from voice signals. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. A lightweight backdoor defense framework based on image inpainting.
- Author
-
Wei, Yier, Gao, Haichang, Wang, Yufei, Gao, Yipeng, and Liu, Huan
- Subjects
-
*ARTIFICIAL neural networks, *INPAINTING, *PAINT
- Abstract
Deep neural networks (DNNs) have been shown to be vulnerable to backdoor attacks during training. Most of the existing backdoor defense methods are designed for specific types of backdoor attacks, and the work of detecting backdoors and mitigating backdoors is mostly separate. Currently, few general and complete defense frameworks have been developed. In this paper, we propose a lightweight, general, and complete defense framework against three main types of backdoor attacks. It can efficiently detect poisoned images and remove trigger patterns on poisoned images without costly retraining of the backdoor model. First, we use the feature difference between clean samples and poisoned samples in the middle layer of the model to distinguish them. Then, we use an image inpainting algorithm to remove the backdoor trigger pattern from the poisoned samples. We deploy three of the most popular backdoor attacks on three datasets to test the effectiveness of our defenses. Extensive experimental results show that our method can effectively defend against various backdoor attacks with a relatively small cost. In particular, we reduce the attack success rate of the more stealthy clean-label poisoning attack from 94.9% to 0.02% with little impact on the classification accuracy of the inpainted images. [ABSTRACT FROM AUTHOR]
- Published
- 2023
18. Adversarial patch attacks against aerial imagery object detectors.
- Author
-
Tang, Guijian, Jiang, Tingsong, Zhou, Weien, Li, Chao, Yao, Wen, and Zhao, Yong
- Subjects
-
*ARTIFICIAL neural networks, *OBJECT recognition (Computer vision), *AERIAL bombing, *DETECTORS
- Abstract
Although Deep Neural Networks (DNNs)-based object detectors are widely used in various fields, especially on aerial imagery object detections, it has been observed that a small elaborately designed patch attached to the images can mislead the DNNs-based detectors into producing erroneous output. However, the target detectors being attacked are quite simple, and the attack efficiency is relatively low in previous works, making it not practicable in real scenarios. To address these limitations, a new adversarial patch attack algorithm is proposed in this paper. Firstly, we designed a novel loss function using the intermediate outputs of the models rather than the model's final outputs interpreted by the detection head to optimize adversarial patches. The experiments conducted on the DOTA, RSOD, and NWPU VHR-10 datasets demonstrate that our method can significantly degrade the performance of the detectors. Secondly, we conducted intensive experiments to investigate the impact of different outputs of the detection model on generating adversarial patches, demonstrating the class score is not as effective as the objectness score. Thirdly, we comprehensively analyzed the attack transferability across different aerial imagery datasets, verifying that the patches generated on one dataset are also effective in attacking another. Moreover, we proposed ensemble training to boost the attack's transferability across models. Our work alarms the application of DNNs-based object detectors in aerial imagery. [ABSTRACT FROM AUTHOR]
- Published
- 2023
19. Hessian regularization of deep neural networks: A novel approach based on stochastic estimators of Hessian trace.
- Author
-
Liu, Yucong, Yu, Shixing, and Lin, Tong
- Subjects
-
*ARTIFICIAL neural networks, *DYNAMICAL systems, *DATA augmentation, *LYAPUNOV stability
- Abstract
• Connecting Hessian trace with a generalization error bound.
• Flat minima of loss landscape and stability analysis in dynamical systems.
• Efficient Hessian trace regularization algorithm with Dropout.
• Performance comparison on vision and language tasks.
In this paper, we develop a novel regularization method for deep neural networks by penalizing the trace of Hessian. This regularizer is motivated by a recent guarantee bound of the generalization error. We explain its benefits in finding flat minima and avoiding Lyapunov stability in dynamical systems. We adopt the Hutchinson method as a classical unbiased estimator for the trace of a matrix and further accelerate its calculation using a Dropout scheme. Experiments demonstrate that our method outperforms existing regularizers and data augmentation methods, such as Jacobian, Confidence Penalty, Label Smoothing, Cutout, and Mixup. The code is available at https://github.com/Dean-lyc/Hessian-Regularization. [ABSTRACT FROM AUTHOR]
- Published
- 2023
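Note on result 19: the abstract cites the Hutchinson method as an unbiased estimator of a matrix trace. The sketch below illustrates the classical estimator, which averages z^T A z over random sign (Rademacher) vectors z and needs only matrix-vector products. It uses a small explicit matrix in place of a network Hessian and omits the Dropout acceleration described in the paper; the function names and sample counts are assumptions for the example.

```python
import random


def hutchinson_trace(matvec, dim, num_samples=100, rng=None):
    """Estimate trace(A) using only products A @ z.

    matvec: function mapping a length-dim list z to the list A @ z.
    rng: optional random.Random instance for reproducibility.
    """
    if rng is None:
        rng = random.Random(0)
    total = 0.0
    for _ in range(num_samples):
        # Rademacher probe: each entry is +1 or -1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        az = matvec(z)
        total += sum(zi * ai for zi, ai in zip(z, az))
    return total / num_samples


def make_matvec(matrix):
    """Wrap an explicit matrix (list of rows) as a matrix-vector product."""
    return lambda z: [sum(a * zj for a, zj in zip(row, z)) for row in matrix]


# Small symmetric stand-in for a Hessian; its exact trace is 5.
A = [[2.0, 1.0], [1.0, 3.0]]
estimate = hutchinson_trace(make_matvec(A), dim=2, num_samples=2000)
```

In a deep learning setting, `matvec` would typically be a Hessian-vector product computed by automatic differentiation, so the full Hessian never has to be formed.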
20. Graph over-parameterization: Why the graph helps the training of deep graph convolutional network.
- Author
-
Lin, Yucong, Li, Silu, Xu, Jiaxing, Xu, Jiawei, Huang, Dong, Zheng, Wendi, Cao, Yuan, and Lu, Junwei
- Subjects
-
*ARTIFICIAL neural networks, *CONVOLUTIONAL neural networks, *MATHEMATICAL convolutions, *PARAMETERIZATION
- Abstract
Recent studies show that gradient descent can train a deep neural network (DNN) to achieve small training and test errors when the DNN is sufficiently wide. This result applies to various over-parameterized neural network models including fully-connected neural networks and convolutional neural networks. However, existing theory does not apply to graph convolutional networks (GCNs), as GCNs are built according to the topological structures of the data. It has been empirically observed that GCNs can outperform vanilla neural networks when the underlying graph captures geometric information of the data. However, there is little theoretical justification for this observation. In this paper, we establish theoretical guarantees of the high-probability convergence of gradient descent for training over-parameterized GCNs. Specifically, we introduce a novel measurement of the relation between the graph and the data, called the "graph disparity coefficient", and show that the convergence of GCN is faster when the graph disparity coefficient is smaller. Our analysis provides novel insights into how the graph convolution operation in a GCN helps training, and provides useful guidance for GCN training in practice. [ABSTRACT FROM AUTHOR]
- Published
- 2023
21. Learning rules in spiking neural networks: A survey.
- Author
-
Yi, Zexiang, Lian, Jing, Liu, Qidong, Zhu, Hegui, Liang, Dong, and Liu, Jizhao
- Subjects
- *
ARTIFICIAL neural networks , *PROCESS capability , *SPATIOTEMPORAL processes , *IMAGE recognition (Computer vision) , *SIGNAL processing , *ACTION potentials - Abstract
Spiking neural networks (SNNs) are a promising energy-efficient alternative to artificial neural networks (ANNs) due to their rich dynamics, capability to process spatiotemporal patterns, and low-power consumption. The complex intrinsic properties of SNNs give rise to a diversity of their learning rules which are essential to functional SNNs. This paper is aimed at presenting a comprehensive overview of learning rules in SNNs. Firstly, we introduce the basic concepts of SNNs and commonly used neuromorphic datasets. Then, guided by a hierarchical classification of SNN learning rules, we present a comprehensive survey of these rules with discussions on their characteristics, advantages, limitations, and performance on several datasets. Moreover, we review practical applications of SNNs, including event-based vision and audio signal processing. Finally, we conclude this survey with a discussion on challenges and promising future research directions in this area. [ABSTRACT FROM AUTHOR]
- Published
- 2023
22. Self-adaptive logit balancing for deep neural network robustness: Defence and detection of adversarial attacks.
- Author
-
Wei, Jiefei, Yao, Luyan, and Meng, Qinggang
- Subjects
- *
ARTIFICIAL neural networks , *PLANT defenses , *LOGITS - Abstract
With the widespread applications of Deep Neural Networks (DNNs), the safety of DNNs has become a significant issue. The vulnerability of neural networks to adversarial examples deepens concerns about the safety of DNN applications. This paper proposes a novel defence method to improve the adversarial robustness of DNN classifiers without using adversarial training. This method introduces two new loss functions. First, a zero-cross-entropy loss is used to punish overconfidence and find the appropriate confidence for different instances. Second, a logit balancing loss is proposed to protect DNNs from non-targeted attacks by regularising incorrect classes' logits distribution. This method achieved competitive adversarial robustness compared to advanced adversarial training methods. Meanwhile, a novel robustness diagram is proposed to analyse, interpret and visualise the robustness of DNN classifiers against adversarial attacks. Furthermore, a Log-Softmax-pattern-based adversarial attack detection method is proposed. This detection method can distinguish clean inputs and multiple adversarial attacks via one multi-classification MLP. In particular, it is state-of-the-art in identifying white-box gradient-based attacks; it achieved at least 95.5% accuracy for classifying four white-box gradient-based attacks with a maximum false positive ratio of 0.1%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
23. Federated learning by employing knowledge distillation on edge devices with limited hardware resources.
- Author
-
Tanghatari, Ehsan, Kamal, Mehdi, Afzali-Kusha, Ali, and Pedram, Massoud
- Subjects
- *
ARTIFICIAL neural networks , *KNOWLEDGE transfer , *POWER resources - Abstract
This paper presents a federated learning approach based on utilizing computational resources of the IoT edge devices for training deep neural networks. In this approach, the edge devices and the cloud server collaborate in the training phase while preserving the privacy of the edge device data. Owing to the limited computational power and resources available to the edge devices, instead of the original neural network (NN), we suggest using a smaller NN generated using a proposed heuristic method. In the proposed approach, the smaller model, which is trained on the edge device, is generated from the main NN model. By exploiting the Knowledge Distillation (KD) approach, the learned knowledge in the server and the edge devices can be exchanged, leading to lower required computation on the server and preserving data privacy of the edge devices. Also, to reduce the knowledge transfer overhead on the communication links between the server and the edge devices, a method for selecting the most valuable data to transfer the knowledge is introduced. The effectiveness of this method is assessed by comparing it to state-of-the-art methods. The results show that the proposed method lowers the communication traffic by up to 250× and increases the learning accuracy by an average of 8.9% in the cloud compared to the prior KD-based distributed training approaches on the CIFAR-10 dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2023
24. Pre-stimulus network responses affect information coding in neural variability quenching.
- Author
-
Liu, Weisi and Liu, Xinsheng
- Subjects
- *
NEURAL codes , *ARTIFICIAL neural networks , *NEUROPROSTHESES , *INFORMATION measurement - Abstract
Neural responding variability to the same stimulus typically decreases after the stimulus is presented. During neural variability quenching, the pre-stimulus neural activities interact with the post-stimulus neural responses. However, whether these interactions influence information coding remains unclear. In this paper, we construct a two-layer k-winner-take-all (k-WTA) spiking network which simulates primary visual cortical neural responses through probabilistic inference. Generating the phenomenon of neural variability quenching, the network could reflect interactions between pre- and post-stimulus neural responses consistent with experimental observations. During neural variability quenching, pre-stimulus neural responding variability and complexity are considered as factors for the post-stimulus neural responses. Simulations to given stimuli are classified with each varying factor, respectively. Neural responding dimensionality measures the capacity of information coding to given stimuli. Over classified simulations, both factors could modify interactions between pre- and post-stimulus neural responses, leading to different neural responding dimensionalities. During neural variability quenching, the temporal structure of stimuli acts as another factor that could also modify neural interactions and induce varying neural responding dimensionality. Our model provides a possible interpretation of how the pre-stimulus neural responses participate in neural variability quenching and affect the information coding. [ABSTRACT FROM AUTHOR]
- Published
- 2023
25. Improving NeuCube spiking neural network for EEG-based pattern recognition using transfer learning.
- Author
-
Wu, Xuanyu, Feng, Yixiong, Lou, Shanhe, Zheng, Hao, Hu, Bingtao, Hong, Zhaoxi, and Tan, Jianrong
- Subjects
- *
ARTIFICIAL neural networks , *PATTERN recognition systems , *DISTRIBUTION (Probability theory) , *SUPPORT vector machines , *SEARCH algorithms , *BIOLOGICAL neural networks , *WAKEFULNESS - Abstract
Electroencephalogram (EEG) data are produced in quantity for measuring brain activity in response to external stimuli. With the rapid development of brain-inspired intelligence, spiking neural networks (SNNs) have the potential to handle EEG data by using spiking activity transmitted among spatially located synapses and neurons. NeuCube, an original and unifying SNN architecture, was developed to model, recognize and understand EEG data. However, NeuCube still faces some challenges for EEG-based pattern recognition, such as scarce labeled data and shifts in the data probability distribution. Hence, this paper proposes a novel method to improve the performance of NeuCube for EEG-based pattern recognition through transfer learning. First, the covariance matrix alignment of EEG data is implemented for every subject in the Euclidean space, which reduces the probability distribution discrepancy of EEG data between different subjects. Different estimation methods for the reference covariance matrix are tested and the optimal one is selected for different subjects. Secondly, spatio-temporal features of EEG data are extracted based on the NeuCube reservoir. Since hyper-parameters of the NeuCube reservoir have a great impact on its spatio-temporal representation, an improved cuckoo search algorithm is proposed to discover the optimal hyper-parameters for obtaining the optimal spatio-temporal features. Finally, a weighted transfer support vector machine is proposed to improve the original output classifier of NeuCube in order to make the model adaptive to the cross-domain variability of EEG data. The proposed method is tested on open dataset 2a from BCI competition IV 2008 and achieves good spatio-temporal pattern recognition results. Furthermore, the neuron connectivity and activation level associated with the process of mental tasks are illustrated. [ABSTRACT FROM AUTHOR]
- Published
- 2023
26. Meta-path fusion based neural recommendation in heterogeneous information networks.
- Author
-
Tan, Lei, Gong, Daofu, Xu, Jinmao, Li, Zhenyu, and Liu, Fenlin
- Subjects
- *
RECOMMENDER systems , *ARTIFICIAL neural networks , *INFORMATION networks , *DEEP learning - Abstract
As a powerful data modeling tool, Heterogeneous Information Network (HIN) has been successfully used in auxiliary information exploitation to boost recommendation performance. For HIN based recommendation, it is challenging to extract and fuse useful features of user preferences and item attributes under different semantic paths in HINs. Existing methods leverage a pre-defined fusion function to integrate different semantics for recommendation, which cannot characterize the complex nonlinear interactions between users and items. In this paper, we present a general framework named MNRec, short for Meta-path fusion based Neural Recommendation, to extract and fuse user and item embeddings under different meta-paths for recommendation. Under the framework, we propose an instantiation of MNRec with Multi-Layer Perceptron (MLP) structure. It consists of two major steps, i.e., meta-path based heterogeneous network embedding and deep learning based rating prediction. Concretely, appropriate meta-paths are first designed according to domain knowledge. Then the embeddings of users and items are obtained through a meta-path and commuting matrix based heterogeneous network embedding method. Finally, in light of the powerful nonlinear modeling capabilities of deep neural networks, the learned embeddings under different meta-paths are integrated into a two-pathway MLP structure for rating prediction. Experimental results on three real-world datasets demonstrate the superiority and effectiveness of MNRec compared with state-of-the-art baselines in rating prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2023
27. From implicit to explicit feedback: A deep neural network for modeling sequential behaviours and long-short term preferences of online users.
- Author
-
Tran, Quyen, Tran, Lam, Hai, Linh Chu, Linh, Ngo Van, and Than, Khoat
- Subjects
- *
ARTIFICIAL neural networks , *RECOMMENDER systems , *DEEP learning , *SOURCE code , *PSYCHOLOGICAL feedback - Abstract
In this work, we examine the advantages of using multiple types of behaviours in recommendation systems. Intuitively, each user often takes some implicit actions (e.g., click) before making an explicit decision (e.g., purchase). Previous studies show that implicit and explicit feedback have different roles for a useful recommendation. However, these studies either exploit implicit and explicit behaviours separately or ignore the semantics of sequential interactions between users and items. In addition, we start from the hypothesis that a user's preferences at a time are combinations of long-term and short-term interests. In this paper, we propose some Deep Learning architectures. The first one is Implicit to Explicit (ITE), to exploit users' interests through the sequence of their actions. The second and third ones are two versions of ITE with Bidirectional Encoder Representations from Transformers based (BERT-based) architecture called BERT-ITE and BERT-ITE-Si, which combine users' long- and short-term preferences without and with side information to enhance users' representations. The experimental results show that our models outperform previous state-of-the-art ones and also demonstrate our views on the effectiveness of exploiting the implicit to explicit order as well as combining long- and short-term preferences in three large-scale datasets. The source code of our paper is available at: https://github.com/tranquyenbk173/BERT_ITE. [ABSTRACT FROM AUTHOR]
- Published
- 2022
28. Local-global coordination with transformers for referring image segmentation.
- Author
-
Liu, Fang, Kong, Yuqiu, Zhang, Lihe, Feng, Guang, and Yin, Baocai
- Subjects
- *
IMAGE segmentation , *ARTIFICIAL neural networks - Abstract
Referring image segmentation has sprung up benefiting from the outstanding performance of deep neural networks. However, most existing methods explore either local details or the global context of the scene without sufficiently modelling the coordination between them, leading to sub-optimal results. In this paper, we propose a transformer-based method to enforce the in-depth coordination between short- and long-range dependencies in both explicit and implicit fusion processes. Specifically, we design a Cross Modality Transformer (CMT) module with two successive blocks for explicitly integrating linguistic and visual features, which first locates the related visual region in a global view before concentrating on local patterns. Besides, a Hybrid Transformer Architecture (HTA) is utilized as a feature extractor in the encoding stage to capture global relationships and retain local cues. It can further aggregate the multi-modal features in an implicit manner. In the decoding stage, a Cross-level Information Integration module (CI2) is developed to gather information from adjacent levels by dual top-down paths, including a guided filtration path and a residual reservation path. Experimental results show that the proposed method outperforms the state-of-the-art methods on four RIS benchmarks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
29. Stability of quaternion-valued neutral-type neural networks with leakage delay and proportional delays.
- Author
-
Song, Qiankun, Yang, Linji, Liu, Yurong, and Alsaadi, Fuad E.
- Subjects
- *
ARTIFICIAL neural networks , *LINEAR matrix inequalities , *STABILITY criterion , *LYAPUNOV stability , *LEAKAGE - Abstract
This paper is concerned with the stability issue of quaternion-valued neural networks with neutral delay, proportional delay and leakage delay. Making use of the principle of homeomorphism, techniques of matrix inequality and Lyapunov stability theory, a main stability criterion is derived in the form of quaternion-valued linear matrix inequality for ensuring the unique existence and global stability of the equilibrium point for the considered quaternion-valued neural networks. An illustrative example and its simulations are given to show the effectiveness of the theoretical result. [ABSTRACT FROM AUTHOR]
- Published
- 2023
30. The context effect for blind image quality assessment.
- Author
-
Liang, Zehong, Lu, Wen, Zheng, Yong, He, Weiquan, and Yang, Jiachen
- Subjects
- *
ARTIFICIAL neural networks , *FEATURE extraction , *PIXELS , *HUMAN ecology - Abstract
Image quality assessment (IQA) is a visuo-cognitive process, which is an essential stage in human interaction with the environment. The study of the context effect (Brown and Daniel, 1987) also shows that the evaluation results made by the human vision system (HVS) are related to the contrast between the distorted image and the background environment. However, the existing IQA methods carry out a quality evaluation that depends only on the distorted image itself and ignores the impact of the environment on human perception. In this paper, we propose a novel blind image quality assessment (BIQA) method based on the context effect. At first, we use a graphical model to describe how the context effect influences human perception of image quality. Based on the established graph, we construct the context relation between the distorted image and the background environment by the X. Han et al. (2015). Then the context features are extracted from the constructed relation, and the quality-related features are extracted pixel-wise from the distorted image by the fine-tuned neural network. Finally, these features are concatenated to quantify image quality degradations and then regressed to quality scores. In addition, the proposed method is adaptive to various deep neural networks. Experimental results show that the proposed method not only has state-of-the-art performance on the synthetic distorted images, but also has a great improvement on the authentic distorted images. [ABSTRACT FROM AUTHOR]
- Published
- 2023
31. Deep neural networks compression: A comparative survey and choice recommendations.
- Author
-
Marinó, Giosué Cataldo, Petrini, Alessandro, Malchiodi, Dario, and Frasca, Marco
- Subjects
- *
ARTIFICIAL neural networks , *CONVOLUTIONAL neural networks , *DEEP learning , *LOSSY data compression , *HUFFMAN codes , *SCIENTIFIC community , *DATA compression - Abstract
The state-of-the-art performance for several real-world problems is currently reached by deep and, in particular, convolutional neural networks (CNN). Such learning models exploit recent results in the field of deep learning, leading to highly performing, yet very large neural networks with typically millions to billions of parameters. As a result, such models are often redundant and excessively oversized, with a detrimental effect on the environment in terms of unnecessary energy consumption and a limitation to their deployment on low-resource devices. The necessity for compression techniques able to reduce the number of model parameters and their resource demand is thereby increasingly felt by the research community. In this paper we propose the first extensive comparison, to the best of our knowledge, of the main lossy and structure-preserving approaches to compress pre-trained CNNs, applicable in principle to any existing model. Our study is intended to provide a first and preliminary guidance to choose the most suitable compression technique when there is the need to reduce the occupancy of pre-trained models. Both convolutional and fully-connected layers are included in the analysis. Our experiments involved two pre-trained state-of-the-art CNNs (proposed to solve classification or regression problems) and five benchmarks, and gave rise to important insights about the applicability and performance of such techniques w.r.t. the type of layer to be compressed and the category of problem tackled. [ABSTRACT FROM AUTHOR]
- Published
- 2023
32. Feature pyramid network with multi-scale prediction fusion for real-time semantic segmentation.
- Author
-
Quyen, Toan Van and Kim, Min Young
- Subjects
- *
FORECASTING , *PIXELS , *PROBABILITY theory , *ARTIFICIAL neural networks - Abstract
Feature pyramid network (FPN) is constructed from a bottom-up pathway and a top-down pathway. The method involves multi-scale features, so it can obtain rich contextual information from lower scales and high resolution from the largest scale. Additionally, different receptive fields are effective to capture both thin and large objects in image scenes. All feature maps are concatenated to predict the targets. However, the average pooling method yields the problem of combining the best predictions with poorer ones. In this paper, we propose a dual prediction to leverage the useful characteristics of each FPN feature map. A low scale prediction attains good precision for large objects. The other one suitably segments narrow objects. Finally, a multi-scale fusion is deployed with an attention part. The attention module finds pixels of a low scale having high probabilities of wrong labels, and then requires the supplements from a high scale. A multi-scale fusion allows the network to learn across the different scales of predictions. We achieve good results: 77.9% mIoU at 62 FPS on Cityscapes and 44.1% mIoU on Mapillary Vistas. [ABSTRACT FROM AUTHOR]
- Published
- 2023
33. Command filtered-based neuro-adaptive robust finite-time trajectory tracking control of autonomous underwater vehicles under stochastic perturbations.
- Author
-
Sedghi, Fatemeh, Mehdi Arefi, Mohammad, and Abooee, Ali
- Subjects
- *
AUTONOMOUS underwater vehicles , *ARTIFICIAL neural networks , *ROBUST control , *LYAPUNOV stability , *CLOSED loop systems , *DEGREES of freedom - Abstract
In this paper, the problem of finite-time trajectory tracking control is studied and addressed for a 6 degree of freedom (DOF) autonomous underwater vehicle (AUV) subjected to unknown dynamic model, stochastic perturbations, external disturbances (matched and mismatched) and saturation input nonlinearities. Based on the backstepping control approach, novel finite-time control inputs are designed and proposed. Artificial neural networks (ANNs) and finite-time adaptation laws are exploited to approximate the nonlinear dynamics of AUV, the stochastic perturbations and the upper bound of external disturbances. To handle the destructive effects of saturation input nonlinearities, finite-time auxiliary system method is utilized. To overcome the explosion of complexity problem of backstepping control strategy, compensator-based finite-time command filter approach is exploited. By utilizing the Lyapunov stability theorem, it is mathematically proven and demonstrated that the suggested nonlinear control inputs are able to guarantee the semi-global finite-time stability in probability (SGFSP) of the closed-loop AUV system. Finally, numerical simulations are carried out to illustrate and depict the effectiveness and performance of the proposed neuro-adaptive robust finite-time control scheme. [ABSTRACT FROM AUTHOR]
- Published
- 2023
34. Power normalized cepstral robust features of deep neural networks in a cloud computing data privacy protection scheme.
- Author
-
Li, Mianjie, Tian, Zhihong, Du, Xiaojiang, Yuan, Xiaochen, Shan, Chun, and Guizani, Mohsen
- Subjects
- *
ARTIFICIAL neural networks , *DEEP learning , *DATA protection , *CLOUD computing , *WAVELET transforms , *THERAPEUTICS - Abstract
Deep Neural Networks (DNNs) have developed rapidly in data privacy protection applications such as medical treatment and finance. However, DNNs require high-speed and high-memory computers in terms of computation, otherwise training can be very lengthy. Furthermore, DNNs are often not available in resource-constrained mobile devices. Therefore, training and executing DNNs are increasingly using cloud computing. In this paper, the Power Normalized Cepstrum-based Robust Feature Detector (PNC-RFD), with deep learning in cloud computing, is proposed for data privacy protection. The proposed PNC-RFD extracts a specified number of signal segments of high robustness used to embed and extract various data. For the sake of embedding and extracting the data, a method of information hiding employing Dual-Tree Complex Wavelet Packet Transform (DT CWPT) is therefore presented. The presented scheme simultaneously embeds multiple data into coefficients of the DT CWPT of signal segments. By embedding the data into the orthogonal spaces, the proposed method ensures the independent extraction of the multiple data. In line with the performance analysis, the superiority of the presented scheme is elaborated through comparison with the current state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
35. Refined probability distribution module for fine-grained visual categorization.
- Author
-
Zhao, Peipei, Miao, Qiguang, Li, Hongsheng, Liu, Ruyi, Quan, Yining, and Song, Jianfeng
- Subjects
- *
ARTIFICIAL neural networks , *CONVOLUTIONAL neural networks , *COMPUTER vision , *RANDOM walks - Abstract
Fine-grained visual categorization is an important task in computer vision. Prior works on fine-grained visual categorization have paid much attention to addressing intra-class variation and inter-class similarity. However, they rarely study that task from the perspective of probability distribution. In this paper, we propose a novel refined probability distribution module based on deep convolutional neural network. Our module computes the probability of an image by fully utilizing the similarity information between images. Firstly, we use deep neural networks to obtain the initial probability distribution and extract features. Then, we build a network whose inputs are features for calculating image-to-image similarity scores. Finally, our module refines the initial probability distribution based on an effective batch random walk operation with similarity scores. Our module can be plugged into many deep convolutional neural networks. Experimental results show that our approach outperforms state-of-the-art methods on the CUB-200-2011, FGVC-Aircraft and Stanford Cars datasets respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2023
36. TSFEDL: A python library for time series spatio-temporal feature extraction and prediction using deep learning.
- Author
-
Aguilera-Martos, Ignacio, García-Vico, Ángel M., Luengo, Julián, Damas, Sergio, Melero, Francisco J., Valle-Alonso, José Javier, and Herrera, Francisco
- Subjects
- *
ARTIFICIAL neural networks , *FEATURE extraction , *TIME series analysis , *PYTHON programming language , *DEEP learning , *RECURRENT neural networks , *CONVOLUTIONAL neural networks - Abstract
The combination of convolutional and recurrent neural networks is a promising framework. This arrangement allows the extraction of high-quality spatio-temporal features together with their temporal dependencies. This fact is key for time series prediction problems such as forecasting, classification or anomaly detection, amongst others. In this paper, the TSFEDL library is introduced. It compiles 22 state-of-the-art methods for both time series feature extraction and prediction, employing convolutional and recurrent deep neural networks for use in several data mining tasks. The library is built upon a set of Tensorflow+Keras and PyTorch modules under the AGPLv3 license. The performance validation of the architectures included in this proposal confirms the usefulness of this Python package. [ABSTRACT FROM AUTHOR]
- Published
- 2023
37. Grassmannian learning mutual subspace method for image set recognition.
- Author
-
Souza, Lincon S., Sogi, Naoya, Gatto, Bernardo B., Kobayashi, Takumi, and Fukui, Kazuhiro
- Subjects
- *
EMOTION recognition , *RECOGNITION (Psychology) , *CONVOLUTIONAL neural networks , *ARTIFICIAL neural networks , *GRASSMANN manifolds , *OBJECT recognition (Computer vision) , *IMAGE recognition (Computer vision) - Abstract
• New subspace-based method for image set recognition.
• Theoretically-grounded integration of subspace representation into DNNs while keeping end-to-end trainability.
• Our new method generalizes the classic learning subspace method.
This paper addresses the problem of object recognition given a set of images as input (e.g., multiple camera sources and video frames). Convolutional neural network (CNN)-based frameworks do not exploit these sets effectively, processing a pattern as observed, not capturing the underlying feature distribution as it does not consider the variance of images in the set. To address this issue, we propose the Grassmannian learning mutual subspace method (G-LMSM), a NN layer embedded on top of CNNs that can process image sets more effectively and can be trained in an end-to-end manner. The image set is first represented by a low-dimensional input subspace and then this input subspace is matched with dictionary subspaces by a similarity of their canonical angles, an interpretable and easy to compute metric. The key idea of G-LMSM is that the dictionary subspaces are learned as points on the Grassmann manifold, optimized with Riemannian stochastic gradient descent. This learning is stable, efficient and theoretically well-grounded. We demonstrate the effectiveness of our proposed method on hand shape recognition, face identification, and facial emotion recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2023
38. Model tree methods for explaining deep reinforcement learning agents in real-time robotic applications.
- Author
-
Gjærum, Vilde B., Strümke, Inga, Løver, Jakob, Miller, Timothy, and Lekkas, Anastasios M.
- Subjects
- *
ARTIFICIAL neural networks , *REINFORCEMENT learning , *ROBOTICS , *ARTIFICIAL intelligence - Abstract
Deep reinforcement learning has proven useful in the field of robotics, but the black-box nature of deep neural networks impedes the applicability of deep reinforcement learning agents for real-world tasks. This is addressed in the field of explainable artificial intelligence, by developing explanation methods that aim to explain such agents to humans. Model trees as surrogate models have proven useful for producing explanations for black-box models used in real-world robotic applications, in particular, due to their capability of providing explanations in real time. In this paper, we provide an overview and analysis of available methods for building model trees for explaining deep reinforcement learning agents solving robotics tasks. We find that multiple outputs are important for the model to be able to grasp the dependencies of coupled output features, i.e. actions. Additionally, our results indicate that introducing domain knowledge via a hierarchy among the input features during the building process results in higher accuracies and a faster building process. [ABSTRACT FROM AUTHOR]
- Published
- 2023
39. Deep NMF topic modeling.
- Author
-
Wang, Jianyu and Zhang, Xiao-Lei
- Subjects
- *
ARTIFICIAL neural networks , *DEEP learning , *MATRIX decomposition , *NONNEGATIVE matrices , *COMPUTATIONAL complexity - Abstract
Nonnegative matrix factorization (NMF) based topic modeling methods do not rely much on model or data assumptions. However, they are usually formulated as difficult optimization problems, which may suffer from bad local minima and high computational complexity. In this paper, we propose a deep NMF (DNMF) topic modeling framework to alleviate the aforementioned problems. It first applies an unsupervised deep learning method to learn latent hierarchical structures of documents, under the assumption that if we could learn a good representation of documents by, e.g. a deep model, then the topic word discovery problem can be boosted. Then, it takes the output of the deep model to constrain a topic-document distribution for the discovery of the discriminant topic words, which not only improves the efficacy but also reduces the computational complexity over conventional unsupervised NMF methods. We constrain the topic-document distribution in three ways, which take advantage of the three major sub-categories of NMF: basic NMF, structured NMF, and constrained NMF, respectively. To overcome the weaknesses of deep neural networks in unsupervised topic modeling, we adopt a non-neural-network deep model, the multilayer bootstrap network. To our knowledge, this is the first time that a deep NMF model is used for unsupervised topic modeling. We have compared the proposed method with a number of representative references covering major branches of topic modeling on a variety of real-world text corpora. Experimental results illustrate the effectiveness of the proposed method under various evaluation metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
40. Self-supervised anomaly pattern detection for large scale industrial data.
- Author
-
Tang, Xiaoyue, Zeng, Shan, Yu, Fang, Yu, Wei, Sheng, Zhongyin, and Kang, Zhen
- Subjects
- *
ANOMALY detection (Computer security) , *ARTIFICIAL neural networks , *AUTOMATIC speech recognition , *PATTERN recognition systems , *INDUSTRY 4.0 , *TIME series analysis , *MACHINE learning - Abstract
Detecting anomalies in large amounts of high-dimensional data has been a challenging task. In the Industry 4.0 environment, large-scale high-dimensional monitoring data features complex patterns of high-level semantics. In order to provide enterprise-wide monitoring solutions, it is necessary to identify the high-level semantic patterns of the anomalies in these data without splitting them. Existing end-to-end deep neural networks for time series are capable of recognizing the high-level semantics in natural language or speech signals, but they are barely applied in real-time anomaly detection of industrial data because of the large time costs. In this paper, we leverage the self-supervised contrastive learning methodology and propose a Composite Semantic Augmentation Encoder (CSAE) to provide an appropriate representation of industrial data and implement quick detection of anomalies in industrial application environments. CSAE is a non-sequential deep neural network with two augmentation layers and a mandatory layer. The two layers of data augmentation are built to expand the size of samples of both low-level semantic anomalies and high-level semantic anomalies, which enables CSAE to discover diverse anomalies and improves its accuracy of high-level semantic pattern recognition. The mandatory layer is built to compress and preserve the temporal information in the industrial data to accelerate the anomaly detection. Therefore, as a non-sequential contrastive learning model, CSAE has faster training convergence than the usual sequence models. The experimental results have verified that CSAE can achieve higher prediction accuracy with less time consumption than existing machine learning models in the tasks of high-dimensional anomaly pattern detection. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
41. Attention Round for post-training quantization.
- Author
-
Diao, Huabin, Li, Gongyan, Xu, Shaoyun, Kong, Chao, and Wang, Wei
- Subjects
- *
CONVOLUTIONAL neural networks , *ARTIFICIAL neural networks , *COMBINATORIAL optimization , *GAUSSIAN function , *KHAT - Abstract
Quantization methods for convolutional neural network models can be broadly categorized into post-training quantization (PTQ) and quantization aware training (QAT). While PTQ offers the advantage of requiring only a small portion of the data for quantization, the resulting quantized model may not be as effective as QAT. To address this limitation, this paper proposes a novel quantization function named Attention Round. Unlike traditional quantization functions that map a 32-bit floating-point value w to nearby quantization levels, Attention Round allows w to be mapped to all possible quantization levels in the entire quantization space, expanding the quantization optimization space. The possibilities of mapping w to different quantization levels are inversely correlated with the distance between w and the quantization levels, regulated by a Gaussian decay function. Furthermore, to tackle the challenge of mixed precision quantization, this paper introduces a lossy coding length measure to assign quantization precision to different layers of the model, eliminating the need for solving a combinatorial optimization problem. Experimental evaluations on various models demonstrate the effectiveness of the proposed method. Notably, for ResNet18 and MobileNetV2, the PTQ approach achieves comparable quantization performance to QAT while utilizing only 1024 training data and 10 min for the quantization process.
• Attention Round quantization function expands the quantization optimization space.
• Mixed precision allocation method improves mixed precision quantization efficiency.
• Enriched lightweight CNNs contribute to applications in resource-limited scenarios.
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Special issue: Advances in artificial neural networks, machine learning and computational intelligence: Selected papers from the 23rd European Symposium on Artificial Neural Networks (ESANN 2015).
- Author
-
Aiolli, Fabio, Bunte, Kerstin, Hérault, Romain, and Kanevski, Mikhail
- Subjects
- *
ARTIFICIAL neural networks , *MACHINE learning , *COMPUTATIONAL intelligence , *CONFERENCES & conventions , *ARTIFICIAL intelligence - Published
- 2016
- Full Text
- View/download PDF
43. Neural architectures for aggregating sequence labels from multiple annotators.
- Author
-
Li, Maolin and Ananiadou, Sophia
- Subjects
- *
ARTIFICIAL neural networks , *MARKOV processes , *VIRTUAL networks , *NATURAL language processing - Abstract
• Explore two neural network-based methods to aggregate noisy sequence labels.
• Provide a detailed comparison of different methods: probabilistic graphical models, neural networks and methods that combine graphical and neural network models.
• Improve true label prediction performance on three real-world datasets from different domains and tasks.
• Present a simple but effective method to learn meaningful annotator embeddings.
• Demonstrate our annotator embedding is helpful for studying annotators' reliability and behaviour.
Labelled data for training sequence labelling models can be collected from multiple annotators or workers in crowdsourcing. However, these labels could be noisy because of the varying expertise and reliability of annotators. In order to ensure high quality of data, it is crucial to infer the correct labels by aggregating noisy labels. Although label aggregation is a well-studied topic, only a few studies have investigated how to aggregate sequence labels. Recently, neural network models have attracted research attention for this task. In this paper, we explore two neural network-based methods. The first method combines Hidden Markov Models with networks while also learning distributed representations of annotators (i.e., annotator embedding); the second method combines BiLSTM with autoencoders. The experimental results on three real-world datasets demonstrate the effectiveness of using neural networks for sequence label aggregation. Moreover, our analysis shows that annotators' embeddings not only make our model applicable to real-time applications, but are also useful for studying the behaviour of annotators. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
44. Adversarial attacks and defenses in deep learning for image recognition: A survey.
- Author
-
Wang, Jia, Wang, Chengyu, Lin, Qiuzhen, Luo, Chengwen, Wu, Chao, and Li, Jianqiang
- Subjects
- *
ARTIFICIAL neural networks , *DEEP learning , *IMAGE recognition (Computer vision) - Abstract
• Introducing the concepts of adversarial examples and adversarial learning.
• Analysis and summary of classical adversarial attack and defense methods.
• Extensive taxonomy summarizing recent advances in adversarial attacks and defenses.
• Discussions and analyses of prominent issues in the field of adversarial learning.
In recent years, research on adversarial attacks and defense mechanisms has attracted much attention. It has been observed that adversarial examples crafted with small malicious perturbations can mislead a deep neural network (DNN) model into outputting wrong prediction results. These small perturbations are imperceptible to humans. The existence of adversarial examples poses a great threat to the robustness of DNN-based models. It is necessary to study the principles behind them and develop countermeasures. This paper surveys and summarizes the recent advances in attack and defense methods extensively and in detail, and analyzes and compares the pros and cons of various attack and defense schemes. Finally, we discuss the main challenges and future research directions in this field. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
45. Delving deep into pixelized face recovery and defense.
- Author
-
Zhong, Zhixuan, Du, Yong, Zhou, Yang, Cao, Jiangzhong, and He, Shengfeng
- Subjects
- *
ARTIFICIAL neural networks , *VISUAL perception , *INPAINTING - Abstract
Pixelization is arguably one of the most well-adopted deterministic obfuscation techniques for privacy preservation purposes. Although the recovery of pixelized faces is underexplored, the powerful deep neural networks might combat this problem in a data-driven manner. As a consequence, an unbreakable pixelization approach is desired. To achieve this goal, in this paper, we delve into two contradictory problems of unrecoverable pixelization and its counterpart, depixelization, by leveraging the best recovery to strengthen the robustness of the unrecoverable pixelized patterns. In particular, on the offensive end of recovery, we combat the large and continuous nature of pixelized regions by proposing two strategies, 1) an iterative depixelization network that progressively decomposes and predicts the pixelized regions and thus outer results are used to support inner inferences; 2) a dynamic dilated convolution operation is proposed to stride over the redundant identical pixels from the same pixelized region, enabling the network to adaptively extract valid feature representations. We show that our tailored depixelization method significantly outperforms several baselines or inpainting approaches by over 1.0 FID and 2% ID-SIM improvements on CelebA dataset which includes 182,732 human face images, and therefore we study how to defend this advanced recovery and produce unrecoverable pixelized patterns. To balance the visual perception and robustness of pixelization, we propose to generate two types of adversarial examples, pixel-wise and block-wise perturbations, which make different trade-offs between quality and robustness. By deploying our depixelization network in a semi-whitebox setting, our pixelization method can generate imperceptible perturbations while being robust to depixelization. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
46. Explaining deep neural networks: A survey on the global interpretation methods.
- Author
-
Saleem, Rabia, Yuan, Bo, Kurugollu, Fatih, Anjum, Ashiq, and Liu, Lu
- Subjects
- *
ARTIFICIAL neural networks , *ARTIFICIAL intelligence , *TRUST - Abstract
A substantial amount of research has been carried out in Explainable Artificial Intelligence (XAI) models, especially in those which explain the deep architectures of neural networks. A number of XAI approaches have been proposed to achieve trust in Artificial Intelligence (AI) models as well as provide explainability of specific decisions made within these models. Among these approaches, global interpretation methods have emerged as the prominent methods of explainability because they have the strength to explain every feature and the structure of the model. This survey attempts to provide a comprehensive review of global interpretation methods that completely explain the behaviour of the AI models. We present a taxonomy of the available global interpretations models and systematically highlight the critical features and algorithms that differentiate them from local as well as hybrid models of explainability. Through examples and case studies from the literature, we evaluate the strengths and weaknesses of the global interpretation models and assess challenges when these methods are put into practice. We conclude the paper by providing the future directions of research in how the existing challenges in global interpretation methods could be addressed and what values and opportunities could be realized by the resolution of these challenges. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
47. Neural network-based event-triggered integral reinforcement learning for constrained [formula omitted] tracking control with experience replay.
- Author
-
Xue, Shan, Luo, Biao, Liu, Derong, and Gao, Ying
- Subjects
- *
INTEGRALS , *DATA transmission systems , *ARTIFICIAL neural networks , *DYNAMIC programming - Abstract
Since input constraints and external disturbances are unavoidable in tracking control problems, how to obtain a controller in this case to save communication and data resources at the same time is very challenging. Aiming at these challenges, this paper develops a novel neural network (NN)-based event-triggered integral reinforcement learning (IRL) algorithm for constrained H ∞ tracking control problems. First, the constrained H ∞ tracking control problem is transformed into a regulation problem. Second, an event-triggered optimal controller is designed to reduce network transmission burden and improve resource utilization, where a novel threshold is proposed and its non-negativity can be guaranteed. Third, for implementation purpose, a novel NN-based event-triggered IRL algorithm is developed. In order to improve data utilization, the experience replay technique with an easy-to-verify condition is employed in the learning process. Theoretical analysis proves that the tracking error and weight estimation error are uniformly ultimately bounded. Finally, simulation verification shows the effectiveness of the present method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
48. Subdomain contraction in deep networks for robust representation learning.
- Author
-
Qi, Yu, Pan, Zhentao, Pan, Gang, and Wang, Yueming
- Subjects
- *
ARTIFICIAL neural networks , *DEEP learning - Abstract
Deep neural networks provide end-to-end tools to learn effective representations from data directly. The deep structure makes it possible to model a complicated pattern, even if it has a variety of changes. This leads to a problem that noises and outliers are usually treated as a specific pattern, which is also learned in the network. This is one reason why overfitting cannot be sufficiently addressed in deep networks. This paper proposes a new method called subdomain contraction (SDC) to tackle the problem. The idea is that our approach tends to learn more about the shared features between the subsets of the samples but less about the specific features found in only one or two subsets. To this end, the SDC loss penalizes the distribution distance between sub-domains in the feature space to constrain the sharing level of features. By applying the SDC loss term, the data drive the learning process to an optimal tradeoff between modeling noises and the varieties of the pattern. In this manner, the SDC models the pattern as much as possible and ignores most noises, thus improving the generalization ability. The SDC loss can be efficiently computed in minibatches and can also work collaboratively with other regularization methods such as dropout to further improve the performance. Extensive experiments demonstrate that SDC can improve the effectiveness and robustness of representation learning in deep networks against noises, and the superiority is most remarkable with noisy data. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
49. Global Mittag–Leffler stability and synchronization of discrete-time fractional-order delayed quaternion-valued neural networks.
- Author
-
Chen, Shenglong, Li, Hong-Li, Bao, Haibo, Zhang, Long, Jiang, Haijun, and Li, Zhiming
- Subjects
- *
ARTIFICIAL neural networks , *SYNCHRONIZATION , *LYAPUNOV functions , *PSYCHOLOGICAL feedback - Abstract
This paper is devoted to investigating discrete-time fractional-order delayed quaternion-valued neural networks (DFDQNNs) by utilizing direct quaternion approach. Firstly, a novel lemma and its two corresponding corollaries have been proposed for estimating nabla fractional difference of the quaternion-valued Lyapunov function. Then, the existence and uniqueness of equilibrium point for DFDQNNs is proved by constructing a new quaternion-valued contraction mapping. In addition, by means of our designed Lyapunov functions and the effective feedback controller as well as neoteric nabla difference inequalities, some sufficient criteria have been obtained to ensure the global Mittag–Leffler stability and Mittag–Leffler synchronization of DFDQNNs, respectively. Finally, some numerical examples are provided to verify the yielded results. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
50. Deep multi-view graph-based network for citywide ride-hailing demand prediction.
- Author
-
Jin, Guangyin, Xi, Zhexu, Sha, Hengyu, Feng, Yanghe, and Huang, Jincai
- Subjects
- *
DEMAND forecasting , *RECURRENT neural networks , *CONVOLUTIONAL neural networks , *INTELLIGENT transportation systems , *ARTIFICIAL neural networks , *DEEP learning - Abstract
Urban ride-hailing demand prediction is a crucial but challenging task for intelligent transportation system construction. Predictable ride-hailing demand can facilitate more reasonable vehicle scheduling and online car-hailing platform dispatch. Conventional deep learning methods with no external structured data can be accomplished via hybrid models of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) by meshing plentiful pixel-level labeled data, but data sparsity of high grid granularity in spatial perspective and limited learning capabilities of long-term dependencies in temporal perspective are still two striking bottlenecks. To address these problems, we propose a novel virtual graph modeling approach to focus on significant demand regions and a novel Deep Multi-View Spatio-temporal Virtual Graph Neural Network (DMVST-VGNN) to strengthen the learning capabilities of spatial dynamics and long-term temporal dependencies. Specifically, DMVST-VGNN integrates structures of 1D CNN, Multi-Graph Attention Neural Network and Transformer Network, which correspond to short-term temporal dynamics view, spatial dynamics view and long-term temporal dynamics view respectively. In this paper, multiple experiments are conducted on two large-scale New York City datasets in higher granularity prediction scenes. The experimental results demonstrate the effectiveness of the DMVST-VGNN framework in ride-hailing demand prediction at both the spatial and the temporal scale. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library