5,546 results for "Learning Systems"
Search Results
152. Object Recognition Using Neural Networks for Robotics Precision Application
- Author
- Celenta, Giampiero, Guida, Domenico, Chaari, Fakher, Series Editor, Haddar, Mohamed, Series Editor, Kwon, Young W., Series Editor, Gherardini, Francesco, Series Editor, Ivanov, Vitalii, Series Editor, Trojanowska, Justyna, editor, Pavlenko, Ivan, editor, Zajac, Jozef, editor, and Peraković, Dragan, editor
- Published
- 2020
- Full Text
- View/download PDF
153. Ordering Theory
- Author
- Parke, William C.
- Published
- 2020
- Full Text
- View/download PDF
154. BTWalk: Branching Tree Random Walk for Multi-Order Structured Network Embedding.
- Author
- Xiong, Hao and Yan, Junchi
- Subjects
- TREE branches, RANDOM walks, LEARNING strategies, SAMPLING (Process), TASK analysis
- Abstract
Multi-order proximity is useful for effective network embedding. In contrast to many previous works that only consider order-level weights, this paper proposes to explore a more expressive node-level weighting mechanism to encode the diverse local structure, with a scalable and theoretically justified sampling strategy for its learning. Specifically, we start with a formal definition of multi-order proximity matrix which leads to our new multi-order objective based on Laplacian Eigenmaps and Skip-Gram. Then we instantiate the node-specific multi-order weights in the objective with the help of neighborhood size estimation, which indicates node-specific multi-order information. For objective learning, it is implicitly fulfilled with our proposed branching tree-like random walk strategy termed by BTWalk, which differs from the dominant chain-like walk in existing sampling techniques. BTWalk is designed by a synergetic combination of BFS (breadth-first search) and DFS (depth-first search), which is modulated according to the weights of the considered proximity orders. We theoretically analyze its cost-efficiency, and further propose the so-called Vec4Cross framework that incorporates joint node embedding and network alignment for two partially overlapped networks based on the seed matchings, whereby BTWalk is also adopted for embedding. Promising experimental results are obtained on real-world datasets across popular tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
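The BTWalk abstract above hinges on replacing chain-like walks with a branching tree walk that blends BFS and DFS. A minimal Python sketch of that sampling idea (not the authors' implementation; `branch_prob` and `max_branch` are hypothetical knobs standing in for the node-specific multi-order weights):

```python
import random

def branching_walk(adj, start, depth, branch_prob=0.3, max_branch=2):
    """Toy sampler mixing DFS-style chain steps with BFS-style branching.

    adj: dict mapping node -> list of neighbour nodes.
    branch_prob / max_branch are illustrative knobs standing in for the
    node-specific multi-order weights described in the abstract.
    Returns the visited nodes (a small tree flattened in visit order).
    """
    visited = [start]
    frontier = [(start, 0)]
    while frontier:
        node, d = frontier.pop()           # stack -> depth-first by default
        if d >= depth or not adj.get(node):
            continue
        if random.random() < branch_prob:  # occasionally branch like BFS
            children = random.sample(adj[node], min(max_branch, len(adj[node])))
        else:                              # otherwise continue the chain
            children = [random.choice(adj[node])]
        for c in children:
            visited.append(c)
            frontier.append((c, d + 1))
    return visited

# Tiny example graph
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 4], 4: [2, 3]}
print(branching_walk(adj, start=0, depth=3))
```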
155. Reinforcement Online Active Learning Ensemble for Drifting Imbalanced Data Streams.
- Author
- Zhang, Hang, Liu, Weike, and Liu, Qingbao
- Subjects
- ACTIVE learning, ONLINE education, CLASSIFICATION algorithms, RECEIVER operating characteristic curves, HEURISTIC algorithms
- Abstract
Applications challenged by the joint problem of concept drift and class imbalance are attracting increasing research interest. This paper proposes a novel Reinforcement Online Active Learning Ensemble for Drifting Imbalanced data stream (ROALE-DI). The ensemble classifier has a long-term stable classifier and a dynamic classifier group which applies a reinforcement mechanism to increase the weight of the dynamic classifiers, which perform better on the minority class, and decreases the weight of the opposite. When the data stream is class imbalanced, the classifiers will lack the training samples of the minority class. To supply training samples, when creating a new classifier, the labeled instances buffer is used to provide instances of the minority class. Then, a hybrid labeling strategy that combines the uncertainty strategy and imbalance strategy is proposed to define whether to obtain the real label of an instance. An experimental evaluation compares the classification performance of the proposed method with semi-supervised and supervised algorithms on both real-world and synthetic data streams. The results show that the ROALE-DI achieves higher Area Under the ROC Curve (AUC) and accuracy values with even fewer real labels, and the labeling cost dynamically adjusts according to the concept drift and class imbalance ratio. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
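Two mechanisms from the ROALE-DI abstract above lend themselves to a short illustration: the reinforcement-style reweighting of ensemble members based on minority-class performance, and the uncertainty part of the hybrid labeling strategy. The sketch below is a hedged toy version, not the paper's exact update rules:

```python
import numpy as np

def update_weights(weights, correct_on_minority, reward=0.1, penalty=0.1):
    """Illustrative reinforcement-style update: boost classifiers that got the
    latest minority-class instance right, damp the ones that got it wrong.
    (Hypothetical rule, not the exact ROALE-DI formula.)"""
    w = np.asarray(weights, dtype=float)
    hit = np.asarray(correct_on_minority, dtype=bool)
    w[hit] *= (1.0 + reward)
    w[~hit] *= (1.0 - penalty)
    return w / w.sum()

def should_query_label(class_probs, uncertainty_threshold=0.2):
    """Uncertainty half of a hybrid labeling strategy: request the true label
    when the margin between the top two class probabilities is small."""
    p = np.sort(class_probs)[::-1]
    return (p[0] - p[1]) < uncertainty_threshold

weights = update_weights([0.25, 0.25, 0.25, 0.25], [True, False, True, False])
print(weights, should_query_label([0.55, 0.45]))
```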
156. Customer Order Behavior Classification Via Convolutional Neural Networks in the Semiconductor Industry.
- Author
- Ratusny, Marco, Schiffer, Maximilian, and Ehm, Hans
- Subjects
- CONVOLUTIONAL neural networks, SEMICONDUCTOR industry, SUPPLY chain management, CLASSIFICATION, DATA mining
- Abstract
In the operational processes of demand planning and order management, it is crucial to understand customer order behavior to provide insights for supply chain management processes. Here, advances in the semiconductor industry have emerged through the extraction of important information from vast amounts of data. This new data and information availability paves the way for the development of improved methods to analyze and classify customer order behavior (COB). To this end, we develop a novel, sophisticated yet intuitive image-based representation for COBs using two-dimensional heat maps. This heat map representation contributes significantly to the development of a novel COB classification framework. In this framework, we utilize data enrichment via synthetical training samples to train a CNN model that performs the classification task. Integrating synthetically generated data into the training phase allows us to strengthen the inclusion of rare pattern variants that we identified during initial analysis. Moreover, we show how this framework is used in practice at Infineon. We finally use actual customer data to benchmark the performance of our framework and show that the baseline CNN approach outperforms all available state-of-the-art benchmark models. Additionally, our results highlight the benefit of synthetic data enrichment. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
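The key idea in the abstract above is encoding customer order behavior as a two-dimensional heat map so a CNN can classify it. A minimal sketch of such an encoding, assuming a hypothetical layout (rows as quantity bins, columns as weeks before delivery); the paper's actual axes and the CNN itself are not reproduced here:

```python
import numpy as np

def order_heatmap(orders, n_weeks=26, n_qty_bins=10, max_qty=1000):
    """Hypothetical image-style encoding of customer order behaviour:
    rows = quantity bins, columns = weeks before delivery, and each cell
    counts how often an order event of that size occurred in that week.
    Only illustrates the idea of feeding a 2-D histogram to a CNN."""
    heat = np.zeros((n_qty_bins, n_weeks), dtype=np.float32)
    for week, qty in orders:
        w = min(max(int(week), 0), n_weeks - 1)
        q = min(int(qty / max_qty * n_qty_bins), n_qty_bins - 1)
        heat[q, w] += 1.0
    if heat.max() > 0:
        heat /= heat.max()   # normalise to [0, 1] like pixel intensities
    return heat

print(order_heatmap([(2, 150), (2, 180), (10, 900)]).shape)  # (10, 26)
```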
157. Adaptive estimation of external fields in reproducing kernel Hilbert spaces.
- Author
- Guo, Jia, Kepler, Michael E., Paruchuri, Sai Tej, Wang, Haoran, Kurdila, Andrew J., and Stilwell, Daniel J.
- Subjects
- DISTRIBUTED parameter systems, EVOLUTION equations, SENSOR networks, HILBERT space, KERNEL (Mathematics)
- Abstract
Summary: This article studies the distributed parameter system that governs adaptive estimation by mobile sensor networks of external fields in a reproducing kernel Hilbert space (RKHS). The article begins with the derivation of conditions that guarantee the well‐posedness of the ideal, infinite dimensional governing equations of evolution for the centralized estimation scheme. Subsequently, convergence of finite dimensional approximations is studied. Rates of convergence in all formulations are established using history‐dependent bases defined from translates of the RKHS kernel that are centered at sample points along the agent trajectories. Sufficient conditions are derived that ensure that the finite dimensional approximations of the ideal estimator equations converge at a rate that is bounded by the fill distance of samples in the agents' assigned subdomains. The article concludes with examples of simulations and experiments that illustrate the qualitative performance of the introduced algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
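As a rough intuition for the finite-dimensional approximations discussed above, the sketch below fits a scalar field with an RBF kernel expansion whose centers sit at samples along a trajectory. This is a batch kernel ridge regression stand-in, not the paper's adaptive estimator or its convergence analysis:

```python
import numpy as np

def rbf_kernel(X, Y, sigma=0.5):
    # Gaussian kernel matrix between two sets of 2-D sample locations
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_field(samples_x, samples_y, reg=1e-3, sigma=0.5):
    """Finite-dimensional RKHS approximation with kernel centres placed at the
    sample locations along an agent trajectory (a batch least-squares stand-in
    for the adaptive update laws analysed in the paper)."""
    K = rbf_kernel(samples_x, samples_x, sigma)
    alpha = np.linalg.solve(K + reg * np.eye(len(samples_x)), samples_y)
    return lambda X: rbf_kernel(X, samples_x, sigma) @ alpha

# Field f(x, y) = sin(x) + cos(y) sampled along a short trajectory
traj = np.random.rand(50, 2) * 2 * np.pi
vals = np.sin(traj[:, 0]) + np.cos(traj[:, 1])
f_hat = fit_field(traj, vals)
print(f_hat(np.array([[1.0, 2.0]])))
```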
158. Broad Learning System Approximation-Based Adaptive Optimal Control for Unknown Discrete-Time Nonlinear Systems.
- Author
- Yuan, Liang'En, Li, Tieshan, Tong, Shaocheng, Xiao, Yang, and Shan, Qihe
- Subjects
- NONLINEAR systems, DISCRETE-time systems, INSTRUCTIONAL systems, DYNAMIC programming, SYSTEM dynamics, ADAPTIVE control systems, MULTICOLLINEARITY
- Abstract
This article investigates optimal control problem for a class of discrete-time (DT) nonlinear systems with unknown dynamics. With the help of a broad learning system (BLS), a novel online adaptive dynamic programming (ADP) controller is presented. First, to approximate the unknown system dynamics, an approximator based on BLS is presented. The connection weights are calculated by the data of the system by using the ridge regression algorithm. Then, two BLSs are adopted to approximate the optimal cost function and optimal control law, respectively. The connection weights of these two BLSs are updated using the given weights tuning law at each sampling instant. The proposed optimal controller is proved to ensure that all the system states and estimation errors are uniform ultimate bounded. Finally, simulation examples are carried out to further demonstrate the effectiveness of the proposed BLS-based approximator and optimal controller. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
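The abstract above uses a broad learning system with ridge-regression weights to approximate unknown system dynamics. A minimal broad-learning-style approximator is sketched below; the layer sizes and the tanh enhancement nodes are illustrative choices, and the ADP controller built on top of it is omitted:

```python
import numpy as np

def bls_approximator(X, Y, n_feature=40, n_enhance=60, lam=1e-2, seed=0):
    """Minimal broad-learning-style function approximator: random linear
    feature nodes, nonlinear enhancement nodes, and output weights obtained
    in closed form by ridge regression. Architecture sizes are arbitrary."""
    rng = np.random.default_rng(seed)
    Wf = rng.normal(size=(X.shape[1], n_feature))
    Z = X @ Wf                                   # feature nodes
    We = rng.normal(size=(n_feature, n_enhance))
    H = np.tanh(Z @ We)                          # enhancement nodes
    A = np.hstack([Z, H])
    W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)  # ridge
    def predict(Xnew):
        Zn = Xnew @ Wf
        return np.hstack([Zn, np.tanh(Zn @ We)]) @ W
    return predict

X = np.random.rand(200, 2)
Y = np.sin(X[:, :1]) + X[:, 1:] ** 2             # unknown "dynamics" to fit
model = bls_approximator(X, Y)
print(np.abs(model(X) - Y).mean())               # small fitting error
```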
159. An Improved Inter-Intra Contrastive Learning Framework on Self-Supervised Video Representation.
- Author
- Tao, Li, Wang, Xueting, and Yamasaki, Toshihiko
- Subjects
- OPTICAL flow, VIDEO excerpts, DATA augmentation, VIDEOS, PROJECTORS
- Abstract
In this paper, we propose a self-supervised contrastive learning method to learn video feature representations. In traditional self-supervised contrastive learning methods, constraints from anchor, positive, and negative data pairs are used to train the model. In such a case, different samplings of the same video are treated as positives, and video clips from different videos are treated as negatives. Because the spatio-temporal information is important for video representation, we set the temporal constraints more strictly by introducing intra-negative samples. In addition to samples from different videos, negative samples are extended by breaking temporal relations in video clips from the same anchor video. With the proposed Inter-Intra Contrastive (IIC) framework, we can train spatio-temporal convolutional networks to learn feature representations from videos. Strong data augmentations, residual clips, as well as head projector are utilized to construct an improved version. Three kinds of intra-negative generation functions are proposed and extensive experiments using different network backbones are conducted on benchmark datasets. Without using pre-computed optical flow data, our improved version can outperform previous IIC by a large margin, such as 19.4% (from 36.8% to 56.2%) and 5.2% (from 15.5% to 20.7%) points improvements in top-1 accuracy on UCF101 and HMDB51 datasets for video retrieval, respectively. For video recognition, over 3% points improvements can also be obtained on these two benchmark datasets. Discussions and visualizations validate that our IICv2 can capture better temporal clues and indicate the potential mechanism. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
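The intra-negative samples described above are clips whose temporal structure has been deliberately broken. A small sketch of two plausible generation functions (frame shuffling and frame freezing) is given below; the paper proposes three such functions and a full contrastive training loop, which are not shown:

```python
import numpy as np

def intra_negative(clip, rng=None, mode="shuffle"):
    """Generate an intra-negative sample by breaking the temporal order of a
    clip (clip shape: [T, H, W, C]). 'shuffle' permutes frames; 'repeat'
    freezes a single frame. These are illustrative temporal perturbations,
    not necessarily the exact functions used in the paper."""
    rng = rng or np.random.default_rng()
    if mode == "shuffle":
        return clip[rng.permutation(clip.shape[0])]
    if mode == "repeat":
        idx = rng.integers(clip.shape[0])
        return np.repeat(clip[idx:idx + 1], clip.shape[0], axis=0)
    raise ValueError(mode)

clip = np.random.rand(16, 112, 112, 3)   # dummy video clip
neg = intra_negative(clip)
print(neg.shape)                         # same shape, broken temporal order
```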
160. Multiview Subspace Clustering With Grouping Effect.
- Author
- Chen, Man-Sheng, Huang, Ling, Wang, Chang-Dong, Huang, Dong, and Yu, Philip S.
- Abstract
Multiview subspace clustering (MVSC) is a recently emerging technique that aims to discover the underlying subspace in multiview data and thereby cluster the data based on the learned subspace. Though quite a few MVSC methods have been proposed in recent years, most of them cannot explicitly preserve the locality in the learned subspaces and also neglect the subspacewise grouping effect, which restricts their ability of multiview subspace learning. To address this, in this article, we propose a novel MVSC with grouping effect (MvSCGE) approach. Particularly, our approach simultaneously learns the multiple subspace representations for multiple views with smooth regularization, and then exploits the subspacewise grouping effect in these learned subspaces by means of a unified optimization framework. Meanwhile, the proposed approach is able to ensure the cross-view consistency and learn a consistent cluster indicator matrix for the final clustering results. Extensive experiments on several benchmark datasets have been conducted to validate the superiority of the proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
161. GCN-Based Pavement Crack Detection Using Mobile LiDAR Point Clouds.
- Author
- Feng, Huifang, Li, Wen, Luo, Zhipeng, Chen, Yiping, Fatholahi, Sarah Narges, Cheng, Ming, Wang, Cheng, Marcato Junior, Jose, and Li, Jonathan
- Abstract
Mobile Laser Scanning (MLS) system can provide high-density and accurate 3D point clouds that enable rapid pavement crack detection for road maintenance tasks. Supervised learning-based algorithms have been proved pretty effective for handling such a large amount of inhomogeneous and unstructured point clouds. However, these algorithms often rely on a lot of annotated data, which is labor-intensive and time-consuming. This paper presents a semi-supervised point-level approach to overcome this challenge. We propose a graph-widen module to construct a reasonable graph structure for point clouds, increasing the detection performance of graph convolutional networks (GCN). The constructed graph characterizes the local features from a small amount of annotated data, avoiding information loss and dramatically reduces the dependence on annotated data. The MLS point clouds acquired by a commercial RIEGL VMX-450 system are used in this study. The experimental results demonstrate that our method outperforms the state-of-the-art point-level methods in terms of recall, F1 score, and efficiency while achieving comparable accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
162. CTD: Cascaded Temporal Difference Learning for the Mean-Standard Deviation Shortest Path Problem.
- Author
- Guo, Hongliang, Hou, Xuejie, and Peng, Qihang
- Abstract
This paper investigates the reliable shortest path (RSP) planning problem from the reinforcement learning perspective. Different from canonical path planning methods, which require at least the first-order statistic (mean) and second-order statistic (variance) information of travel time distribution, we target at the RSP planning problem without the assumption of knowing any travel time distribution characteristic beforehand, and propose a cascaded temporal difference learning (CTD) method, which simultaneously estimates the mean and variance of the executing path and thereby gradually makes improvements through the generalized policy iteration (GPI) scheme, as the ego vehicle interacts with the environment. Extensive simulation results demonstrate the applicability of the proposed method for RSP learning in various transportation networks. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
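The cascaded estimation of mean and variance described above can be illustrated with a tabular temporal-difference update that tracks both the first and second moments of the remaining travel time. The snippet below is a simplified stand-in for the paper's CTD method (no function approximation, no policy improvement):

```python
import numpy as np

def ctd_update(m, M, s, s_next, cost, alpha=0.1, terminal=False):
    """One cascaded TD step: m[s] tracks the expected remaining travel time,
    M[s] its second moment, so std = sqrt(M - m**2). A tabular illustration
    of jointly estimating mean and variance from sampled edge costs."""
    m_next = 0.0 if terminal else m[s_next]
    M_next = 0.0 if terminal else M[s_next]
    target_m = cost + m_next
    target_M = cost ** 2 + 2 * cost * m_next + M_next
    m[s] += alpha * (target_m - m[s])
    M[s] += alpha * (target_M - M[s])
    return m, M

m, M = np.zeros(3), np.zeros(3)
for _ in range(2000):                      # path 0 -> 1 -> 2 with noisy edge costs
    m, M = ctd_update(m, M, 0, 1, np.random.normal(5, 1))
    m, M = ctd_update(m, M, 1, 2, np.random.normal(3, 2), terminal=True)
var0 = M[0] - m[0] ** 2
print(m[0], np.sqrt(max(var0, 0.0)))       # roughly 8 and sqrt(5)
```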
163. Broad Learning Based Dynamic Fuzzy Inference System With Adaptive Structure and Interpretable Fuzzy Rules.
- Author
- Bai, Kaiyuan, Zhu, Xiaomin, Wen, Shiping, Zhang, Runtong, and Zhang, Wenyu
- Subjects
- FUZZY logic, SMART structures, FUZZY systems, PARSIMONIOUS models, ARTIFICIAL neural networks, MACHINE learning
- Abstract
This article investigates the feasibility of applying the broad learning system (BLS) to realize a novel Takagi–Sugeno–Kang (TSK) neuro-fuzzy model, namely a broad learning based dynamic fuzzy inference system (BL-DFIS). It not only improves the accuracy and interpretability of neuro-fuzzy models but also solves the challenging problem that models are incapable of determining the optimal architecture autonomously. BL-DFIS first accomplishes a TSK fuzzy system under the framework of BLS, in which an extreme learning machine auto-encoder is employed to obtain feature representation in a fast and analytical way, and an interpretable linguistic fuzzy rule is integrated into the enhancement node to ensure the high interpretability of the system. Meanwhile, the extended-enhancement unit is designed to achieve the first-order TSK fuzzy system. In addition, a dynamic incremental learning algorithm with internal pruning and updating mechanism is developed for the learning of BL-DFIS, which enables the system to automatically assemble the optimal structure to obtain a compact rule base and an excellent classification performance. Experiments on benchmark datasets demonstrate that the proposed BL-DFIS can achieve a better classification performance than some state-of-the-art nonfuzzy and neuro-fuzzy methods, simultaneously using the most parsimonious model structure. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
164. Incomplete Multiple View Fuzzy Inference System With Missing View Imputation and Cooperative Learning.
- Author
- Zhang, Wei, Deng, Zhaohong, Zhang, Te, Choi, Kup-Sze, Wang, Jun, and Wang, Shitong
- Subjects
- FUZZY logic, MULTIPLE imputation (Statistics), FUZZY systems, GROUP work in education, MISSING data (Statistics)
- Abstract
Advancement of technology has made available data of different modalities that can be integrated effectively through multiple view learning for modeling real-world problems. Although multiple view learning has achieved great success in many applications, it still faces several challenges. One of them is how to reduce the negative impact of the missing views in incomplete multiple view datasets by fully exploiting the information available. Another challenge is how to enhance the interpretability of the multiple view model for scenarios with high transparency requirement. To address these challenges, this article proposes a novel modeling method for incomplete multiple view fuzzy system. Based on fuzzy interpretable rules, the method integrates missing view imputation and hidden view learning as one single process to yield a model of high interpretability, where cooperative learning is used to mine the complementary information between the visible views and the hidden view. The proposed method has four advantages when compared with existing approaches: 1) the method is more interpretable, attributed to the fuzzy interpretable rules that it is based on, 2) missing view imputation is integrated into the modeling to make it more efficient than the existing two-step strategy, 3) the method not only imputes missing views, but also mines the hidden view shared by the multiple visible views, and 4) cooperative learning is used to mine the complementary information, which significantly reduces the negative impact of missing views. Experiments on real datasets demonstrate the advantages of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
165. Multi-View Consensus Proximity Learning for Clustering.
- Author
- Liu, Bao-Yu, Huang, Ling, Wang, Chang-Dong, Lai, Jian-Huang, and Yu, Philip S.
- Subjects
- INITIAL value problems, LAPLACIAN matrices
- Abstract
Most proximity-based multi-view clustering methods are sensitive to the initial proximity matrix, where the clustering performance is quite unstable when using different initial proximity matrixes. This problem is defined as the initial value sensitivity problem. Since clustering is an unsupervised learning task, it is unrealistic to tune the initial proximity matrix. Thus, how to overcome the initial value sensitivity problem is a significant but unsolved issue in the proximity-based multi-view clustering. To this end, this paper proposes a novel multi-view proximity learning method, named multi-view consensus proximity learning (MCPL). On the one hand, by integrating the information of all views in a self-weighted manner and giving a rank constraint on the Laplacian matrix, the MCPL method learns the consensus proximity matrix to directly reflect the clustering result. On the other hand, different from most multi-view proximity learning methods, in the proposed MCPL method, the data representatives rather than the original data objects are adopted to learn the consensus proximity matrix. The data representatives will be updated in the process of the proximity learning so as to weaken the impact of the initial value on the clustering performance. Extensive experiments are conducted to demonstrate the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
166. Learning Inter- and Intra-Manifolds for Matrix Factorization-Based Multi-Aspect Data Clustering.
- Author
- Luong, Khanh and Nayak, Richi
- Subjects
- NONNEGATIVE matrices, MATRIX decomposition, SPARSE matrices, SYMMETRIC matrices, HYPERTEXT systems
- Abstract
Clustering on the data with multiple aspects, such as multi-view or multi-type relational data, has become popular in recent years due to their wide applicability. The approach using manifold learning with the Non-negative Matrix Factorization (NMF) framework, that learns the accurate low-rank representation of the multi-dimensional data, has shown effectiveness. We propose to include the inter-manifold in the NMF framework, utilizing the distance information of data points of different data types (or views) to learn the diverse manifold for data clustering. Empirical analysis reveals that the proposed method can find partial representations of various interrelated types and select useful features during clustering. Results on several datasets demonstrate that the proposed method outperforms the state-of-the-art multi-aspect data clustering methods in both accuracy and efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
167. Spectral–Spatial Classification of Few Shot Hyperspectral Image With Deep 3-D Convolutional Random Fourier Features Network.
- Author
- Wang, Tingting, Liu, Huanyu, and Li, Junbao
- Subjects
- DEEP learning, THREE-dimensional imaging, REMOTE sensing, FEATURE extraction, LAND cover, CLASSIFICATION algorithms, SUPPORT vector machines
- Abstract
Remote sensing hyperspectral images are very useful for land cover classification because of their rich spatial and spectral information. However, hyperspectral image acquisition and pixel labeling are laborious and time-consuming, so few-shot learning methods are considered to solve this problem. Deep learning has gradually been used for few-shot hyperspectral classification, but there are some problems. The feature extraction network based on deep learning requires too many parameters to be trained, resulting in a huge network model, which is not conducive to deployment on remote sensing data acquisition equipment. Moreover, due to the lack of label samples, the algorithm based on deep learning is more prone to overfitting. To solve the above problems, considering the advanced characteristics of the kernel method in dealing with nonlinear, small sample, and high-dimensional data, we propose a small scale high precision network called 3-D convolution random Fourier features (3-DCRFF) based on the random Fourier feature (RFF) kernel approximation, which is the 3-DCRFF network. First, we combine 3-D convolution with RFF as the basic structure of the network to extract the spatial and spectral features of HSI cubes. Second, we use a classifier based on attention mechanism to classify feature vectors to obtain recognition probability. Finally, the network parameters are solved from the perspective of Bayesian optimization, and the synthetic gradient optimization method is designed and implemented to realize the fast learning of the network. A large number of HSI classification experiments were performed on University of Pavia (UP), Pavia Center (PC), Indian Pines (IP), and Salinas standard remote sensing datasets; the results show that our algorithm outperforms most state-of-the-art algorithms on few-shot classification. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
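The random Fourier feature (RFF) kernel approximation at the core of 3-DCRFF is a standard construction, sketched below for an RBF kernel; embedding it inside 3-D convolution blocks, the attention classifier, and the Bayesian/synthetic-gradient training are beyond this snippet:

```python
import numpy as np

def random_fourier_features(X, n_features=200, gamma=1.0, seed=0):
    """Classic random Fourier feature map z(x) whose inner products
    approximate the RBF kernel exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.random.rand(5, 8)
Z = random_fourier_features(X, n_features=5000)
approx = Z @ Z.T                                             # ~ kernel matrix
exact = np.exp(-1.0 * ((X[:, None] - X[None]) ** 2).sum(-1))
print(np.abs(approx - exact).max())                          # small approximation error
```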
168. Canonical Correlation Analysis With Low-Rank Learning for Image Representation.
- Author
- Lu, Yuwu, Wang, Wenjing, Zeng, Biqing, Lai, Zhihui, Shen, Linlin, and Li, Xuelong
- Subjects
- IMAGE representation, STATISTICAL correlation, PATTERN recognition systems, MATRICES (Mathematics), EUCLIDEAN metric, EUCLIDEAN distance
- Abstract
As a multivariate data analysis tool, canonical correlation analysis (CCA) has been widely used in computer vision and pattern recognition. However, CCA uses Euclidean distance as a metric, which is sensitive to noise or outliers in the data. Furthermore, CCA demands that the two training sets must have the same number of training samples, which limits the performance of CCA-based methods. To overcome these limitations of CCA, two novel canonical correlation learning methods based on low-rank learning are proposed in this paper for image representation, named robust canonical correlation analysis (robust-CCA) and low-rank representation canonical correlation analysis (LRR-CCA). By introducing two regular matrices, the training sample numbers of the two training datasets can be set as any values without any limitation in the two proposed methods. Specifically, robust-CCA uses low-rank learning to remove the noise in the data and extracts the maximization correlation features from the two learned clean data matrices. The nuclear norm and $L_{1}$-norm are used as constraints for the learned clean matrices and noise matrices, respectively. LRR-CCA introduces low-rank representation into CCA to ensure that the correlative features can be obtained in low-rank representation. To verify the performance of the proposed methods, five public image databases are used to conduct extensive experiments. The experimental results demonstrate that the proposed methods outperform state-of-the-art CCA-based and low-rank learning methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
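For context on the abstract above, the snippet below implements plain linear CCA (whitening plus SVD), the classical baseline that robust-CCA and LRR-CCA extend; the low-rank and $L_{1}$-norm machinery of the proposed methods is not shown:

```python
import numpy as np

def cca(X, Y, n_components=2, reg=1e-6):
    """Plain linear CCA via whitening + SVD of the cross-covariance."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    M = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(M)
    Wx = inv_sqrt(Cxx) @ U[:, :n_components]
    Wy = inv_sqrt(Cyy) @ Vt[:n_components].T
    return Wx, Wy, s[:n_components]   # projections and canonical correlations

X = np.random.rand(100, 6)
Y = X @ np.random.rand(6, 4) + 0.1 * np.random.rand(100, 4)
print(cca(X, Y)[2])                   # correlations close to 1 for correlated views
```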
169. Depth Map Recovery Based on a Unified Depth Boundary Distortion Model.
- Author
- Wang, Haotian, Yang, Meng, Lan, Xuguang, Zhu, Ce, and Zheng, Nanning
- Subjects
- GEOGRAPHIC boundaries, TASK analysis, IMAGE analysis
- Abstract
Depth maps acquired by either physical sensors or learning methods are often seriously distorted due to boundary distortion problems, including missing, fake, and misaligned boundaries (compared with RGB images). An RGB-guided depth map recovery method is proposed in this paper to recover true boundaries in seriously distorted depth maps. Therefore, a unified model is first developed to observe all these kinds of distorted boundaries in depth maps. Observing distorted boundaries is equivalent to identifying erroneous regions in distorted depth maps, because depth boundaries are essentially formed by contiguous regions with different intensities. Then, erroneous regions are identified by separately extracting local structures of RGB image and depth map with Gaussian kernels and comparing their similarity on the basis of the SSIM index. A depth map recovery method is then proposed on the basis of the unified model. This method recovers true depth boundaries by iteratively identifying and correcting erroneous regions in recovered depth map based on the unified model and a weighted median filter. Because RGB image generally includes additional textural contents compared with depth maps, texture-copy artifacts problem is further addressed in the proposed method by restricting the model works around depth boundaries in each iteration. Extensive experiments are conducted on five RGB–depth datasets including depth map recovery, depth super-resolution, depth estimation enhancement, and depth completion enhancement. The results demonstrate that the proposed method considerably improves both the quantitative and visual qualities of recovered depth maps in comparison with fifteen competitive methods. Most object boundaries in recovered depth maps are corrected accurately, and kept sharply and well aligned with the ones in RGB images. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
170. Pseudo Decoder Guided Light-Weight Architecture for Image Inpainting.
- Author
- Phutke, Shruti S. and Murala, Subrahmanyam
- Subjects
- INPAINTING, HIGH resolution imaging, SOURCE code, VIDEO coding
- Abstract
Image inpainting is one of the most important and widely used approaches where input image is synthesized at the missing regions. This has various applications like undesired object removal, virtual garment shopping, etc. The methods used for image inpainting may use the knowledge of hole locations to effectively regenerate contents in an image. Existing image inpainting methods give astonishing results with coarse-to-fine architectures or with use of guided information like edges, structures, etc. The coarse-to-fine architectures require umpteen resources leading to high computation cost of the architecture. Other methods with edge or structural information depend on the available models to generate guiding information for inpainting. In this context, we have proposed a computationally efficient, light-weight network for image inpainting with a very small number of parameters (0.97M) and without any guided information. The proposed architecture consists of the multi-encoder level feature fusion module, pseudo decoder and regeneration decoder. The encoder multi level feature fusion module extracts relevant information from each of the encoder levels to merge structural and textural information from various receptive fields. This information is then processed with pseudo decoder followed by space depth correlation module to assist regeneration decoder for inpainting task. The experiments are performed with different types of masks and compared with the state-of-the-art methods on three benchmark datasets i.e., Paris Street View (PARIS_SV), Places2 and CelebA_HQ. Along with this, the proposed network is tested on high resolution images ($1024\times1024$ and $2048\times2048$) and compared with the existing methods. The extensive comparison with state-of-the-art methods, computational complexity analysis, and ablation study prove the effectiveness of the proposed framework for image inpainting. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
171. Discrete Metric Learning for Fast Image Set Classification.
- Author
- Wei, Dong, Shen, Xiaobo, Sun, Quansen, and Gao, Xizhan
- Subjects
- IMAGE recognition (Computer vision), OBJECT recognition (Computer vision), RECOGNITION (Psychology), HAMMING distance, METRIC spaces, RIEMANNIAN manifolds
- Abstract
In the field of image set classification, most existing works focus on exploiting effective latent discriminative features. However, it remains a research gap to efficiently handle this problem. In this paper, benefiting from the superiority of hashing in terms of its computational complexity and memory costs, we present a novel Discrete Metric Learning (DML) approach based on the Riemannian manifold for fast image set classification. The proposed DML jointly learns a metric in the induced space and a compact Hamming space, where efficient classification is carried out. Specifically, each image set is modeled as a point on Riemannian manifold after which the proposed DML minimizes the Hamming distance between similar Riemannian pairs and maximizes the Hamming distance between dissimilar ones by introducing a discriminative Mahalanobis-like matrix. To overcome the shortcoming of DML that relies on the vectorization of Riemannian representations, we further develop Bilinear Discrete Metric Learning (BDML) to directly manipulate the original Riemannian representations and explore the natural matrix structure for high-dimensional data. Different from conventional Riemannian metric learning methods, which require complicated Riemannian optimizations (e.g., Riemannian conjugate gradient), both DML and BDML can be efficiently optimized by computing the geodesic mean between the similarity matrix and inverse of the dissimilarity matrix. Extensive experiments conducted on different visual recognition tasks (face recognition, object recognition, and action recognition) demonstrate that the proposed methods achieve competitive performance in terms of accuracy and efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
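The speed argument in the abstract above rests on comparing compact binary codes with Hamming distance. The sketch below shows that final step with a random projection standing in for the learned Mahalanobis-like metric; the Riemannian modeling of image sets is omitted:

```python
import numpy as np

def to_binary_code(features, projection):
    """Map real-valued set representations to compact binary codes by the sign
    of a linear projection (here a random stand-in for the learned metric)."""
    return (features @ projection > 0).astype(np.uint8)

def hamming_distance(a, b):
    """Number of differing bits; cheap to evaluate, which is the point of
    classifying in a Hamming space."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
P = rng.normal(size=(128, 32))                    # 32-bit codes
x, y = rng.normal(size=(2, 128))
print(hamming_distance(to_binary_code(x, P), to_binary_code(y, P)))
```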
172. HQ2CL: A High-Quality Class Center Learning System for Deep Face Recognition.
- Author
- Lv, Xianwei, Yu, Chen, Jin, Hai, and Liu, Kai
- Subjects
- DEEP learning, FACE perception, INSTRUCTIONAL systems
- Abstract
Benefiting from the proposal of margin-based loss functions, face recognition has achieved significant improvements in recent years. Those losses aim to increase the margin between the different identities to enhance the discriminability. Ideally, the class center of different identities is far from each other, and face samples are compact around the corresponding class center. Hence, it’s very vital to produce a high-quality class center. However, the distribution of training sets determines the class center. With low-quality samples being in the majority, the class center would be close to the samples with little identity information. As a result, it would impair the discriminability of the learned model for those unseen samples. In this work, we propose a High-Quality Class Center Learning system (HQ2CL). This is an effective system and guides the class center to approach the high-quality samples to keep the discriminability. Specifically, HQ2CL introduces a quality-aware scale and margin layer for the identification loss and constructs a new high-quality center loss. We implement the proposed system without additional burden. And we present the experimental evaluation over different face benchmarks. The experimental results show the superiority of our proposed HQ2CL over the state-of-the-arts. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
173. Low-Light Enhancement Using a Plug-and-Play Retinex Model With Shrinkage Mapping for Illumination Estimation.
- Author
- Lin, Yi-Hsien and Lu, Yi-Chang
- Subjects
- LIGHTING, PROBLEM solving, REFLECTANCE, LINEAR programming, IMAGE intensifiers
- Abstract
Low-light photography conditions degrade image quality. This study proposes a novel Retinex-based low-light enhancement method to correctly decompose an input image into reflectance and illumination. Subsequently, we can improve the viewing experience by adjusting the illumination using intensity and contrast enhancement. Because image decomposition is a highly ill-posed problem, constraints must be properly imposed on the optimization framework. To meet the criteria of ideal Retinex decomposition, we design a nonconvex $L_{p}$ norm and apply shrinkage mapping to the illumination layer. In addition, edge-preserving filters are introduced using the plug-and-play technique to improve illumination. Pixel-wise weights based on variance and image gradients are adopted to suppress noise and preserve details in the reflectance layer. We choose the alternating direction method of multipliers (ADMM) to solve the problem efficiently. Experimental results on several challenging low-light datasets show that our proposed method can more effectively enhance image brightness as compared with state-of-the-art methods. In addition to subjective observations, the proposed method also achieved competitive performance in objective image quality assessments. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
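The shrinkage mapping mentioned above is a proximal-style thresholding applied to illumination gradients inside an ADMM loop. The snippet below shows only the familiar L1 soft-thresholding special case as an illustration; the paper's nonconvex $L_{p}$ variant and the plug-and-play filters are not reproduced:

```python
import numpy as np

def soft_shrinkage(x, tau):
    """Soft-thresholding operator, the proximal map of the L1 norm:
    shrink(x) = sign(x) * max(|x| - tau, 0). Small values are zeroed,
    large values are pulled toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

grad = np.array([-0.8, -0.05, 0.0, 0.02, 0.6])
print(soft_shrinkage(grad, tau=0.1))   # small gradients vanish, big ones survive
```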
174. Learning Calibrated Class Centers for Few-Shot Classification by Pair-Wise Similarity.
- Author
- Guo, Yurong, Du, Ruoyi, Li, Xiaoxu, Xie, Jiyang, Ma, Zhanyu, and Dong, Yuan
- Subjects
- IMAGE recognition (Computer vision), APPROXIMATION error, CLUSTER sampling, FEATURE extraction, CLASSIFICATION, NAIVE Bayes classification
- Abstract
Metric-based methods achieve promising performance on few-shot classification by learning clusters on support samples and generating shared decision boundaries for query samples. However, existing methods ignore the inaccurate class center approximation introduced by the limited number of support samples, which consequently leads to biased inference. Therefore, in this paper, we propose to reduce the approximation error by class center calibration. Specifically, we introduce the so-called Pair-wise Similarity Module (PSM) to generate calibrated class centers adapted to the query sample by capturing the semantic correlations between the support and the query samples, as well as enhancing the discriminative regions on support representation. It is worth noting that the proposed PSM is a simple plug-and-play module and can be inserted into most metric-based few-shot learning models. Through extensive experiments in metric-based models, we demonstrate that the module significantly improves the performance of conventional few-shot classification methods on four few-shot image classification benchmark datasets. Codes are available at: https://github.com/PRIS-CV/Pair-wise-Similarity-module. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
175. Underwater Image Enhancement via Minimal Color Loss and Locally Adaptive Contrast Enhancement.
- Author
- Zhang, Weidong, Zhuang, Peixian, Sun, Hai-Han, Li, Guohou, Kwong, Sam, and Li, Chongyi
- Subjects
- IMAGE enhancement (Imaging systems), IMAGE intensifiers, COLOR space, LIGHT absorption, IMAGE color analysis, IMAGE segmentation
- Abstract
Underwater images typically suffer from color deviations and low visibility due to the wavelength-dependent light absorption and scattering. To deal with these degradation issues, we propose an efficient and robust underwater image enhancement method, called MLLE. Specifically, we first locally adjust the color and details of an input image according to a minimum color loss principle and a maximum attenuation map-guided fusion strategy. Afterward, we employ the integral and squared integral maps to compute the mean and variance of local image blocks, which are used to adaptively adjust the contrast of the input image. Meanwhile, a color balance strategy is introduced to balance the color differences between channel a and channel b in the CIELAB color space. Our enhanced results are characterized by vivid color, improved contrast, and enhanced details. Extensive experiments on three underwater image enhancement datasets demonstrate that our method outperforms the state-of-the-art methods. Our method is also appealing in its fast processing speed within 1s for processing an image of size $1024\times 1024 \times 3$ on a single CPU. Experiments further suggest that our method can effectively improve the performance of underwater image segmentation, keypoint detection, and saliency detection. The project page is available at https://li-chongyi.github.io/proj [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
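The locally adaptive contrast step described above relies on local means and variances computed from integral and squared-integral maps. A small sketch of that computation and an illustrative (not MLLE-exact) adaptive stretch follows; the minimum-color-loss fusion and CIELAB color balance are omitted:

```python
import numpy as np

def local_mean_var(img, k=7):
    """Local mean and variance of a grayscale image via integral and
    squared-integral maps (cumulative sums), evaluated on k x k windows."""
    pad = k // 2
    I = np.pad(img.astype(np.float64), pad, mode="edge")
    S = np.pad(np.cumsum(np.cumsum(I, 0), 1), ((1, 0), (1, 0)))
    S2 = np.pad(np.cumsum(np.cumsum(I ** 2, 0), 1), ((1, 0), (1, 0)))
    def box(T):  # window sums from an integral image
        return T[k:, k:] - T[:-k, k:] - T[k:, :-k] + T[:-k, :-k]
    n = k * k
    mean = box(S) / n
    var = np.maximum(box(S2) / n - mean ** 2, 0.0)
    return mean, var

def adaptive_contrast(img, k=7, target_std=0.18, max_gain=3.0):
    """Illustrative locally adaptive stretch: amplify deviations from the local
    mean more strongly in flat (low-variance) regions. Not the exact MLLE rule."""
    mean, var = local_mean_var(img, k)
    gain = np.clip(target_std / (np.sqrt(var) + 1e-6), 1.0, max_gain)
    return np.clip(mean + gain * (img - mean), 0.0, 1.0)

img = np.random.rand(64, 64) * 0.3 + 0.35        # dull, low-contrast image
print(adaptive_contrast(img).std() > img.std())  # True: contrast increased
```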
176. Unsupervised Meta Learning With Multiview Constraints for Hyperspectral Image Small Sample set Classification.
- Author
- Gao, Kuiliang, Liu, Bing, Yu, Xuchu, and Yu, Anzhu
- Subjects
- DEEP learning, SUPERVISED learning, MACHINE learning, CLASSIFICATION
- Abstract
The difficulties of obtaining sufficient labeled samples have always been one of the factors hindering deep learning models from obtaining high accuracy in hyperspectral image (HSI) classification. To reduce the dependence of deep learning models on training samples, meta learning methods have been introduced, effectively improving the classification accuracy in small sample set scenarios. However, the existing methods based on meta learning still need to construct a labeled source data set with several pre-collected HSIs, and must utilize a large number of labeled samples for meta-training, which is actually time-consuming and labor-intensive. To solve this problem, this paper proposes a novel unsupervised meta learning method with multiview constraints for HSI small sample set classification. Specifically, the proposed method first builds an unlabeled source data set using unlabeled HSIs. Then, multiple spatial-spectral multiview features of each unlabeled sample are generated to construct tasks for unsupervised meta learning. Finally, the designed residual relation network is used for meta-training and small sample set classification based on the voting strategy. Compared with existing supervised meta learning methods for HSI classification, our method can only utilize HSIs without any label for unsupervised meta learning, which significantly reduces the number of requisite labeled samples in the whole classification process. To verify the effectiveness of the proposed method, extensive experiments are carried out on 8 public HSIs in the cross-domain and in-domain classification scenarios. The statistical results demonstrate that, compared with existing supervised meta learning methods and other advanced classification models, the proposed method can achieve competitive or better classification performance in small sample set scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
177. Occlusion-Aware Unsupervised Learning of Depth From 4-D Light Fields.
- Author
- Jin, Jing and Hou, Junhui
- Subjects
- KERNEL (Mathematics), GRAPHICS processing units, COHERENCE (Physics), OCCLUSION (Chemistry)
- Abstract
Depth estimation is a fundamental issue in 4-D light field processing and analysis. Although recent supervised learning-based light field depth estimation methods have significantly improved the accuracy and efficiency of traditional optimization-based ones, these methods rely on the training over light field data with ground-truth depth maps which are challenging to obtain or even unavailable for real-world light field data. Besides, due to the inevitable gap (or domain difference) between real-world and synthetic data, they may suffer from serious performance degradation when generalizing the models trained with synthetic data to real-world data. By contrast, we propose an unsupervised learning-based method, which does not require ground-truth depth as supervision during training. Specifically, based on the basic knowledge of the unique geometry structure of light field data, we present an occlusion-aware strategy to improve the accuracy on occlusion areas, in which we explore the angular coherence among subsets of the light field views to estimate initial depth maps, and utilize a constrained unsupervised loss to learn their corresponding reliability for final depth prediction. Additionally, we adopt a multi-scale network with a weighted smoothness loss to handle the textureless areas. Experimental results on synthetic data show that our method can significantly shrink the performance gap between the previous unsupervised method and supervised ones, and produce depth maps with comparable accuracy to traditional methods with obviously reduced computational cost. Moreover, experiments on real-world datasets show that our method can avoid the domain shift problem presented in supervised methods, demonstrating the great potential of our method. The code will be publicly available at https://github.com/jingjin25/LFDE-OccUnNet. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
178. TDPN: Texture and Detail-Preserving Network for Single Image Super-Resolution.
- Author
- Cai, Qing, Li, Jinxing, Li, Huafeng, Yang, Yee-Hong, Wu, Feng, and Zhang, David
- Subjects
- HIGH resolution imaging, CONVOLUTIONAL neural networks, GENERATIVE adversarial networks, SIGNAL-to-noise ratio
- Abstract
Single image super-resolution (SISR) using deep convolutional neural networks (CNNs) achieves the state-of-the-art performance. Most existing SISR models mainly focus on pursuing high peak signal-to-noise ratio (PSNR) and neglect textures and details. As a result, the recovered images are often perceptually unpleasant. To address this issue, in this paper, we propose a texture and detail-preserving network (TDPN), which focuses not only on local region feature recovery but also on preserving textures and details. Specifically, the high-resolution image is recovered from its corresponding low-resolution input in two branches. First, a multi-reception field based branch is designed to let the network fully learn local region features by adaptively selecting local region features in different reception fields. Then, a texture and detail-learning branch supervised by the textures and details decomposed from the ground-truth high resolution image is proposed to provide additional textures and details for the super-resolution process to improve the perceptual quality. Finally, we introduce a gradient loss into the SISR field and define a novel hybrid loss to strengthen boundary information recovery and to avoid overly smooth boundary in the final recovered high-resolution image caused by using only the MAE loss. More importantly, the proposed method is model-agnostic, which can be applied to most off-the-shelf SISR networks. The experimental results on public datasets demonstrate the superiority of our TDPN on most state-of-the-art SISR methods in PSNR, SSIM and perceptual quality. We will share our code on https://github.com/tocaiqing/TDPN. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
179. Diverse Complementary Part Mining for Weakly Supervised Object Localization.
- Author
- Meng, Meng, Zhang, Tianzhu, Yang, Wenfei, Zhao, Jian, Zhang, Yongdong, and Wu, Feng
- Subjects
- PRODUCT management software, SCALABILITY
- Abstract
Weakly Supervised Object Localization (WSOL) aims to localize objects with only image-level labels, which has better scalability and practicability than fully supervised methods in the actual deployment. However, a common limitation for available techniques based on classification networks is that they only highlight the most discriminative part of the object, not the entire object. To alleviate this problem, we propose a novel end-to-end part discovery model (PDM) to learn multiple discriminative object parts in a unified network for accurate object localization and classification. The proposed PDM enjoys several merits. First, to the best of our knowledge, it is the first work to directly model diverse and robust object parts by exploiting part diversity, compactness, and importance jointly for WSOL. Second, three effective mechanisms including diversity, compactness, and importance learning mechanisms are designed to learn robust object parts. Therefore, our model can exploit complementary spatial information and local details from the learned object parts, which help to produce precise bounding boxes and discriminate different object categories. Extensive experiments on two standard benchmarks demonstrate that our PDM performs favorably against state-of-the-art WSOL approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
180. Information Symmetry Matters: A Modal-Alternating Propagation Network for Few-Shot Learning.
- Author
- Ji, Zhong, Hou, Zhishen, Liu, Xiyao, Pang, Yanwei, and Han, Jungong
- Subjects
- SYMMETRY, PETRI nets, INFORMATION asymmetry, DATABASES, SEMANTICS, KNOWLEDGE transfer
- Abstract
Semantic information provides intra-class consistency and inter-class discriminability beyond visual concepts, which has been employed in Few-Shot Learning (FSL) to achieve further gains. However, semantic information is only available for labeled samples but absent for unlabeled samples, in which the embeddings are rectified unilaterally by guiding the few labeled samples with semantics. Therefore, it is inevitable to bring a cross-modal bias between semantic-guided samples and nonsemantic-guided samples, which results in an information asymmetry problem. To address this problem, we propose a Modal-Alternating Propagation Network (MAP-Net) to supplement the absent semantic information of unlabeled samples, which builds information symmetry among all samples in both visual and semantic modalities. Specifically, the MAP-Net transfers the neighbor information by the graph propagation to generate the pseudo-semantics for unlabeled samples guided by the completed visual relationships and rectify the feature embeddings. In addition, due to the large discrepancy between visual and semantic modalities, we design a Relation Guidance (RG) strategy to guide the visual relation vectors via semantics so that the propagated information is more beneficial. Extensive experimental results on three semantic-labeled datasets, i.e., Caltech-UCSD-Birds 200-2011, SUN Attribute Database and Oxford 102 Flower, have demonstrated that our proposed method achieves promising performance and outperforms the state-of-the-art approaches, which indicates the necessity of information symmetry. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
181. A Continual Learning Survey: Defying Forgetting in Classification Tasks.
- Author
- De Lange, Matthias, Aljundi, Rahaf, Masana, Marc, Parisot, Sarah, Jia, Xu, Leonardis, Ales, Slabaugh, Gregory, and Tuytelaars, Tinne
- Subjects
- TASKS, ARTIFICIAL neural networks
- Abstract
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in a catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern: (1) a taxonomy and extensive overview of the state-of-the-art; (2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner; (3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods; and (4) baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet and large-scale unbalanced iNaturalist and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
182. Structure-Aware Long Short-Term Memory Network for 3D Cephalometric Landmark Detection.
- Author
- Chen, Runnan, Ma, Yuexin, Chen, Nenglun, Liu, Lingjie, Cui, Zhiming, Lin, Yanhong, and Wang, Wenping
- Subjects
- CONE beam computed tomography
- Abstract
Detecting 3D landmarks on cone-beam computed tomography (CBCT) is crucial to assessing and quantifying the anatomical abnormalities in 3D cephalometric analysis. However, the current methods are time-consuming and suffer from large biases in landmark localization, leading to unreliable diagnosis results. In this work, we propose a novel Structure-Aware Long Short-Term Memory framework (SA-LSTM) for efficient and accurate 3D landmark detection. To reduce the computational burden, SA-LSTM is designed in two stages. It first locates the coarse landmarks via heatmap regression on a down-sampled CBCT volume and then progressively refines landmarks by attentive offset regression using multi-resolution cropped patches. To boost accuracy, SA-LSTM captures global-local dependence among the cropping patches via self-attention. Specifically, a novel graph attention module implicitly encodes the landmark’s global structure to rationalize the predicted position. Moreover, a novel attention-gated module recursively filters irrelevant local features and maintains high-confident local predictions for aggregating the final result. Experiments conducted on an in-house dataset and a public dataset show that our method outperforms state-of-the-art methods, achieving 1.64 mm and 2.37 mm average errors, respectively. Furthermore, our method is very efficient, taking only 0.5 seconds for inferring the whole CBCT volume of resolution $768\times 768\times 576$. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
183. NPCNet: Jointly Segment Primary Nasopharyngeal Carcinoma Tumors and Metastatic Lymph Nodes in MR Images.
- Author
- Li, Yang, Dan, Tingting, Li, Haojiang, Chen, Jiazhou, Peng, Hong, Liu, Lizhi, and Cai, Hongmin
- Subjects
- NASOPHARYNX cancer, LYMPH nodes, MAGNETIC resonance imaging, TUMORS, METASTASIS
- Abstract
Nasopharyngeal carcinoma (NPC) is a malignant tumor whose survivability is greatly improved if early diagnosis and timely treatment are provided. Accurate segmentation of both the primary NPC tumors and metastatic lymph nodes (MLNs) is crucial for patient staging and radiotherapy scheduling. However, existing studies mainly focus on the segmentation of primary tumors, eliding the recognition of MLNs, and thus fail to comprehensively provide a landscape for tumor identification. There are three main challenges in segmenting primary NPC tumors and MLNs: variable location, variable size, and irregular boundary. To address these challenges, we propose an automatic segmentation network, named by NPCNet, to achieve segmentation of primary NPC tumors and MLNs simultaneously. Specifically, we design three modules, including position enhancement module (PEM), scale enhancement module (SEM), and boundary enhancement module (BEM), to address the above challenges. First, the PEM enhances the feature representations of the most suspicious regions. Subsequently, the SEM captures multiscale context information and target context information. Finally, the BEM rectifies the unreliable predictions in the segmentation mask. To that end, extensive experiments are conducted on our dataset of 9124 samples collected from 754 patients. Empirical results demonstrate that each module realizes its designed functionalities and is complementary to the others. By incorporating the three proposed modules together, our model achieves state-of-the-art performance compared with nine popular models. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
184. Global-Local Interplay in Semantic Alignment for Few-Shot Learning.
- Author
- Hao, Fusheng, He, Fengxiang, Cheng, Jun, and Tao, Dacheng
- Subjects
- INFORMATION design, SEMANTICS
- Abstract
Few-shot learning aims to recognize novel classes from only a few labeled training examples. Aligning semantically relevant local regions has shown promise in effectively comparing a query image with support images. However, global information is usually overlooked in the existing approaches, resulting in a higher possibility of learning semantics unrelated to the global information. To address this issue, we propose a Global-Local Interplay Metric Learning (GLIML) framework to employ the interplay between global features and local features to guide semantic alignment. We first design a Global-Local Information Concurrent Learning (GLICL) module to extract both global features and local features and perform global-local interplay. We then design a Global-Local Information Cross-Covariance Estimator (GLICCE) to learn the similarity on the global-local interplay, in contrast to the current practice where only local features are considered. Visualizations show that the global-local interplay decreases (1) the weights placed on the semantics that are irrelevant to the global information and (2) the variability of the learned features within every class in the feature space. Quantitative experiments on three benchmark datasets demonstrate that GLIML achieves state-of-the-art performance while maintaining high efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
185. Learning Hybrid Semantic Affinity for Point Cloud Segmentation.
- Author
- Song, Zhanjie, Zhao, Linqing, and Zhou, Jie
- Subjects
- POINT cloud, BLENDED learning, VIDEO coding, IMAGE segmentation, NEIGHBORHOODS, TASK analysis
- Abstract
In this paper, we present a hybrid semantic affinity learning method (HSA) to capture and leverage the dependencies of categories for 3D semantic segmentation. Unlike existing methods that only use the cross-entropy loss to perform one-to-one supervision and ignore the semantic relations between points, our approach aims to learn the label dependencies between 3D points from a hybrid perspective. From a global view, we introduce the structural correlations among different classes to provide global priors for point features. Specifically, we fuse word embeddings of labels and scene-level features as category nodes, which are processed via a graph convolutional network (GCN) to produce the sample-adapted global priors. These priors are then combined with point features to enhance the rationality of semantic predictions. From a local view, we propose the concept of local affinity to effectively model the intra-class and inter-class semantic similarities for adjacent neighborhoods, making the predictions more discriminative. Experimental results show that our method consistently improves the performance of state-of-the-art models across indoor (S3DIS, ScanNet), outdoor (SemanticKITTI), and synthetic (ShapeNet) datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
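As a rough illustration of the global branch described in the HSA abstract above, the sketch below runs one symmetrically normalized GCN propagation step over category nodes built from label word embeddings fused with a scene-level feature. The adjacency matrix, embeddings, and weights are dummy data; this is not the HSA implementation.

```python
# Minimal single-layer GCN propagation over category nodes, as a sketch of how
# label embeddings could be fused with a scene feature to form global priors.
import numpy as np

def gcn_layer(A, X, W):
    """Symmetrically normalized propagation: relu(D^-1/2 (A+I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)

rng = np.random.default_rng(0)
num_classes, embed_dim, out_dim = 13, 300, 64
A = (rng.random((num_classes, num_classes)) > 0.7).astype(float)  # toy class co-occurrence graph
A = np.maximum(A, A.T)
word_embed = rng.normal(size=(num_classes, embed_dim))             # label word embeddings
scene_feat = rng.normal(size=(1, embed_dim))                        # scene-level feature
X = word_embed + scene_feat                                         # fuse labels with the scene
priors = gcn_layer(A, X, rng.normal(size=(embed_dim, out_dim)) * 0.01)
print(priors.shape)   # (13, 64) sample-adapted global priors
```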
186. The Secrets of Data Science Deployments.
- Author
-
Fayyad, Usama M. and Fayyad, Usama
- Subjects
PRAGMATICS ,DATA science ,MACHINE learning ,INDUSTRIAL costs ,COVID-19 pandemic ,TRUST - Abstract
Much attention is paid to data science and machine learning as effective means for getting value out of data and for dealing with the large amounts of data we are accumulating at companies and organizations. This has gained importance with the major waves of digitization we have seen, especially with the COVID-19 pandemic accelerating digital everything. However, in reality, most machine learning models, despite achieving good technical solutions to predictive problems, wind up not being deployed. The reasons for this are many and have their origin in data scientists and machine learning practitioners not paying enough attention to issues of deployment in production. The issues range all the way from establishing trust with business stakeholders and users, to failing to explain why models work and when they do not, to failing to appreciate the importance of establishing a robust, quality data pipeline, to ignoring many constraints that apply to deployed models, and finally to a lack of understanding of the true cost of production deployment and the associated ROI. We discuss many of these problems and provide what we believe is a pragmatic approach to getting data science models successfully deployed in working environments. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
187. Information and Communication Technology in Elementary Schools: A Comparison Between Hybrid and Face-to-Face Learning Systems.
- Author
-
Zakaria, Wisnu, Turmudi, and Pentang, Jupeth Toriano
- Subjects
INSTRUCTIONAL systems ,ONLINE education ,INFORMATION & communication technologies ,BLENDED learning ,COVID-19 pandemic ,ELEMENTARY schools ,STUDENTS ,SCHOOL children - Abstract
At the beginning of 2020, the world was experiencing the Covid-19 pandemic, and Indonesia was no exception. This affected the learning system in Indonesia: instruction that was originally face-to-face was forced online, and teachers were required to provide a creative, efficient, and optimal learning system for students. The purpose of this study is therefore to determine the difference in the average learning results of elementary school students during the pandemic. The method used is quantitative, with a posttest-only control group design. The population comprised grade 4 elementary school students in Majalengka district, Indonesia; 64 samples were taken by purposive sampling. The results show that there is a difference in average student learning results: students who study with the hybrid learning system score higher than those in the face-to-face learning system. The hybrid learning system is well suited to the 4.0 era as well as to learning during the Covid-19 pandemic. However, since this system relies on technology, its facilities and infrastructure must be taken into account, and both teachers and students need to understand and be able to use the learning media so that learning outcomes are optimal and obstacles are minimized. The present study reveals an implementation of 21st-century learning. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
188. Multirotors From Takeoff to Real-Time Full Identification Using the Modified Relay Feedback Test and Deep Neural Networks.
- Author
-
Ayyad, Abdulla, Chehadeh, Mohamad, Silva, Pedro Henrique, Wahbah, Mohamad, Hay, Oussama Abdul, Boiko, Igor, and Zweiri, Yahya
- Subjects
ARTIFICIAL neural networks ,DRONE aircraft ,PSYCHOLOGICAL feedback ,VERTICAL jump ,SLIDING mode control ,DEEP learning - Abstract
Low-cost real-time identification of multirotor unmanned aerial vehicle (UAV) dynamics is an active area of research supported by the surge in demand and emerging application domains. Such real-time identification capabilities shorten development time and cost, making UAV technology more accessible, and enable a wide variety of advanced applications. In this article, we present a novel comprehensive approach, called DNN-MRFT, for real-time identification and tuning of multirotor UAVs using the modified relay feedback test (MRFT) and deep neural networks (DNNs). The main contribution is the development of a generalized framework for the application of DNN-MRFT to higher order systems. One of the notable advantages of DNN-MRFT is the exact estimation of the identified process gain, which mitigates the inaccuracies introduced by the use of the describing function method in approximating the response of Lur'e systems. A secondary contribution is a generalized controller based on DNN-MRFT that takes off a UAV with unknown dynamics and identifies the inner-loop dynamics in flight. Using the developed framework, DNN-MRFT is sequentially applied to the outer translational loops of the UAV utilizing in-flight results obtained for the inner attitude loops. DNN-MRFT takes on average 15 s to acquire full knowledge of multirotor UAV dynamics, and without any further tuning or calibration, the UAV is able to pass through a vertical window and accurately follow trajectories, achieving state-of-the-art performance. Such demonstrated accuracy, speed, and robustness of identification push the limits of the state of the art in real-time identification of UAVs. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
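DNN-MRFT builds on the relay feedback test; the sketch below simulates the classical (unmodified) relay test on a toy third-order process and recovers the ultimate gain and period from the sustained oscillation via the describing-function relation Ku ~= 4h/(pi*a). It illustrates only the principle, not the modified test or the DNN-based identification in the paper.

```python
# Classical relay feedback test on a toy process 1/(s+1)^3, estimating the
# ultimate period Tu from zero crossings and the ultimate gain via the
# describing-function formula Ku ~= 4h / (pi * a).
import numpy as np

dt, t_end, h = 1e-3, 60.0, 1.0           # step size, horizon, relay amplitude
x = np.zeros(3)                          # three cascaded first-order lags
ys = []
for _ in range(int(t_end / dt)):
    y = x[2]
    u = h if y < 0 else -h               # ideal relay acting on the error (setpoint = 0)
    x[0] += dt * (u - x[0])
    x[1] += dt * (x[0] - x[1])
    x[2] += dt * (x[1] - x[2])
    ys.append(y)

y = np.array(ys[int(40 / dt):])          # keep the steady oscillation only
a = 0.5 * (y.max() - y.min())            # oscillation amplitude
crossings = np.where(np.diff(np.sign(y)) > 0)[0]
Tu = np.mean(np.diff(crossings)) * dt    # ultimate period from upward zero crossings
Ku = 4 * h / (np.pi * a)                 # describing-function estimate of the ultimate gain
print(f"a={a:.3f}  Tu={Tu:.2f}s  Ku={Ku:.2f}")   # describing-function prediction: Tu~3.6 s, Ku~8
```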
189. Improving Video Temporal Consistency via Broad Learning System.
- Author
-
Sheng, Bin, Li, Ping, Ali, Riaz, and Chen, C. L. Philip
- Abstract
Applying image-based processing methods to original videos on a framewise level breaks the temporal consistency between consecutive frames. Traditional video temporal consistency methods reconstruct an original frame containing flickers from corresponding nonflickering frames, but the inaccurate correspondence obtained by optical flow restricts their practical use. In this article, we propose a temporally broad learning system (TBLS), an approach that enforces temporal consistency between frames. We establish the TBLS as a flat network whose input data (an original frame of the source video, the corresponding frame of the temporally inconsistent video produced by the image-based technique, and the output frame for the previous original frame) are mapped as features in feature nodes. Then, we refine the extracted features by enhancing the mapped features as enhancement nodes with randomly generated weights. We then connect all extracted features to the output layer with a target weight vector. With the target weight vector, we can minimize the temporal information loss between consecutive frames and the video fidelity loss in the output videos. Finally, we remove the temporal inconsistency in the processed video and output a temporally consistent video. In addition, we propose an alternative incremental learning algorithm based on the increment of the mapped feature nodes, enhancement nodes, or input data to improve learning accuracy through broad expansion. We demonstrate the superiority of the proposed TBLS by conducting extensive experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
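The TBLS follows the standard broad learning system layout of mapped feature nodes, enhancement nodes, and a closed-form output solve; the snippet below is a generic BLS sketch on dummy data (random weights, ridge-regression output layer) and omits everything video-specific.

```python
# Generic broad learning system (BLS) sketch: random feature-mapping nodes,
# random enhancement nodes, and a ridge-regression solve for the output weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))                     # input data
Y = rng.normal(size=(200, 4))                      # targets

Wf, bf = rng.normal(size=(32, 64)) * 0.1, rng.normal(size=64) * 0.1
Z = np.tanh(X @ Wf + bf)                           # mapped feature nodes
We, be = rng.normal(size=(64, 128)) * 0.1, rng.normal(size=128) * 0.1
H = np.tanh(Z @ We + be)                           # enhancement nodes with random weights
A = np.hstack([Z, H])                              # flat layer feeding the output

lam = 1e-2
W_out = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)  # ridge solution
print(np.mean((A @ W_out - Y) ** 2))               # training MSE of the flat network
```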
190. Phase Space Reconstruction Driven Spatio-Temporal Feature Learning for Dynamic Facial Expression Recognition.
- Author
-
Wang, Shanmin, Shuai, Hui, and Liu, Qingshan
- Abstract
Automatic Dynamic Facial Expression Recognition (DFER) is a challenging task, since how to effectively capture facial temporal dynamics is still an open problem. In this article, we regard variations of facial expressions as a dynamical system that evolves according to certain rules, and we explore the fundamental temporal properties for recognizing dynamic expressions. Inspired by the phase space reconstruction method for time series analysis, we propose a novel network named Phase Space Reconstruction Network (PSRNet) for learning spatio-temporal features of facial expressions. First, 3D convolutional neural networks are used to extract spatial and short-term temporal features, which indicate the state of each frame and are termed observations in the phase space. All the observations compose the trajectory of the dynamical system. Then, a data-driven cross-correlation matrix is inferred to reveal the relationships among the observations. With this matrix, the phase space reconstruction module reconstructs the trajectory by aggregating the observations adaptively in the phase space. Reconstructed observations represent the gradual process of dynamic facial expressions, which is beneficial for recognizing these expressions. The experimental results on three databases (Oulu, MMI, and CK+) demonstrate that the proposed PSRNet can extract more informative and representative spatio-temporal features for DFER. Moreover, the visualization of intermediate features reveals that the reconstructed features have global consistency across facial regions and capture the underlying evolutionary pattern of dynamic facial expressions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
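The phase space reconstruction idea the network borrows from time series analysis is classical time-delay (Takens) embedding; the sketch below embeds a toy scalar signal into a three-dimensional phase space. It illustrates the motivating concept only, not PSRNet itself.

```python
# Classical time-delay (Takens) embedding of a scalar series into a phase space.
import numpy as np

def delay_embed(series: np.ndarray, dim: int, tau: int) -> np.ndarray:
    """Return the (N - (dim-1)*tau, dim) matrix of delay vectors."""
    n = len(series) - (dim - 1) * tau
    return np.stack([series[i * tau : i * tau + n] for i in range(dim)], axis=1)

t = np.linspace(0, 20 * np.pi, 2000)
signal = np.sin(t) + 0.05 * np.random.default_rng(0).normal(size=t.size)
trajectory = delay_embed(signal, dim=3, tau=25)    # points on a reconstructed attractor
print(trajectory.shape)                            # (1950, 3)
```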
191. Facial Action Unit Detection Using Attention and Relation Learning.
- Author
-
Shao, Zhiwen, Liu, Zhilei, Cai, Jianfei, Wu, Yunsheng, and Ma, Lizhuang
- Abstract
Attention mechanisms have recently attracted increasing attention in the field of facial action unit (AU) detection. By finding the region of interest of each AU with an attention mechanism, AU-related local features can be captured. Most existing attention-based AU detection works use prior knowledge to predefine fixed attentions or refine the predefined attentions within a small range, which limits their capacity to model various AUs. In this paper, we propose an end-to-end deep-learning-based attention and relation learning framework for AU detection with only AU labels, which has not been explored before. In particular, multi-scale features shared by each AU are learned first, and then both channel-wise and spatial attentions are adaptively learned to select and extract AU-related local features. Moreover, pixel-level relations for AUs are further captured to refine spatial attentions so as to extract more relevant local features. Without changing the network architecture, our framework can be easily extended to AU intensity estimation. Extensive experiments show that our framework (i) soundly outperforms the state-of-the-art methods for both AU detection and AU intensity estimation on the challenging BP4D, DISFA, FERA 2015, and BP4D+ benchmarks, (ii) can adaptively capture the correlated regions of each AU, and (iii) also works well under severe occlusions and large poses. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
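The channel-wise and spatial attentions mentioned above are not specified in detail here, so the snippet sketches a generic CBAM-style channel-then-spatial attention block in PyTorch; layer sizes and names are hypothetical and do not reproduce the paper's architecture.

```python
# CBAM-style channel and spatial attention, as a generic sketch of adaptively
# selecting AU-related features; hypothetical sizes, not the paper's model.
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        ch = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) +
                           self.mlp(x.amax(dim=(2, 3)))).view(b, c, 1, 1)
        x = x * ch                                        # channel-wise attention
        sp = torch.sigmoid(self.spatial(torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)))
        return x * sp                                     # spatial attention

feat = torch.randn(2, 64, 28, 28)
print(ChannelSpatialAttention(64)(feat).shape)            # torch.Size([2, 64, 28, 28])
```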
192. Learning Compact Multirepresentation Feature Descriptor for Finger-Vein Recognition.
- Author
-
Li, Shuyi, Ma, Ruijun, Fei, Lunke, and Zhang, Bob
- Abstract
Due to its strong anti-counterfeiting capability and universality, the use of the finger-vein pattern for identity authentication has recently attracted extensive attention in academia and industry. Despite recent advances in finger-vein recognition, most hand-crafted descriptors require strong prior knowledge and may be ineffective in expressing the distinctiveness of finger veins. In this paper, we present a novel compact multi-representation feature descriptor (CMrFD) with visual and semantic consistency for finger-vein feature representation. Given the finger-vein images, we first form two-view representations to describe the informative vein features in local patches. Then, we jointly learn a feature transformation to map the two-view representations into discriminative binary codes. For the projection function, we linearly combine multi-view information and minimize the quantization error between the projected binary features and the original real-valued features. In terms of visual consistency, we minimize the Euclidean distance between representations from the same class while maximizing the Euclidean distance between different classes in the projected space. Semantic consistency is used to ensure that similar images have compact multi-representation combined projection features. Lastly, we calculate the block-wise histograms as the final extracted features for finger-vein recognition. Experimental results on four widely used finger-vein databases demonstrate that the proposed method outperforms the state-of-the-art finger-vein recognition methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
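As a rough picture of the descriptor pipeline described above (projection to binary codes followed by block-wise histograms), the sketch below uses a random projection in place of the learned transformation and pools 8-bit code histograms over 4x4 blocks. The shapes and code length are arbitrary choices, not the paper's.

```python
# Sketch of the descriptor pipeline: project local patch features to binary
# codes (random projection standing in for the learned transform) and
# aggregate block-wise code histograms as the final feature.
import numpy as np

rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 16, 32))        # 16x16 grid of 32-d patch features
W = rng.normal(size=(32, 8))                   # projection to 8-bit codes (learned in CMrFD)

bits = (patches @ W > 0).astype(np.int64)      # binarize the projected features
codes = bits @ (1 << np.arange(8))             # pack 8 bits into one code per patch

block, hists = 4, []
for i in range(0, 16, block):                  # block-wise histograms of the codes
    for j in range(0, 16, block):
        hists.append(np.bincount(codes[i:i+block, j:j+block].ravel(), minlength=256))
descriptor = np.concatenate(hists).astype(np.float32)
print(descriptor.shape)                        # (4096,) = 16 blocks x 256-bin histograms
```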
193. Heterogeneous Face Recognition via Face Synthesis With Identity-Attribute Disentanglement.
- Author
-
Yang, Ziming, Liang, Jian, Fu, Chaoyou, Luo, Mandi, and Zhang, Xiao-Yu
- Abstract
Heterogeneous Face Recognition (HFR) aims to match faces across different domains (e.g., visible to near-infrared images) and has been widely applied in authentication and forensics scenarios. However, HFR is a challenging problem because of the large cross-domain discrepancy, limited heterogeneous data pairs, and large variation of facial attributes. To address these challenges, we propose a new HFR method from the perspective of heterogeneous data augmentation, named Face Synthesis with Identity-Attribute Disentanglement (FSIAD). First, the identity-attribute disentanglement (IAD) decouples face images into identity-related representations and identity-unrelated representations (called attributes), and then decreases the correlation between identities and attributes. Second, we devise a face synthesis module (FSM) to generate a large number of images with stochastic combinations of disentangled identities and attributes, enriching the attribute diversity of the synthetic images. Both the original images and the synthetic ones are used to train the HFR network, tackling the challenges and improving the performance of HFR. Extensive experiments on five HFR databases validate that FSIAD outperforms previous HFR approaches. In particular, FSIAD obtains a 4.8% improvement over the state of the art in terms of VR@FAR=0.01% on LAMP-HQ, the largest HFR database so far. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
194. GRACGE: Graph Signal Clustering and Multiple Graph Estimation.
- Author
-
Yuan, Yanli, Soh, De Wen, Yang, Xiao, Guo, Kun, and Quek, Tony Q. S.
- Subjects
- *
STATISTICAL accuracy , *REGULARIZATION parameter , *SIGNAL processing , *EXPECTATION-maximization algorithms , *STATISTICS , *CHARTS, diagrams, etc. - Abstract
In graph signal processing (GSP), complex datasets arise from several underlying graphs and in the presence of heterogeneity. Graph learning from heterogeneous graph signals often results in challenging high-dimensional multiple graph estimation problems, and prior information about which graph each signal was observed on is typically unavailable. To address these challenges, we develop a novel framework called GRACGE (GRAph signal Clustering and multiple Graph Estimation) to partition the graph signals into clusters and jointly learn the multiple underlying graphs, one for each cluster. GRACGE advocates a regularized EM (rEM) algorithm in which a structure fusion penalty with adaptive regularization parameters is imposed on the M-step. Such a penalty can exploit the structural similarities among graphs to overcome the curse of dimensionality. Moreover, we provide a non-asymptotic bound on the estimation error of the GRACGE algorithm, which establishes its computational and statistical guarantees. This theoretical analysis further motivates us to adaptively re-weight the regularization parameters. With the adaptive regularization scheme, the final estimates of GRACGE geometrically converge to the true parameters within statistical precision. Finally, experimental results on both synthetic and real data demonstrate the performance of the proposed GRACGE algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
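A much simplified stand-in for the joint clustering-and-graph-estimation task described above is to cluster the signals first and then fit one sparse inverse covariance per cluster; the sketch below does exactly that with scikit-learn's KMeans and GraphicalLasso. It does not implement the rEM algorithm or the structure fusion penalty from the paper.

```python
# Simplified stand-in (not the GRACGE rEM algorithm): cluster the graph signals
# with k-means, then fit one sparse inverse-covariance "graph" per cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
signals = np.vstack([rng.normal(0.0, 1.0, size=(150, 20)),
                     rng.normal(2.0, 1.0, size=(150, 20))])   # signals from two regimes

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(signals)
for k in range(2):
    model = GraphicalLasso(alpha=0.2).fit(signals[labels == k])
    edges = np.count_nonzero(np.triu(model.precision_, 1))    # sparse precision ~ graph
    print(f"cluster {k}: {np.sum(labels == k)} signals, {edges} edges")
```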
195. Practices and Infrastructures for Machine Learning Systems: An Interview Study in Finnish Organizations.
- Author
-
Muiruri, Dennis, Lwakatare, Lucy Ellen, Nurminen, Jukka K., and Mikkonen, Tommi
- Subjects
- *
INSTRUCTIONAL systems , *ARTIFICIAL intelligence , *MACHINE learning , *SOFTWARE engineering , *SOFTWARE engineers - Abstract
Using interviews, we investigated the practices and toolchains for machine learning (ML)-enabled systems from 16 organizations across various domains in Finland. We observed some well-established artificial intelligence engineering approaches, but practices and tools are still needed for the testing and monitoring of ML-enabled systems. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
196. SwinSTFM: Remote Sensing Spatiotemporal Fusion Using Swin Transformer.
- Author
-
Chen, Guanyu, Jiao, Peng, Hu, Qing, Xiao, Linjie, and Ye, Zijian
- Subjects
- *
REMOTE sensing , *IMAGE fusion , *REMOTE-sensing images , *CONVOLUTIONAL neural networks , *FEATURE extraction , *CHINESE medicine , *MARKOV random fields , *BINARY codes - Abstract
Remote sensing images with high temporal and spatial resolutions have broad market demand and a wide range of application scenarios. This article aims to generate high-quality remote sensing image time series for feature mining of the growth quality of traditional Chinese medicine. Spatiotemporal fusion is a flexible method that combines two types of satellite images, one with high temporal resolution and one with high spatial resolution, to generate high-quality remote sensing images. In recent years, many spatiotemporal fusion algorithms have been proposed, and deep-learning-based methods show extraordinary promise in this field. However, current deep-learning-based methods have three problems: 1) most algorithms do not support models with large numbers of learnable parameters; 2) model structures based on convolutional neural networks introduce noise into the image fusion process; and 3) current deep-learning-based methods ignore some excellent modules from traditional spatiotemporal fusion algorithms. To address these problems and challenges, this article proposes a new algorithm based on the Swin transformer and linear spectral mixing theory. The algorithm makes full use of the advantages of the Swin transformer in feature extraction and integrates unmixing theory into the model based on the self-attention mechanism, which greatly improves the quality of the generated images. In the experimental part, the proposed algorithm achieves state-of-the-art results on three well-known public datasets and is shown to be effective and reasonable in ablation studies. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
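The linear spectral mixing theory the model integrates can be illustrated by the basic unmixing step: given endmember spectra, recover nonnegative abundances for a mixed pixel with nonnegative least squares. The sketch below uses synthetic spectra and is unrelated to the Swin transformer network itself.

```python
# Linear spectral unmixing sketch: recover nonnegative endmember abundances for
# a mixed pixel via NNLS. Synthetic endmembers; not the SwinSTFM network.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
bands, n_endmembers = 6, 3
E = np.abs(rng.normal(size=(bands, n_endmembers)))      # endmember spectra (columns)
true_abund = np.array([0.6, 0.3, 0.1])
pixel = E @ true_abund + 0.01 * rng.normal(size=bands)  # linear mixture plus noise

abund, residual = nnls(E, pixel)                        # nonnegative least squares
abund /= abund.sum()                                    # enforce (approximate) sum-to-one
print(np.round(abund, 3), round(residual, 4))
```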
197. Nonlocal Self-Similarity-Based Hyperspectral Remote Sensing Image Denoising With 3-D Convolutional Neural Network.
- Author
-
Wang, Zhicheng, Ng, Michael K., Zhuang, Lina, Gao, Lianru, and Zhang, Bing
- Subjects
- *
CONVOLUTIONAL neural networks , *IMAGE denoising , *REMOTE sensing , *THREE-dimensional imaging , *DEEP learning , *MACHINE learning - Abstract
Recently, deep-learning-based denoising methods for hyperspectral images (HSIs) have been comprehensively studied and achieved impressive performance because they can effectively extract complex and nonlinear image features. Compared with deep-learning-based methods, the nonlocal similarity-based denoising methods are more suitable for images containing edges or regular textures. We propose a powerful HSI denoising method, termed non-local 3-D convolutional neural network (NL-3DCNN), combining traditional machine learning and deep learning techniques. NL-3DCNN exploits the high spectral correlation of an HSI using subspace representation, and the corresponding representation coefficients are termed eigenimages. The high spatial correlation in eigenimages is exploited by grouping nonlocal similar patches, which are denoised by a 3-D convolutional neural network. The numerical and graphical denoising results of the simulated and real data show that the proposed method is superior to the state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
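The subspace-representation step in the NL-3DCNN entry above can be sketched independently of the CNN: unfold the hyperspectral cube along the spectral axis, compute a rank-k spectral basis by SVD, and keep the coefficient images (the "eigenimages"). The snippet below does this on a synthetic low-rank cube; the nonlocal patch grouping and the 3-D CNN denoiser are omitted.

```python
# Sketch of the subspace-representation step: rank-k spectral SVD of a
# (rows, cols, bands) hyperspectral cube and its coefficient "eigenimages".
import numpy as np

rng = np.random.default_rng(0)
rows, cols, bands, k = 64, 64, 100, 8
clean = rng.normal(size=(rows * cols, k)) @ rng.normal(size=(k, bands))
hsi = (clean + 0.1 * rng.normal(size=clean.shape)).reshape(rows, cols, bands)

X = hsi.reshape(-1, bands)                       # pixels x bands
U, S, Vt = np.linalg.svd(X, full_matrices=False)
E = Vt[:k].T                                     # spectral subspace basis (bands x k)
eigenimages = (X @ E).reshape(rows, cols, k)     # low-dimensional coefficient images
recon = (eigenimages.reshape(-1, k) @ E.T).reshape(rows, cols, bands)
print(eigenimages.shape, float(np.mean((recon - hsi) ** 2)))
```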
198. Neighborhood Preserving and Weighted Subspace Learning Method for Drift Compensation in Gas Sensor.
- Author
-
Yi, Zhengkun, Shang, Wanfeng, Xu, Tiantian, and Wu, Xinyu
- Subjects
- *
GAS detectors , *NEIGHBORHOODS , *GAUSSIAN distribution , *LEARNING ability - Abstract
This article presents a novel discriminative subspace-learning-based unsupervised domain adaptation (DA) method for the gas sensor drift problem. Many existing subspace learning approaches assume that the gas sensor data follow a certain distribution, such as Gaussian, which often does not hold in real-world applications. In this article, we address this issue by proposing a novel discriminative subspace learning method for DA with neighborhood preserving (DANP). We introduce two novel terms, an intraclass graph term and an interclass graph term, to embed the graphs into DA. In addition, most existing methods ignore the influence of the subspace learning on the classifier design. To tackle this issue, we present a novel classifier design method (DANP+) that incorporates the DA ability of the subspace into the learning of the classifier. A weighting function is introduced to assign different weights to different dimensions of the subspace. We have verified the effectiveness of the proposed methods by conducting experiments on two public gas sensor datasets in comparison with state-of-the-art DA methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
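The intraclass and interclass graph terms rest on neighborhood-preserving affinity graphs; the sketch below builds symmetric k-nearest-neighbor intraclass and interclass graphs over labeled source samples. It shows only the graph construction, not the DANP objective or the DANP+ classifier.

```python
# Sketch of neighborhood-preserving graph construction: intraclass and
# interclass k-NN affinity matrices over labeled source samples.
import numpy as np

def knn_graphs(X, y, k=5):
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)   # pairwise distances
    same = y[:, None] == y[None, :]
    W_intra = np.zeros_like(d)
    W_inter = np.zeros_like(d)
    for i in range(len(X)):
        intra = np.where(same[i] & (np.arange(len(X)) != i))[0]   # same class, excluding self
        inter = np.where(~same[i])[0]                              # different class
        W_intra[i, intra[np.argsort(d[i, intra])[:k]]] = 1.0
        W_inter[i, inter[np.argsort(d[i, inter])[:k]]] = 1.0
    return np.maximum(W_intra, W_intra.T), np.maximum(W_inter, W_inter.T)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 16)), rng.normal(3, 1, (30, 16))])
y = np.repeat([0, 1], 30)
W_intra, W_inter = knn_graphs(X, y)
print(int(W_intra.sum()), int(W_inter.sum()))   # edge counts of the two graphs
```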
199. DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition.
- Author
-
Fu, Chaoyou, Wu, Xiang, Hu, Yibo, Huang, Huaibo, and He, Ran
- Subjects
- *
DEPERSONALIZATION , *FACE , *IDENTITIES (Mathematics) , *GALLIUM nitride , *PUBLIC safety , *DATA distribution , *FACE perception - Abstract
Heterogeneous face recognition (HFR) refers to matching cross-domain faces and plays a crucial role in public security. Nevertheless, HFR is confronted with challenges from large domain discrepancy and insufficient heterogeneous data. In this paper, we formulate HFR as a dual generation problem, and tackle it via a novel dual variational generation (DVG-Face) framework. Specifically, a dual variational generator is elaborately designed to learn the joint distribution of paired heterogeneous images. However, the small-scale paired heterogeneous training data may limit the identity diversity of sampling. In order to break through the limitation, we propose to integrate abundant identity information of large-scale visible data into the joint distribution. Furthermore, a pairwise identity preserving loss is imposed on the generated paired heterogeneous images to ensure their identity consistency. As a consequence, massive new diverse paired heterogeneous images with the same identity can be generated from noises. The identity consistency and identity diversity properties allow us to employ these generated images to train the HFR network via a contrastive learning mechanism, yielding both domain-invariant and discriminative embedding features. Concretely, the generated paired heterogeneous images are regarded as positive pairs, and the images obtained from different samplings are considered as negative pairs. Our method achieves superior performances over state-of-the-art methods on seven challenging databases belonging to five HFR tasks, including NIR-VIS, Sketch-Photo, Profile-Frontal Photo, Thermal-VIS, and ID-Camera. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
200. Byzantine-Resilient Decentralized Stochastic Gradient Descent.
- Author
-
Guo, Shangwei, Zhang, Tianwei, Yu, Han, Xie, Xiaofei, Ma, Lei, Xiang, Tao, and Liu, Yang
- Subjects
- *
FAULT tolerance (Engineering) , *DEEP learning , *INSTRUCTIONAL systems - Abstract
Decentralized learning has gained great popularity as a way to improve learning efficiency and preserve data privacy. Each computing node makes an equal contribution to collaboratively learning a deep learning model. The elimination of centralized parameter servers (PS) can effectively address many issues such as privacy, performance bottlenecks, and single points of failure. However, how to achieve Byzantine fault tolerance in decentralized learning systems is rarely explored, although this problem has been extensively studied in centralized systems. In this paper, we present an in-depth study of the Byzantine resilience of decentralized learning systems with two contributions. First, from the adversarial perspective, we theoretically illustrate that Byzantine attacks are more dangerous and feasible in decentralized learning systems: even one malicious participant can arbitrarily alter the models of other participants by sending carefully crafted updates to its neighbors. Second, from the defense perspective, we propose Ubar, a novel algorithm to enhance decentralized learning with Byzantine fault tolerance. Specifically, Ubar provides a Uniform Byzantine-resilient Aggregation Rule for benign nodes to select the useful parameter updates and filter out the malicious ones in each training iteration. It guarantees that each benign node in a decentralized system can train a correct model under very strong Byzantine attacks with an arbitrary number of faulty nodes. We conduct extensive experiments on standard image classification tasks, and the results indicate that Ubar can effectively defeat both simple and sophisticated Byzantine attacks with higher efficiency than existing solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
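Ubar's aggregation rule is not reproduced here; as a generic illustration of Byzantine-resilient aggregation, the sketch below applies a coordinate-wise trimmed mean to neighbor updates and shows how it survives two extreme (Byzantine) updates that destroy the plain mean.

```python
# Generic Byzantine-resilient aggregation sketch: a coordinate-wise trimmed mean
# over neighbor model updates, discarding the b most extreme values per
# coordinate. Illustrates the filtering idea; Ubar's actual rule differs.
import numpy as np

def trimmed_mean(updates: np.ndarray, b: int) -> np.ndarray:
    """updates: (num_neighbors, dim); b: number of suspected Byzantine neighbors."""
    s = np.sort(updates, axis=0)                 # sort each coordinate independently
    return s[b : len(updates) - b].mean(axis=0)  # drop the b lowest and b highest values

rng = np.random.default_rng(0)
honest = rng.normal(loc=1.0, scale=0.1, size=(8, 5))    # honest neighbors agree around 1.0
byzantine = np.full((2, 5), 100.0)                      # two attackers send huge updates
updates = np.vstack([honest, byzantine])

print(updates.mean(axis=0).round(2))            # the plain mean is destroyed by the attack
print(trimmed_mean(updates, b=2).round(2))      # the trimmed mean stays near 1.0
```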