3,426 results on '"Training data"'
Search Results
202. Network Delay Measurement with Machine Learning: From Lab to Real-World Deployment.
- Author
-
Mohammed, Shady A., Shirmohammadi, Shervin, and Alchalabi, Alaa Eddin
- Abstract
Artificial Intelligence (AI) continues to impact all facets of technology, including Instrumentation and Measurement (I&M), with much effort spent on developing I&M systems assisted by machine learning (ML), especially deep learning [1]. While these ML-assisted I&M systems show promising results in a lab environment, there is always the question of how well they will perform in the real world. In fact, concerns about the real-world performance of ML are not exclusive to I&M but an inherent property of ML in general, because ML is data-driven and its performance will change if the data distribution changes in the real world. In this article, we present a case study of developing an ML-assisted I&M system in the lab, specifically a network delay predictor, and deploying it in the real world, achieving 93% accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
203. Real-Time Prediction System of Train Carriage Load Based on Multi-Stream Fuzzy Learning.
- Author
-
Yu, Hang, Lu, Jie, Liu, Anjin, Wang, Bin, Li, Ruimin, and Zhang, Guangquan
- Abstract
When a train leaves a platform, knowing the carriage load (the number of passengers in each carriage) of this train helps train managers guide passengers at the next platform to choose carriages and avoid congestion. This capacity has become critical since the onset of the pandemic. However, with the dynamic flow of passengers, faster travel between stations (about 3 minutes), and shorter station stops (60–90 seconds per station), real-time prediction is increasingly challenging. This paper presents an intelligent system, developed in collaboration with Sydney Trains, for predicting carriage load in real time across a city passenger train network. The system comprises three innovations. First, a fuzzy time-matching method significantly improves prediction accuracy in uncertain situations and allows noisy historical data to be used for training. Second, the LightGBM model is extended with an incremental learning scheme to make real-time forecasting possible. Third, a new multi-stream learning strategy that merges data streams with similar concept drift patterns is pioneered to increase the amount of suitable training data while reducing generalization errors. A comprehensive suite of practical tests on real-world datasets demonstrates the merit of these solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
204. Training Data Subset Search With Ensemble Active Learning.
- Author
-
Chitta, Kashyap, Alvarez, Jose M., Haussmann, Elmar, and Farabet, Clement
- Abstract
Deep Neural Networks (DNNs) often rely on vast datasets for training. Given the large size of such datasets, it is conceivable that they contain specific samples that either do not contribute or negatively impact the DNN’s optimization. Modifying the training distribution to exclude such samples could provide an effective solution to improve performance and reduce training time. This paper proposes to scale up ensemble Active Learning (AL) methods to perform acquisition at a large scale (10k to 500k samples at a time). We do this with ensembles of hundreds of models, obtained at a minimal computational cost by reusing intermediate training checkpoints. This allows us to automatically and efficiently perform a training data subset search for large labeled datasets. We observe that our approach obtains favorable subsets of training data, which can be used to train more accurate DNNs than training with the entire dataset. We perform an extensive experimental study of this phenomenon on three image classification benchmarks (CIFAR-10, CIFAR-100, and ImageNet), as well as an internal object detection benchmark for prototyping perception models for autonomous driving. Unlike existing studies, our experiments on object detection are at the scale required for production-ready autonomous driving systems. We provide insights on the impact of different initialization schemes, acquisition functions, and ensemble configurations at this scale. Our results provide strong empirical evidence that optimizing the training data distribution can significantly benefit large-scale vision tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
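The acquisition step described in entry 204 above — scoring unlabeled samples by the disagreement across a cheap ensemble of training checkpoints, then keeping the top-scoring subset — can be sketched in a few lines. This is an illustrative reconstruction only, not the paper's code; the function names and the variance-based acquisition score are assumptions.

```python
def ensemble_variance(probs_per_model):
    """Score one sample by the variance, averaged over classes, of the
    class probabilities predicted by each ensemble member (checkpoint).
    probs_per_model: list of per-model probability vectors."""
    n_models = len(probs_per_model)
    n_classes = len(probs_per_model[0])
    total = 0.0
    for c in range(n_classes):
        col = [p[c] for p in probs_per_model]
        mean = sum(col) / n_models
        total += sum((v - mean) ** 2 for v in col) / n_models
    return total / n_classes

def select_subset(all_probs, k):
    """Acquire the k samples the checkpoint ensemble disagrees on most."""
    scores = [ensemble_variance(p) for p in all_probs]
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]

# Two checkpoints, three samples: the ensemble agrees on samples 0 and 2
# but splits on sample 1, so sample 1 is acquired first.
probs = [
    [[0.9, 0.1], [0.9, 0.1]],  # agreement -> low score
    [[0.9, 0.1], [0.1, 0.9]],  # strong disagreement -> high score
    [[0.6, 0.4], [0.5, 0.5]],  # mild disagreement
]
print(select_subset(probs, 2))  # -> [1, 2]
```

In the paper's setting the ensemble members come almost for free, since intermediate checkpoints of a single training run stand in for independently trained models.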
205. Hierarchical Detection of Network Anomalies: A Self-Supervised Learning Approach.
- Author
-
Kye, Hyoseon, Kim, Miru, and Kwon, Minhae
- Subjects
INTRUSION detection systems (Computer security), ANOMALY detection (Computer security), SUPERVISED learning, INTERNET traffic, OUTLIER detection
- Abstract
With the increasing amount of Internet traffic, a significant number of network intrusion events have recently been reported. In this letter, we propose a network intrusion detection system that enables hierarchical detection based on self-supervised learning. The proposed solution consists of multiple stages of detection, including the early detection of extreme outliers, which may cause severe damage to the system. Furthermore, it performs thorough reexaminations using the hidden spaces with specialized anomaly scores, which leads to high detection accuracy. Extensive simulation results confirm that the proposed solution can preemptively detect 20% of abnormal data, thereby enabling a proactive response, and can detect 99% of abnormal data at the final stage. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
206. A Neural Network-Prepended GLRT Framework for Signal Detection Under Nonlinear Distortions.
- Author
-
Sahay, Rajeev, Appadwedula, Swaroop, Love, David J., and Brinton, Christopher G.
- Abstract
Many communications and sensing applications hinge on the detection of a signal in a noisy, interference-heavy environment. Signal processing theory yields techniques such as the generalized likelihood ratio test (GLRT) to perform detection when the received samples correspond to a linear observation model. Numerous practical applications exist, however, where the received signal has passed through a nonlinearity, causing significant performance degradation of the GLRT. In this work, we propose prepending the GLRT detector with a neural network classifier capable of identifying the particular nonlinear time samples in a received signal. We show that pre-processing received nonlinear signals using our trained classifier to eliminate excessively nonlinear samples (i) improves the detection performance of the GLRT on nonlinear signals and (ii) retains the theoretical guarantees provided by the GLRT on linear observation models for accurate signal detection. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
207. Phishing Detection Leveraging Machine Learning and Deep Learning: A Review.
- Author
-
Divakaran, Dinil Mon and Oest, Adam
- Abstract
Phishing attacks trick victims into disclosing sensitive information. To counter them, we explore machine learning and deep learning models leveraging large-scale data. We discuss models built on different kinds of data and present multiple deployment options to detect phishing attacks. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
208. Machine Learning for Source Code Vulnerability Detection: What Works and What Isn’t There Yet.
- Author
-
Marjanov, Tina, Pashchenko, Ivan, and Massacci, Fabio
- Abstract
We review machine learning approaches for detecting (and correcting) vulnerabilities in source code, finding that the biggest challenges ahead involve agreeing to a benchmark, increasing language and error type coverage, and using pipelines that do not flatten the code's structure. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
209. Data Privacy and Trustworthy Machine Learning.
- Author
-
Strobel, Martin and Shokri, Reza
- Abstract
The privacy risks of machine learning models is a major concern when training them on sensitive and personal data. We discuss the tradeoffs between data privacy and the remaining goals of trustworthy machine learning (notably, fairness, robustness, and explainability). [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
210. Machine Learning Target Count Prediction in Electromagnetics Using Neural Networks.
- Author
-
Sabbaghi, Mohsen, Zhang, Jun, and Hanson, George W.
- Subjects
-
ELECTROMAGNETISM, CONVOLUTIONAL neural networks, INVERSE problems, HIGH frequency antennas, NETWORK performance, MACHINE learning
- Abstract
In this article, we showcase an application of neural networks (NNs) to solve an inverse problem in electromagnetics (EM). Wires are randomly distributed in an area of known dimensions. The wires are then illuminated with a monochromatic plane wave (PW) at a certain angle of incidence, and the EM field, measured at a finite number of uniformly spaced points along the perimeter of the area, is fed into a convolutional neural network (CNN) designed to predict the number of wires. Counting the wires is posed as a supervised classification problem with a known upper limit on the number of wires, and an accuracy of 96% has been achieved for the case where the number of wires is known to be ten or less. A number of approaches have been taken to improve the network performance, including frequency variation analysis and illuminating the wire distributions with additional PW angles of incidence. We conclude with an analysis of the network's capability to resolve objects based on its performance on known wire distributions, which suggests the existence of a characteristic resolution limit corresponding to the CNN topology. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
211. Dielectric Breast Phantoms by Generative Adversarial Network.
- Author
-
Shao, Wenyi and Zhou, Beibei
- Subjects
-
GENERATIVE adversarial networks, BREAST, MICROWAVE imaging, DIELECTRICS, BREAST imaging, MACHINE learning
- Abstract
In order to conduct research on machine learning (ML)-based microwave breast imaging (MBI), a large number of digital dielectric breast phantoms that can be used as training data (ground truth) are required but are difficult to obtain in practice. Although a few dielectric breast phantoms have been developed for research purposes, their number and diversity are limited and far from adequate for developing a robust ML algorithm for MBI. This article presents a neural network method to generate 2-D virtual breast phantoms that are similar to real ones and can be used to develop ML-based MBI in the future. The generated phantoms are similar to, but distinct from, those used in training. Each phantom consists of several images, each representing the distribution of a dielectric parameter in the breast map. A statistical analysis was performed over 10,000 generated phantoms to investigate the performance of the generative network. With the generative network, one may generate an unlimited number of breast images with more variation, making ML-based MBI more ready to deploy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
212. Gaze Estimation via Modulation-Based Adaptive Network With Auxiliary Self-Learning.
- Author
-
Wu, Yong, Li, Gongyang, Liu, Zhi, Huang, Mengke, and Wang, Yang
- Subjects
-
EYE tracking, GAZE, EYE, RECOMMENDER systems, INFORMATION filtering
- Abstract
Given a face image, most previous work in gaze estimation infers the gaze via a well-trained model with supervised training. However, the distribution of test data may differ greatly from that of the training data, since samples can be corrupted in real-world scenarios (e.g., a photo taken in strong light). This leads to a gap between the source domain (i.e., training data) and the target domain (i.e., test data). In this paper, we first introduce self-supervised learning into our method to address challenging situations in gaze estimation. Moreover, existing appearance-based gaze estimation methods focus on developing powerful regressors, which mainly utilize face and eye images simultaneously or face (eye) images only; the inter-cues between face and eye features have been largely overlooked. To this end, we propose a novel Modulation-based Adaptive Network (MANet) for gaze estimation, which uses high-level knowledge to filter out distracting information and bridges the intrinsic relationship between face and eye features. Further, we combine self-supervised learning and MANet to learn to adapt to challenging cases, such as abnormal lighting conditions and poor-quality images, by jointly minimizing a self-supervised loss and a supervised loss. The experimental results on several datasets demonstrate the effectiveness of our proposed approach with a real-time speed of 900 fps on a PC with an NVIDIA Titan RTX GPU. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
213. Feature-Based Style Randomization for Domain Generalization.
- Author
-
Wang, Yue, Qi, Lei, Shi, Yinghuan, and Gao, Yang
- Subjects
-
DATA augmentation, GENERALIZATION, GOAL (Psychology), FEATURE extraction, DATABASES
- Abstract
As a recently noticeable topic, domain generalization (DG) aims to first learn a generic model on multiple source domains and then generalize directly to an arbitrary unseen target domain without any additional adaptation. Among previous DG models, data augmentation based methods, which generate virtual data to supplement the observed source domains, have shown their effectiveness. To simulate possible unseen domains, most of them enrich the diversity of the original data via image-level style transformation. However, we argue that the potential styles are hard to exhaustively enumerate and fully augment with only a limited set of reference styles, so diversity cannot always be guaranteed. Unlike image-level augmentation, in this paper we develop a simple yet effective feature-based style randomization module to achieve feature-level augmentation, which produces random styles by integrating random noise into the original style. Compared with existing image-level augmentation, our feature-level augmentation is more goal-oriented and sample-diverse. Furthermore, to sufficiently explore the efficacy of the proposed module, we design a novel progressive training strategy so that all parameters of the network can be fully trained. Extensive experiments on three standard benchmark datasets, i.e., PACS, VLCS and Office-Home, highlight the superiority of our method compared to state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
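The feature-level augmentation in entry 213 above perturbs a feature map's per-channel statistics rather than the image itself. Below is a minimal, hypothetical sketch of that idea (normalize each channel, then re-apply a noise-perturbed mean and standard deviation); the paper's actual module and parameterization may differ.

```python
import random

def style_randomize(channel, noise_scale=0.1, rng=random):
    """Re-style one feature channel: strip its (mean, std) 'style',
    then re-apply a randomly perturbed version of that style."""
    n = len(channel)
    mu = sum(channel) / n
    var = sum((v - mu) ** 2 for v in channel) / n
    sigma = var ** 0.5 or 1.0  # guard against constant channels
    # Random style: jitter both statistics with Gaussian noise.
    new_mu = mu + noise_scale * rng.gauss(0.0, 1.0)
    new_sigma = sigma * (1.0 + noise_scale * rng.gauss(0.0, 1.0))
    return [((v - mu) / sigma) * new_sigma + new_mu for v in channel]

original = [1.0, 2.0, 3.0, 4.0]
augmented = style_randomize(original, noise_scale=0.2, rng=random.Random(0))
```

With `noise_scale=0` the style is removed and re-applied unchanged, so the channel is returned as-is; larger values yield progressively more diverse "virtual styles" for the same content.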
214. Type-2 Fuzzy Model-Based Movement Primitives for Imitation Learning.
- Author
-
Sun, Da, Liao, Qianfang, and Loutfi, Amy
- Subjects
-
ROBOT programming, TASK analysis
- Abstract
Imitation learning is an important direction in the area of robot skill learning. It provides a user-friendly and straightforward solution for transferring human demonstrations to robots. In this article, we integrate fuzzy theory into imitation learning to develop a novel method called type-2 fuzzy model-based movement primitives (T2FMP). In this method, a group of data-driven type-2 fuzzy models describe the input–output relationships of the demonstrations. Based on the fuzzy models, T2FMP can efficiently reproduce the trajectory without high computational cost or cumbersome parameter settings. It also handles variation in the demonstrations well and is robust to noise. In addition, we develop extensions that endow T2FMP with trajectory modulation and superposition to achieve real-time trajectory adaptation to various scenarios. Going beyond existing imitation learning methods, we further extend T2FMP to regulate the trajectory to avoid collisions in environments that are unstructured, nonconvex, and detected with noisy outliers. Several experiments validate the effectiveness of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
215. Towards a Weakly Supervised Framework for 3D Point Cloud Object Detection and Annotation.
- Author
-
Meng, Qinghao, Wang, Wenguan, Zhou, Tianfei, Shen, Jianbing, Jia, Yunde, and Van Gool, Luc
- Subjects
-
OBJECT recognition (Computer vision), POINT cloud, ANNOTATIONS, ARCHITECTURAL design, DETECTORS
- Abstract
It is quite laborious and costly to manually label LiDAR point cloud data for training high-quality 3D object detectors. This work proposes a weakly supervised framework which allows learning 3D detection from a few weakly annotated examples. This is achieved by a two-stage architecture design. Stage-1 learns to generate cylindrical object proposals under inaccurate and inexact supervision, obtained by our proposed BEV center-click annotation strategy, where only the horizontal object centers are click-annotated in bird's-eye-view scenes. Stage-2 learns to predict cuboids and confidence scores in a coarse-to-fine, cascade manner, under incomplete supervision, i.e., only a small portion of object cuboids are precisely annotated. With the KITTI dataset, using only 500 weakly annotated scenes and 534 precisely labeled vehicle instances, our method achieves 86–97 percent of the performance of current top-leading, fully supervised detectors (which require 3,712 exhaustively annotated scenes with 15,654 instances). More importantly, with our elaborately designed network architecture, our trained model can be applied as a 3D object annotator, supporting both automatic and active (human-in-the-loop) working modes. The annotations generated by our model can be used to train 3D object detectors, achieving over 95 percent of their original performance (with manually labeled training data). Our experiments also show our model's potential in boosting performance when given more training data. The above designs make our approach highly practical and open up opportunities for learning 3D detection at reduced annotation cost. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
216. Variational HyperAdam: A Meta-Learning Approach to Network Training.
- Author
-
Wang, Shipeng, Yang, Yan, Sun, Jian, and Xu, Zongben
- Subjects
-
MULTILAYER perceptrons, ARTIFICIAL neural networks, RANDOM variables, NETWORK performance, MARKOV chain Monte Carlo, MATHEMATICAL optimization
- Abstract
Stochastic optimization algorithms have been popular for training deep neural networks. Recently, a new approach of learning-based optimizers has emerged and achieved promising performance for training neural networks. However, these black-box learning-based optimizers do not fully take advantage of the experience embedded in human-designed optimizers and rely heavily on learning from meta-training tasks, and therefore have limited generalization ability. In this paper, we propose a novel optimizer, dubbed Variational HyperAdam, which is based on a parametric generalized Adam algorithm, i.e., HyperAdam, in a variational framework. With Variational HyperAdam as the optimizer for training a neural network, the parameter update vector of the network at each training step is considered a random variable, whose approximate posterior distribution, given the training data and the current network parameter vector, is predicted by Variational HyperAdam. The parameter update vector for network training is sampled from this approximate posterior distribution. Specifically, in Variational HyperAdam, we design a learnable generalized Adam algorithm for estimating the expectation, paired with a VarBlock for estimating the variance, of the approximate posterior distribution of the parameter update vector. Variational HyperAdam is learned in a meta-learning approach with a meta-training loss derived by variational inference. Experiments verify that the learned Variational HyperAdam achieves state-of-the-art network training performance for various types of networks on different datasets, such as multilayer perceptrons, CNNs, LSTMs and ResNets. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
217. Two-Path Aggregation Attention Network With Quad-Patch Data Augmentation for Few-Shot Scene Classification.
- Author
-
Gong, Maoguo, Li, Jianzhao, Zhang, Yourun, Wu, Yue, and Zhang, Mingyang
- Subjects
-
DATA augmentation, REMOTE sensing, FEATURE extraction, ARTIFICIAL neural networks, CLASSIFICATION, TECHNOLOGICAL innovations
- Abstract
Few-shot scene classification is dedicated to identifying unseen remote sensing classes when only a very small number of labeled samples are available for reference. Most existing few-shot scene classification methods are based on meta-learning and use episodic learning for training, which gives little consideration to data efficiency. In this article, instead of designing sophisticated meta-learning-based algorithms, we are committed to training a feature extractor with good generalization performance and strong feature extraction capability. Specifically, we propose a novel two-path aggregation attention network with quad-patch data augmentation, called data architecture network (DANet), to solve the problem of few-shot scene classification from both the data and the architecture aspects. In terms of data, we design a new data augmentation strategy named quad-patch augmentation: exploiting the characteristics of remote sensing images, we chunk and reassemble any existing data, thereby generating pseudo-new data to enrich the training set. In terms of architecture, we present a two-path aggregation attention module that makes it easier for the model to focus on key clues in a targeted manner. Comparative experiments on natural image datasets and remote sensing image datasets demonstrate the effectiveness of our two innovations. In addition, DANet achieves competitive or state-of-the-art (SOTA) results on three benchmark scene classification datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
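The quad-patch strategy in entry 217 above exploits the fact that remote sensing scenes are roughly translation-invariant: cutting an image into four quadrants and reassembling them in a new order yields a plausible pseudo-new sample. A toy sketch on a 2-D grid follows; the function name and the uniform shuffle of quadrants are assumptions, not the paper's exact recipe.

```python
import random

def quad_patch_augment(img, rng=random):
    """Split a 2-D image (list of rows, even dimensions assumed) into
    four quadrants, shuffle them, and stitch them back together."""
    h, w = len(img), len(img[0])
    hh, hw = h // 2, w // 2
    quads = [
        [row[:hw] for row in img[:hh]],  # top-left
        [row[hw:] for row in img[:hh]],  # top-right
        [row[:hw] for row in img[hh:]],  # bottom-left
        [row[hw:] for row in img[hh:]],  # bottom-right
    ]
    rng.shuffle(quads)
    # Reassemble: first two quadrants form the top half, last two the bottom.
    top = [l + r for l, r in zip(quads[0], quads[1])]
    bottom = [l + r for l, r in zip(quads[2], quads[3])]
    return top + bottom

sample = [[r * 4 + c for c in range(4)] for r in range(4)]
pseudo_new = quad_patch_augment(sample, random.Random(1))
```

The augmented image keeps the original shape and pixel content while presenting a new spatial arrangement to the feature extractor.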
218. Deep Learning Seismic Inversion Based on Prestack Waveform Datasets.
- Author
-
Zhang, Jian, Sun, Hui, Zhang, Gan, and Zhao, Xiaoyan
- Subjects
-
DEEP learning, THEORY of wave motion, INVERSE problems, ANALYTICAL solutions, TRAINING needs, TRANSMISSION of sound
- Abstract
Prediction of elastic parameters (e.g., P- and S-wave velocity, and density) from observed seismic data is one of the most common means of reservoir characterization. Recently, deep learning (DL), as a data-driven approach, has been attracting increasing interest in seismic inversion. DL has proven to have the potential to learn complex systems and solve inverse problems efficiently. One of the key components of DL is the training dataset, and an effective training dataset is a prerequisite for the success of DL-based methods. In seismic inversion, the training dataset needs to be artificially expanded due to the limited number of actual training data pairs. Traditional approaches of using the exact Zoeppritz equation (EZE) or its approximations for training dataset construction have limitations, principally the single-interface assumption and the neglect of wave propagation effects. Alternatively, the analytical solution of the 1-D wave equation (i.e., the reflectivity method [RM]) can simulate the full wave, including transmission losses and internal multiples, and can be executed in a target-oriented manner. Inspired by this, we develop a data-driven elastic parameter prediction method based on a waveform formulation. The method uses RM to construct the training dataset, which both compensates for the inadequate training dataset in data-driven seismic inversion and improves the accuracy of the inversion results. We apply the method to a synthetic model as well as field data. The results are compared with model-driven methods (EZE and RM) and a data-driven method based on EZE, and it is shown that the proposed method outperforms these three methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
219. Interpretability-Guided Defense Against Backdoor Attacks to Deep Neural Networks.
- Author
-
Jiang, Wei, Wen, Xiangyu, Zhan, Jinyu, Wang, Xupeng, and Song, Ziwei
- Subjects
-
ARTIFICIAL neural networks, DISTRIBUTION (Probability theory)
- Abstract
As an emerging threat to deep neural networks (DNNs), backdoor attacks have received increasing attention due to the challenges posed by the lack of transparency inherent in DNNs. In this article, we develop an efficient algorithm, based on the interpretability of DNNs, to defend DNN models against backdoor attacks. To extract critical neurons, we deploy sets of control gates following the neurons in each layer, and the function of a DNN model can be interpreted via the semantic sensitivities of neurons to input samples. A backdoor identification approach, derived from the activation frequency distribution on critical neurons, is proposed to reveal anomalies of particular neurons produced by backdoor attacks. Subsequently, a feasible and fine-grained pruning strategy is introduced to eliminate backdoors hidden in DNN models, without the need for retraining. Extensive experiments demonstrate that the proposed algorithm can identify and eliminate malicious backdoors efficiently in both single-target and multitarget scenarios, with the performance of the DNN model retained to a large extent. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
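The identification step in entry 219 above looks for neurons whose activation frequency is an outlier relative to the layer's distribution. Below is a hypothetical sketch of that idea using a simple z-score test; the paper's actual statistic and control-gate mechanism are more involved, and all names here are illustrative.

```python
def activation_frequencies(activations):
    """activations[i][j]: activation of neuron j on clean sample i.
    Returns, per neuron, the fraction of samples on which it fires (> 0)."""
    n_samples = len(activations)
    n_neurons = len(activations[0])
    return [sum(1 for row in activations if row[j] > 0) / n_samples
            for j in range(n_neurons)]

def flag_anomalous(freqs, z_thresh=2.0):
    """Flag neurons whose firing frequency deviates from the layer mean
    by more than z_thresh standard deviations; such neurons would then
    be candidates for pruning."""
    n = len(freqs)
    mean = sum(freqs) / n
    std = (sum((f - mean) ** 2 for f in freqs) / n) ** 0.5 or 1.0
    return [j for j, f in enumerate(freqs) if abs(f - mean) / std > z_thresh]

# Seven neurons fire on every clean sample; neuron 7 never does,
# which is the kind of asymmetry a trigger-dedicated neuron exhibits.
acts = [[1] * 7 + [0] for _ in range(10)]
print(flag_anomalous(activation_frequencies(acts)))  # -> [7]
```

Pruning would then zero the outgoing weights of the flagged neurons, which is why no retraining is needed.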
220. Attribute Graph Neural Networks for Strict Cold Start Recommendation.
- Author
-
Qian, Tieyun, Liang, Yile, Li, Qing, and Xiong, Hui
- Subjects
-
RECOMMENDER systems, MATRIX decomposition, DEEP learning, LOGIC circuits, GRAPH algorithms, SOCIAL networks, FACTOR structure
- Abstract
Rating prediction is a classic problem underlying recommender systems. It is traditionally tackled with matrix factorization. Recently, deep learning based methods, especially graph neural networks, have made impressive progress on this problem. Despite their effectiveness, existing methods focus on modeling the user-item interaction graph. The inherent drawback of such methods is that their performance is bound to the density of the interactions, which are, however, usually highly sparse. More importantly, for a strict cold start user/item that neither appears in the training data nor has any interactions in the test stage, such methods are unable to learn the preference embedding of the user/item, since there is no link to this user/item in the graph. In this work, we develop a novel framework, Attribute Graph Neural Networks (AGNN), which exploits the attribute graph rather than the commonly used interaction graph. This leads to the capability of learning embeddings for strict cold start users/items. Our AGNN can produce the preference embedding for a strict cold user/item by learning on the distribution of attributes with an extended variational auto-encoder (eVAE) structure. Moreover, we propose a new graph neural network variant, i.e., gated-GNN, to effectively aggregate various attributes of different modalities in a neighborhood. Empirical results on three real-world datasets demonstrate that our model yields significant improvements for strict cold start recommendations and outperforms or matches the state-of-the-art performance in the warm start scenario. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
221. X-Secure T-Private Federated Submodel Learning With Elastic Dropout Resilience.
- Author
-
Jia, Zhuqing and Jafar, Syed Ali
- Subjects
-
DISTRIBUTED databases, INFORMATION retrieval, MATRIX multiplications
- Abstract
Motivated by recent interest in federated submodel learning, this work explores the fundamental problem of privately reading from and writing to a database comprising $K$ files (submodels) that are stored across $N$ distributed servers according to an $X$-secure threshold secret sharing scheme. One after another, various users wish to retrieve their desired file, locally process the information, and then update the file in the distributed database while keeping the identity of their desired file private from any set of up to $T$ colluding servers. The availability of servers changes over time, so elastic dropout resilience is required. The main contribution of this work is an adaptive scheme, called ACSA-RW, that takes advantage of all currently available servers to reduce its communication costs, fully updates the database after each write operation even though the database is only partially accessible due to server dropouts, and ensures a memoryless operation of the network in the sense that the storage structure is preserved and future users may remain oblivious of the past history of server dropouts. The ACSA-RW construction builds upon cross-subspace alignment (CSA) codes, which were originally introduced for $X$-secure $T$-private information retrieval and have been shown to be natural solutions for secure distributed batch matrix multiplication problems. ACSA-RW achieves the desired private read and write functionality with elastic dropout resilience, matches the best results for private-read from the PIR literature, improves significantly upon available baselines for private-write, reveals a striking symmetry between upload and download costs, and exploits storage redundancy to accommodate arbitrary read and write dropout servers up to certain threshold values. It also answers in the affirmative an open question by Kairouz et al. for the case of partially colluding servers (i.e., tolerating collusion up to a threshold) by exploiting synergistic gains from the joint design of private read and write operations. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
222. Transfer Learning for Wireless Networks: A Comprehensive Survey.
- Author
-
Nguyen, Cong T., Van Huynh, Nguyen, Chu, Nam H., Saputra, Yuris Mulya, Hoang, Dinh Thai, Nguyen, Diep N., Pham, Quoc-Viet, Niyato, Dusit, Dutkiewicz, Eryk, and Hwang, Won-Joo
- Subjects
HUMAN activity recognition ,SPECTRUM allocation ,NEXT generation networks ,MACHINE learning ,WIRELESS sensor networks - Abstract
With outstanding features, machine learning (ML) has become the backbone of numerous applications in wireless networks. However, the conventional ML approaches face many challenges in practical implementation, such as the lack of labeled data, the constantly changing wireless environments, the long training process, and the limited capacity of wireless devices. These challenges, if not addressed, can impede the effectiveness and applicability of ML in wireless networks. To address these problems, transfer learning (TL) has recently emerged to be a promising solution. The core idea of TL is to leverage and synthesize distilled knowledge from similar tasks and valuable experiences accumulated from the past to facilitate the learning of new problems. By doing so, TL techniques can reduce the dependence on labeled data, improve the learning speed, and enhance the ML methods’ robustness to different wireless environments. This article aims to provide a comprehensive survey on the applications of TL in wireless networks. Particularly, we first provide an overview of TL, including formal definitions, classification, and various types of TL techniques. We then discuss diverse TL approaches proposed to address emerging issues in wireless networks. The issues include spectrum management, signal recognition, security, caching, localization, and human activity recognition, which are all important to next-generation networks, such as 5G and beyond. Finally, we highlight important challenges, open issues, and future research directions of TL in future wireless networks. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
223. Distributed Semisupervised Fuzzy Regression With Interpolation Consistency Regularization.
- Author
-
Shi, Ye, Zhang, Leijie, Cao, Zehong, Tanveer, Mohammad, and Lin, Chin-Teng
- Subjects
INTERPOLATION ,FUZZY neural networks ,BEES algorithm - Abstract
Recently, distributed semisupervised learning (DSSL) algorithms have shown their effectiveness in leveraging unlabeled samples over interconnected networks, where agents cannot share their original data with each other and can only communicate nonsensitive information with their neighbors. However, existing DSSL algorithms cannot cope with data uncertainties and may suffer from high computation and communication overhead. To handle these issues, we propose a distributed semisupervised fuzzy regression (DSFR) model with fuzzy if-then rules and interpolation consistency regularization (ICR). ICR, which was proposed recently for semisupervised problems, can force decision boundaries to pass through sparse data areas, thus increasing model robustness. However, its application in distributed scenarios has not been considered yet. In this work, we propose a distributed fuzzy C-means (DFCM) method and a distributed interpolation consistency regularization (DICR), built on the well-known alternating direction method of multipliers, to locate the parameters in the antecedent and consequent components of DSFR, respectively. Notably, the DSFR model converges very fast since it does not involve a back-propagation procedure, and it is scalable to large-scale datasets thanks to the utilization of DFCM and DICR. Experimental results on both artificial and real-world datasets show that the proposed DSFR model achieves much better performance than the state-of-the-art DSSL algorithm in terms of both loss value and computational cost. Our code is available online. [ABSTRACT FROM AUTHOR]
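The ICR idea above has a simple generic form: for pairs of unlabeled points, the model's output at an interpolated input should match the interpolation of its outputs. The sketch below shows that penalty in numpy; it is a centralized, illustrative version only, and does not reproduce the paper's distributed DICR/ADMM scheme:

```python
import numpy as np

rng = np.random.default_rng(1)

def icr_penalty(f, X_unlabeled, n_pairs=256):
    """Interpolation consistency: f(mix(x_i, x_j)) should match mix(f(x_i), f(x_j))."""
    i = rng.integers(0, len(X_unlabeled), size=n_pairs)
    j = rng.integers(0, len(X_unlabeled), size=n_pairs)
    lam = rng.beta(0.5, 0.5, size=(n_pairs, 1))          # mixup coefficients
    x_mix = lam * X_unlabeled[i] + (1 - lam) * X_unlabeled[j]
    y_mix = lam.ravel() * f(X_unlabeled[i]) + (1 - lam.ravel()) * f(X_unlabeled[j])
    return np.mean((f(x_mix) - y_mix) ** 2)

X = rng.normal(size=(500, 4))
w = rng.normal(size=4)
linear = lambda X: X @ w                  # a linear model is interpolation-consistent
nonlinear = lambda X: np.tanh(X @ w) ** 3 # a curved model is penalized

pen_linear = icr_penalty(linear, X)       # ~0 up to float rounding
pen_nonlinear = icr_penalty(nonlinear, X)
```

Adding `icr_penalty` to a supervised loss pushes decision boundaries away from regions where unlabeled data is dense, which is the robustness effect the abstract describes.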
- Published
- 2022
- Full Text
- View/download PDF
224. Vehicular Trajectory Classification and Traffic Anomaly Detection in Videos Using a Hybrid CNN-VAE Architecture.
- Author
-
Kumaran Santhosh, Kelathodi, Dogra, Debi Prosad, Roy, Partha Pratim, and Mitra, Adway
- Abstract
Visual surveillance has become indispensable in the evolution of Intelligent Transportation Systems (ITS). Video object trajectories are key to many of the visual surveillance applications. Classifying varying length time series data such as video object trajectories using conventional neural networks, can be challenging. In this paper, we propose trajectory classification and anomaly detection using a hybrid Convolutional Neural Network (CNN) and Variational Autoencoder (VAE) architecture. First, we introduce a high level features for varying length object trajectories using color gradient representation. In the next stage, a semi-supervised way to annotate moving object trajectories extracted using Temporally Incremental Gravitational Model (TIGM) is used for class labeling. For training, anomalous trajectories are identified using t-Distributed Stochastic Neighbor Embedding (t-SNE). Finally, a hybrid CNN-VAE architecture has been proposed for trajectory classification and anomaly detection. The results obtained using publicly available surveillance video datasets reveal that the proposed method can successfully identify traffic anomalies such as violations in lane driving, sudden speed variations, abrupt termination of vehicle during movement, and vehicles moving in wrong directions. The accuracy of trajectory classification improves by a margin of 1-6% against popular neural networks-based classifiers across various datasets using the proposed high-level features. The gradient representation also improves the anomaly detection accuracy significantly (30-35%). Code and dataset can be found at https://github.com/santhoshkelathodi/CNN-VAE. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
225. Searching Towards Class-Aware Generators for Conditional Generative Adversarial Networks.
- Author
-
Zhou, Peng, Xie, Lingxi, Ni, Bingbing, and Tian, Qi
- Subjects
GENERATIVE adversarial networks ,MARKOV processes ,DATA reduction ,COMPUTER architecture ,MOVING average process - Abstract
Conditional generative adversarial networks (cGANs) are designed to generate images based on the provided conditions, e.g., class-level distributions, semantic label maps, etc. Existing methods have used the same generator architecture for all classes. This paper adopts neural architecture search (NAS) to find a class-aware architecture for each class. The search space contains regular and class-modulated convolutions, where the latter is designed to introduce class-specific information while avoiding the reduction of training data available to each class generator. The search algorithm follows a weight-sharing pipeline with mixed-architecture optimization, so that the search cost does not grow with the number of classes. To learn the sampling policy, a Markov decision process is embedded into the search algorithm, and a moving average is applied for better stability. Experiments show that class-aware generators have advantages over class-agnostic architectures. Moreover, we discover two intriguing phenomena that are inspirational for crafting cGANs by hand. [ABSTRACT FROM AUTHOR]
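The key property of a class-modulated convolution, as the abstract frames it, is that all classes share one kernel (so training data per class is not fragmented) while each class gets its own response via learned per-class parameters. The 1-D numpy sketch below is a hypothetical stand-in, not the paper's operator; `gamma`/`beta` are our invented per-class scale/shift:

```python
import numpy as np

rng = np.random.default_rng(2)

def class_modulated_conv1d(x, kernel, gamma, beta, class_id):
    """One shared kernel, reshaped per class by a learned scale and shift."""
    w = gamma[class_id] * kernel + beta[class_id]
    return np.convolve(x, w, mode="valid")

n_classes, ksize = 3, 5
kernel = rng.normal(size=ksize)                       # shared across all classes
gamma = rng.normal(1.0, 0.1, size=(n_classes, ksize)) # per-class modulation
beta = rng.normal(0.0, 0.1, size=(n_classes, ksize))

x = rng.normal(size=32)
out0 = class_modulated_conv1d(x, kernel, gamma, beta, 0)
out1 = class_modulated_conv1d(x, kernel, gamma, beta, 1)
```

Only the small modulation tensors grow with the number of classes; the expensive kernel weights are trained on every class's data.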
- Published
- 2022
- Full Text
- View/download PDF
226. Spelling-Aware Word-Based End-to-End ASR.
- Author
-
Egorova, Ekaterina, Vydana, Hari Krishna, Burget, Lukas, and Cernocky, Jan Honza
- Subjects
AUTOMATIC speech recognition ,LIPREADING ,RECURRENT neural networks ,ERROR rates - Abstract
We propose a new end-to-end architecture for automatic speech recognition that expands the “listen, attend and spell” (LAS) paradigm. While the main word-predicting network is trained to predict words, a secondary speller network is optimized to predict word spellings from inner representations of the main network (e.g., word embeddings or context vectors from the attention module). We show that this joint training improves the word error rate of a word-based system and enables solving additional tasks, such as out-of-vocabulary word detection and recovery. The tests are conducted on the LibriSpeech dataset, which consists of 1000 h of read speech. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
227. Variational Few-Shot Learning for Microservice-Oriented Intrusion Detection in Distributed Industrial IoT.
- Author
-
Liang, Wei, Hu, Yiyong, Zhou, Xiaokang, Pan, Yi, and I-Kai Wang, Kevin
- Abstract
Along with the popularity of the Internet of Things (IoT) techniques with several computational paradigms, such as cloud and edge computing, microservice has been viewed as a promising architecture in large-scale application design and deployment. Due to the limited computing ability of edge devices in distributed IoT, only a small scale of data can be used for model training. In addition, most machine-learning-based intrusion detection methods are insufficient when dealing with imbalanced datasets under limited computing resources. In this article, we propose an optimized intra/inter-class-structure-based variational few-shot learning (OICS-VFSL) model to overcome a specific out-of-distribution problem in imbalanced learning, and to improve the microservice-oriented intrusion detection in distributed IoT systems. Following a newly designed VFSL framework, an intra/inter-class optimization scheme is developed using reconstructed feature embeddings, in which the intra-class distance is optimized based on the approximation during a variational Bayesian process, while the inter-class distance is optimized based on the maximization of similarities during a feature concatenation process. An intelligent intrusion detection algorithm is then introduced to improve the multiclass classification via a nonlinear neural network. Evaluation experiments are conducted using two public datasets to demonstrate the effectiveness of our proposed model, especially in detecting novel attacks with extremely imbalanced data, compared with four baseline methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
228. Semisupervised Feature Selection With Sparse Discriminative Least Squares Regression.
- Author
-
Wang, Chen, Chen, Xiaojun, Yuan, Guowen, Nie, Feiping, and Yang, Min
- Abstract
In the era of big data, selecting informative features has become an urgent need. However, due to the huge cost of obtaining enough labeled data for supervised tasks, researchers have turned their attention to semisupervised learning, which exploits both labeled and unlabeled data. In this article, we propose a sparse discriminative semisupervised feature selection (SDSSFS) method. In this method, the $\epsilon$-dragging technique for the supervised task is extended to the semisupervised task, which is used to enlarge the distance between classes in order to obtain a discriminative solution. The flexible $\ell_{2,p}$ norm is implicitly used as regularization in the new model; therefore, we can obtain a sparser solution by setting a smaller $p$. An iterative method is proposed to simultaneously learn the regression coefficients and the $\epsilon$-dragging matrix and to predict the unknown class labels. Experimental results on ten real-world datasets show the superiority of our proposed method. [ABSTRACT FROM AUTHOR]
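The claim that a smaller $p$ in the $\ell_{2,p}$ norm yields a sparser (row-wise) solution can be checked numerically. A small sketch, using our own toy matrices: two coefficient matrices with equal Frobenius norm, one dense and one row-sparse, compared under the row-wise regularizer $\sum_i \lVert W_i \rVert_2^p$:

```python
import numpy as np

def l2p_norm(W, p):
    """Row-wise l_{2,p} regularizer: sum_i ||W_i||_2 ** p."""
    row_norms = np.linalg.norm(W, axis=1)
    return np.sum(row_norms ** p)

# Equal total "energy": dense spreads it over all rows,
# sparse concentrates it in one row (the feature-selection-friendly shape).
dense = np.full((4, 3), 0.5)
sparse = np.zeros((4, 3))
sparse[0] = 1.0
sparse *= np.linalg.norm(dense) / np.linalg.norm(sparse)  # match Frobenius norms

r_sparse_p05 = l2p_norm(sparse, 0.5)   # p < 1 favours the row-sparse matrix
r_dense_p05 = l2p_norm(dense, 0.5)
```

For $p = 2$ the two matrices are penalized identically, while for $p < 1$ the row-sparse matrix scores strictly lower, which is why decreasing $p$ encourages selecting fewer features.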
- Published
- 2022
- Full Text
- View/download PDF
229. A New Belief-Based Bidirectional Transfer Classification Method.
- Author
-
Liu, Zhun-Ga, Qiu, Guang-Hui, Wang, Shu-Yue, Li, Tian-Cheng, and Pan, Quan
- Abstract
In pattern classification, we may have only a few labeled data points in the target domain, while a number of labeled samples are available in another related domain (called the source domain). Transfer learning can solve such classification problems via knowledge transfer from the source to the target domain. The source and target domains can be represented by heterogeneous features, and there may exist uncertainty in the domain transformation; such uncertainty degrades classification. Effective management of this uncertainty is important for improving classification accuracy. We therefore propose a new belief-based bidirectional transfer classification (BDTC) method. In BDTC, an intraclass transformation matrix is first estimated for mapping patterns from the source to the target domain; this matrix can be learned from labeled patterns of the same class represented in the heterogeneous domains (features). The labeled patterns in the source domain are transferred to the target domain by the corresponding transformation matrix. Then, we learn a classifier using all the labeled patterns in the target domain to classify the query objects. In order to take full advantage of the complementary knowledge of the different domains, we also transfer the query patterns from the target to the source domain using the K-NN technique and perform the classification task in the source domain. Thus, two classification results are obtained for each query pattern, one in each domain, but they may have different reliabilities/weights. A weighted combination rule is developed to combine the two results based on belief functions theory, which is well suited to dealing with uncertain information. This combination strategy efficiently reduces the uncertainty of transfer classification. Experiments on several domain adaptation benchmarks show that our method can effectively improve classification accuracy compared with other related methods. [ABSTRACT FROM AUTHOR]
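The first step described above, estimating a transformation matrix from paired same-class samples in heterogeneous feature spaces, can be sketched as an ordinary least-squares problem. This is a simplified stand-in for BDTC's intraclass transformation step, with our own toy data and dimensions:

```python
import numpy as np

rng = np.random.default_rng(3)

def learn_transfer_map(Xs, Xt):
    """Least-squares matrix A mapping source features onto target features,
    estimated from same-class sample pairs."""
    A, *_ = np.linalg.lstsq(Xs, Xt, rcond=None)
    return A

# Heterogeneous domains: target features are an unknown linear view of source ones.
n, ds, dt = 200, 6, 4
M = rng.normal(size=(ds, dt))               # hidden ground-truth relation
Xs = rng.normal(size=(n, ds))
Xt = Xs @ M + 0.01 * rng.normal(size=(n, dt))

A = learn_transfer_map(Xs, Xt)
Xs_mapped = Xs @ A                          # source patterns expressed in target space
err = np.mean((Xs_mapped - Xt) ** 2)
```

Once `A` is learned, the source domain's labeled patterns can be projected into the target space and pooled with the target labels, as the abstract describes.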
- Published
- 2022
- Full Text
- View/download PDF
230. From Publication to Production: Interactive Deployment of Forklift Activity Recognition
- Author
-
Chen, Kunru, Klang, Jonas, and Zeitler, Erik
- Abstract
As the rise of the Internet of Things has made a vast amount of sensory data readily available, research that develops data-driven methods for industrial applications has become increasingly popular. Yet, there are not many reports presenting the deployment of these methods. One can always expect “there is a gap between theory and reality,” but then, what is the gap? How big is it, and how should it be handled? This paper demonstrates the deployment of machine learning (ML) models on a real forklift truck and the use of an interactive method that essentially bridges the gap between laboratory and realistic settings of the forklift application. The interactive method suggests a gradual adaptation to various use cases in practice: test the offline method in an environment slightly different from the one the training data represents, and adjust the method according to these new usages. Additionally, the interactive model deployment allows modification of the offline method in the telematics unit of the forklift truck, which enables immediate validation of the method adjustment. The results show that the proposed method can effectively revise erroneous predictions from the ML method and provide quick adaptation to different forklift operations. It also gives a positive signal for further large-scale deployment of offline ML methods and shows their potential to create value and provide optimization in the industry. © 2024 IEEE.
- Published
- 2024
- Full Text
- View/download PDF
231. Synthetic Data: Methods, Use Cases, and Risks
- Author
-
De Cristofaro, Emiliano
- Published
- 2024
232. Generation of Synthetic Clinical Trial Subject Data Using Generative Adversarial Networks
- Author
-
Lindell, Linus
- Abstract
The development of new solutions incorporating artificial intelligence (AI) within the medical field is an area of great interest. However, access to comprehensive and diverse datasets is restricted due to the sensitive nature of the data. A potential solution is to generate synthetic datasets based on real medical data. Synthetic data could protect the integrity of the subjects while preserving the inherent information necessary for training AI models, and it can be generated in greater quantity than would otherwise be available. This thesis project aims to generate reliable clinical trial subject data using a generative adversarial network (GAN). The main dataset used is a mock clinical trial dataset consisting of multiple subject visits; an additional dataset containing authentic medical data is also used for better insight into the model’s ability to learn underlying relationships. The thesis also investigates training strategies for simulating the temporal dimension and the missing values in the data. The GAN model used is an altered version of the Conditional Tabular GAN (CTGAN) made to be compatible with the preprocessed clinical trial mock data, and multiple model architectures and numbers of training epochs are examined. The results show great potential for GAN models on clinical trial datasets, especially for real-life data. One model, trained on the authentic dataset, generates near-perfect synthetic data with respect to column distributions and correlation between columns. The results also show that classification models trained on synthetic data and tested on real data have the potential to match the performance of classification models trained on real data. While the synthetic data replicates the missing values, no definitive conclusion can be drawn regarding the temporal characteristics due to the sparsity of the mock dataset and the lack of real correlations in it. Although the results are promising, further experiments on authentic datasets with less sparsity are needed.
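The evaluation the thesis mentions, training a classifier on synthetic data and testing it on real data (often called "train on synthetic, test on real"), can be sketched compactly. No GAN is trained here: synthetic samples are simulated by drawing from the same toy distribution as the "real" data, and the nearest-centroid classifier is our own illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(4)

def nearest_centroid_fit(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def nearest_centroid_predict(centroids, X):
    classes = np.array(list(centroids))
    dists = np.stack([np.linalg.norm(X - centroids[c], axis=1) for c in classes])
    return classes[np.argmin(dists, axis=0)]

def sample(n):
    """Toy two-class 'patient' data; the synthetic set stands in for GAN output."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 5)) + 2.0 * y[:, None]
    return X, y

X_real, y_real = sample(400)
X_syn, y_syn = sample(400)            # pretend these came from the generator

model = nearest_centroid_fit(X_syn, y_syn)          # train on synthetic...
acc_tstr = np.mean(nearest_centroid_predict(model, X_real) == y_real)  # ...test on real
```

If the generator has captured the real distribution, `acc_tstr` approaches the accuracy of a model trained directly on real data, which is the comparison the thesis reports.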
- Published
- 2024
233. Der Effizienz- und Intelligenzbegriff in der Lexikographie und künstlichen Intelligenz: kann ChatGPT die lexikographische Textsorte nachbilden?
- Author
-
Universidade de Santiago de Compostela. Departamento de Filoloxía Inglesa e Alemá, Arias Arias, Iván, Domínguez Vázquez, María José, and Valcárcel Riveiro, Carlos
- Abstract
Through pilot experiments for the language pair German–Galician, this paper examines the concepts of efficiency and intelligence in lexicography and artificial intelligence (AI). The experiments aim to gain empirically and statistically grounded insights into the lexicographical text type “dictionary article” in the responses of ChatGPT-3.5, as well as into the lexicographical data on which this chatbot was trained. Both quantitative and qualitative methods are used for this purpose. The analysis is based on the evaluation of the outputs of several sessions with the same prompt in ChatGPT-3.5. On the one hand, the algorithmic performance of intelligent systems is evaluated in comparison with data from lexicographical works; on the other hand, the ChatGPT data supplied is analysed using specific text passages of the aforementioned lexicographical text type. The results of this study help not only to evaluate the efficiency of this chatbot in creating dictionary articles, but also to delve deeper into the concept of intelligence, the thought processes, and the actions to be carried out in both disciplines.
- Published
- 2024
234. Accessible and Ethical Data Annotation with the Application of Gamification
- Author
-
Gurav, Vedant, Parkar, Muhanned, Kharwar, Parth, Batra, Usha, editor, Roy, Nihar Ranjan, editor, and Panda, Brajendra, editor
- Published
- 2020
- Full Text
- View/download PDF
235. Group-Specific Training Data
- Author
-
Busath, Ben, Morgan, Jalen, Price, Joseph, Koprinska, Irena, editor, Kamp, Michael, editor, Appice, Annalisa, editor, Loglisci, Corrado, editor, Antonie, Luiza, editor, Zimmermann, Albrecht, editor, Guidotti, Riccardo, editor, Özgöbek, Özlem, editor, Ribeiro, Rita P., editor, Gavaldà, Ricard, editor, Gama, João, editor, Adilova, Linara, editor, Krishnamurthy, Yamuna, editor, Ferreira, Pedro M., editor, Malerba, Donato, editor, Medeiros, Ibéria, editor, Ceci, Michelangelo, editor, Manco, Giuseppe, editor, Masciari, Elio, editor, Ras, Zbigniew W., editor, Christen, Peter, editor, Ntoutsi, Eirini, editor, Schubert, Erich, editor, Zimek, Arthur, editor, Monreale, Anna, editor, Biecek, Przemyslaw, editor, Rinzivillo, Salvatore, editor, Kille, Benjamin, editor, Lommatzsch, Andreas, editor, and Gulla, Jon Atle, editor
- Published
- 2020
- Full Text
- View/download PDF
236. New Application of the Requirements Elicitation Process for the Construction of Intelligent System-Based Predictive Models
- Author
-
Vegega, Cinthia, Pytel, Pablo, Pollo-Cattaneo, María Florencia, Pesado, Patricia, editor, and Arroyo, Marcelo, editor
- Published
- 2020
- Full Text
- View/download PDF
237. Airport and Ship Target Detection on Satellite Images Based on YOLO V3 Network
- Author
-
Ying, Ren, Wang, Liheng, editor, Wu, Yirong, editor, and Gong, Jianya, editor
- Published
- 2020
- Full Text
- View/download PDF
238. Training Data on Recursive Parallel Processors for Deep Learning
- Author
-
Raheja, Shipra, Chopra, Rajiv, Jain, Vanita, editor, Chaudhary, Gopal, editor, Taplamacioglu, M. Cengiz, editor, and Agarwal, M. S., editor
- Published
- 2020
- Full Text
- View/download PDF
239. Data Mining Based on Condensed Hierarchical Clustering Algorithm
- Author
-
Bi, Zengjun, Han, Yaoquan, Huang, Caiquan, Wang, Min, Patnaik, Srikanta, editor, Wang, John, editor, Yu, Zhengtao, editor, and Dey, Nilanjan, editor
- Published
- 2020
- Full Text
- View/download PDF
240. I Don’t Have That Much Data! Reusing User Behavior Models for Websites from Different Domains
- Author
-
Bakaev, Maxim, Speicher, Maximilian, Heil, Sebastian, Gaedke, Martin, Bielikova, Maria, editor, Mikkonen, Tommi, editor, and Pautasso, Cesare, editor
- Published
- 2020
- Full Text
- View/download PDF
241. Map of Land Cover Agreement: Ensembling Existing Datasets for Large-Scale Training Data Provision
- Author
-
Gorica Bratic, Daniele Oxoli, and Maria Antonia Brovelli
- Subjects
training data ,high-resolution land cover ,global land cover ,machine learning ,deep learning ,satellite image classification ,Science - Abstract
Land cover information plays a critical role in supporting sustainable development and informed decision-making. Recent advancements in satellite data accessibility, computing power, and satellite technologies have boosted large-extent high-resolution land cover mapping. However, retrieving a sufficient amount of reliable training data for the production of such land cover maps is typically a demanding task, especially using modern deep learning classification techniques that require larger training sample sizes compared to traditional machine learning methods. In view of the above, this study developed a new benchmark dataset called the Map of Land Cover Agreement (MOLCA). MOLCA was created by integrating multiple existing high-resolution land cover datasets through a consensus-based approach. Covering Sub-Saharan Africa, the Amazon, and Siberia, this dataset encompasses approximately 117 billion 10m pixels across three macro-regions. The MOLCA legend aligns with most of the global high-resolution datasets and consists of nine distinct land cover classes. Noteworthy advantages of MOLCA include a higher number of pixels as well as coverage for typically underrepresented regions in terms of training data availability. With an estimated overall accuracy of 96%, MOLCA holds great potential as a valuable resource for the production of future high-resolution land cover maps.
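The consensus-based integration MOLCA relies on can be sketched as a pixelwise vote across input land cover maps, keeping a label only where enough maps agree. The sketch below is a toy stand-in: the agreement threshold, the `255` nodata code, and the tiny 2×2 maps are our own choices, not MOLCA's actual processing chain:

```python
import numpy as np

def consensus_map(maps, min_agree):
    """Pixelwise majority label across land cover maps; pixels where fewer
    than `min_agree` maps agree are masked with 255 (no consensus)."""
    stack = np.stack(maps)                                # (n_maps, H, W)
    n_classes = int(stack.max()) + 1
    counts = np.stack([(stack == c).sum(axis=0) for c in range(n_classes)])
    best = counts.argmax(axis=0).astype(np.uint8)         # most-voted class
    agree = counts.max(axis=0)                            # size of that majority
    best[agree < min_agree] = 255
    return best

# Three toy 2x2 land cover maps with classes 0, 1, 2.
a = np.array([[0, 1], [2, 2]], dtype=np.uint8)
b = np.array([[0, 1], [2, 0]], dtype=np.uint8)
c = np.array([[0, 2], [1, 1]], dtype=np.uint8)
label = consensus_map([a, b, c], min_agree=2)
```

Only the pixels where the sources agree survive as training labels, which is how an agreement map trades coverage for label reliability.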
- Published
- 2023
- Full Text
- View/download PDF
242. Evaluating the Effect of Training Data Size and Composition on the Accuracy of Smallholder Irrigated Agriculture Mapping in Mozambique Using Remote Sensing and Machine Learning Algorithms
- Author
-
Timon Weitkamp and Poolad Karimi
- Subjects
irrigated agriculture ,training data ,sub-Saharan Africa ,machine-learning algorithms ,class imbalance ,Science - Abstract
Mapping smallholder irrigated agriculture in sub-Saharan Africa using remote sensing techniques is challenging due to its small and scattered areas and heterogeneous cropping practices. A study was conducted to examine the impact of sample size and composition on the accuracy of classifying irrigated agriculture in Mozambique’s Manica and Gaza provinces using three algorithms: random forest (RF), support vector machine (SVM), and artificial neural network (ANN). Four scenarios were considered, and the results showed that smaller datasets can achieve high and sufficient accuracies, regardless of their composition. However, the user and producer accuracies of irrigated agriculture do increase when the algorithms are trained with larger datasets. The study also found that the composition of the training data is important, with too few or too many samples of the “irrigated agriculture” class decreasing overall accuracy. The algorithms’ robustness depends on the training data’s composition, with RF and SVM showing less decrease and spread in accuracies than ANN. The study concludes that the training data size and composition are more important for classification than the algorithms used. RF and SVM are more suitable for the task as they are more robust, or less sensitive to outliers, than the ANN. Overall, the study provides valuable insights into mapping smallholder irrigated agriculture in sub-Saharan Africa using remote sensing techniques.
- Published
- 2023
- Full Text
- View/download PDF
243. Training Data Selection for Record Linkage Classification
- Author
-
Zaturrawiah Ali Omar, Zamira Hasanah Zamzuri, Noratiqah Mohd Ariff, and Mohd Aftar Abu Bakar
- Subjects
record linkage ,unsupervised random forest ,similarity measure ,training data ,Mathematics ,QA1-939 - Abstract
This paper presents a new two-step approach for record linkage, focusing on the creation of high-quality training data in the first step. The approach employs the unsupervised random forest model as a similarity measure to produce a similarity score vector for record matching. Three constructions were proposed to select non-match pairs for the training data, with both balanced (symmetry) and imbalanced (asymmetry) distributions tested. The top and imbalanced construction was found to be the most effective in producing training data with 100% correct labels. Random forest and support vector machine classification algorithms were compared, and random forest with the top and imbalanced construction produced an F1-score comparable to probabilistic record linkage using the expectation maximisation algorithm and EpiLink. On average, the proposed approach using random forests and the top and imbalanced construction improved the F1-score by 1% and recall by 6.45% compared to existing record linkage methods. By emphasising the creation of high-quality training data, this new approach has the potential to improve the accuracy and efficiency of record linkage for a wide range of applications.
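The "top and imbalanced" construction described above, keeping all known matches and the highest-scoring non-match pairs as hard negatives, can be sketched on toy records. Note the hedges: the paper scores pairs with an unsupervised random forest, whereas this sketch substitutes a simple string similarity (`difflib.SequenceMatcher`), and the records, match set, and 2:1 imbalance ratio are our own:

```python
import numpy as np
from difflib import SequenceMatcher

def sim(a, b):
    """Stand-in similarity score (the paper uses unsupervised random forests)."""
    return SequenceMatcher(None, a, b).ratio()

records_a = ["jon smith", "ann lee", "bob stone", "mary jones"]
records_b = ["john smith", "anne lee", "rob stone", "maria jones", "zoe park"]
true_matches = {(0, 0), (1, 1), (2, 2), (3, 3)}   # known links for the sketch

# Score every candidate pair across the two files.
pairs = [(i, j, sim(a, b)) for i, a in enumerate(records_a)
                           for j, b in enumerate(records_b)]

# Top construction: keep the highest-scoring NON-matches as hard negatives,
# deliberately imbalanced toward the negative class.
non_matches = sorted((p for p in pairs if (p[0], p[1]) not in true_matches),
                     key=lambda p: -p[2])
k = 2 * len(true_matches)                         # 2:1 negatives-to-positives
train = [(s, 1) for i, j, s in pairs if (i, j) in true_matches]
train += [(s, 0) for i, j, s in non_matches[:k]]
```

A classifier (random forest or SVM in the paper) is then trained on these labeled similarity scores; selecting hard, high-similarity negatives is what makes the resulting decision boundary sharp.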
- Published
- 2023
- Full Text
- View/download PDF
244. Stochastic Mirror Descent on Overparameterized Nonlinear Models.
- Author
-
Azizan, Navid, Lale, Sahin, and Hassibi, Babak
- Subjects
- *
MACHINE learning , *MIRRORS , *LEARNING problems , *DEEP learning - Abstract
Most modern learning problems are highly overparameterized, i.e., have many more model parameters than the number of training data points. As a result, the training loss may have infinitely many global minima (parameter vectors that perfectly “interpolate” the training data). It is therefore imperative to understand which interpolating solutions we converge to, how they depend on the initialization and learning algorithm, and whether they yield different test errors. In this article, we study these questions for the family of stochastic mirror descent (SMD) algorithms, of which stochastic gradient descent (SGD) is a special case. Recently, it has been shown that for overparameterized linear models, SMD converges to the closest global minimum to the initialization point, where closeness is in terms of the Bregman divergence corresponding to the potential function of the mirror descent. With appropriate initialization, this yields convergence to the minimum-potential interpolating solution, a phenomenon referred to as implicit regularization. On the theory side, we show that for sufficiently overparameterized nonlinear models, SMD with a (small enough) fixed step size converges to a global minimum that is “very close” (in Bregman divergence) to the minimum-potential interpolating solution, thus attaining approximate implicit regularization. On the empirical side, our experiments on the MNIST and CIFAR-10 datasets consistently confirm that the above phenomenon occurs in practical scenarios. They further indicate a clear difference in the generalization performances of different SMD algorithms: experiments on the CIFAR-10 dataset with different regularizers, $\ell _{1}$ to encourage sparsity, $\ell _{2}$ (SGD) to encourage small Euclidean norm, and $\ell _{\infty }$ to discourage large components, surprisingly show that the $\ell _{\infty }$ norm consistently yields better generalization performance than SGD, which in turn generalizes better than the $\ell _{1}$ norm. [ABSTRACT FROM AUTHOR]
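The mirror descent update behind SMD can be written in two lines: step in the dual (mirror) coordinates $z = \nabla\psi(w)$, then map back. For the $q$-norm potential $\psi(w) = \tfrac{1}{q}\lVert w\rVert_q^q$, the mirror map is $\nabla\psi(w)_i = \mathrm{sign}(w_i)\lvert w_i\rvert^{q-1}$, and $q = 2$ recovers plain gradient descent. The numpy sketch below runs full-batch updates on a toy overparameterized linear regression (our own setup, not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(5)

def smd(X, y, q=2.0, lr=0.05, steps=3000):
    """Mirror descent with potential psi(w) = ||w||_q^q / q.
    Full-batch gradients for simplicity; q = 2 is exactly (S)GD."""
    w = 0.1 * rng.normal(size=X.shape[1])
    z = np.sign(w) * np.abs(w) ** (q - 1.0)          # dual (mirror) coordinates
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)            # least-squares gradient
        z -= lr * grad                               # descend in the dual space
        w = np.sign(z) * np.abs(z) ** (1.0 / (q - 1.0))  # map back to primal
    return w

# Overparameterized regime: 50 parameters, 20 data points, so the training
# loss has infinitely many global minima (interpolating solutions).
X = rng.normal(size=(20, 50))
y = rng.normal(size=20)

w2 = smd(X, y, q=2.0)                          # the SGD special case
w3 = smd(X, y, q=3.0, lr=0.01, steps=20000)    # a different potential
loss2 = np.mean((X @ w2 - y) ** 2)
loss3 = np.mean((X @ w3 - y) ** 2)
```

Both runs drive the training loss toward zero but generally land on different interpolating solutions; which one is reached is exactly the implicit regularization the abstract analyzes.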
- Published
- 2022
- Full Text
- View/download PDF
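The SMD family described in the abstract is simple to sketch. The following is an illustrative NumPy implementation under the paper's linear overparameterized setting, using the q-norm potential psi(w) = (1/q) * sum |w_j|^q; it is a minimal sketch, not the authors' code, and the step sizes and problem dimensions are arbitrary choices for the toy problem.

```python
import numpy as np

def smd_linear(X, y, q=2.0, eta=1e-2, epochs=2000, seed=0):
    """Stochastic mirror descent on squared loss for a linear model,
    using the q-norm potential psi(w) = (1/q) * sum |w_j|^q.
    q = 2 makes the mirror map the identity, recovering plain SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    z = np.zeros(d)                        # mirror variable, z = grad-psi(w)
    w = np.zeros(d)
    for _ in range(epochs):
        i = rng.integers(n)
        grad = (X[i] @ w - y[i]) * X[i]    # per-sample squared-loss gradient
        z = z - eta * grad                 # descend in the dual (mirror) space
        w = np.sign(z) * np.abs(z) ** (1.0 / (q - 1))  # inverse mirror map
    return w

# Overparameterized toy problem: 5 equations, 20 unknowns,
# so infinitely many interpolating solutions exist.
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 20))
y = rng.standard_normal(5)

w_sgd = smd_linear(X, y, q=2.0, eta=1e-2, epochs=3000)  # SGD special case
w_q3 = smd_linear(X, y, q=3.0, eta=1e-4, epochs=8000)   # a non-Euclidean potential
res_sgd = np.max(np.abs(X @ w_sgd - y))
res_q3 = np.max(np.abs(X @ w_q3 - y))
```

Different choices of q drive the iterates toward different interpolating solutions, which is exactly the implicit-regularization effect the abstract studies.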
245. Network Pruning Using Adaptive Exemplar Filters.
- Author
-
Lin, Mingbao, Ji, Rongrong, Li, Shaojie, Wang, Yan, Wu, Yongjian, Huang, Feiyue, and Ye, Qixiang
- Subjects
- *
ADAPTIVE filters, *MESSAGE passing (Computer science), *COMMUNITIES, *COMPUTER architecture - Abstract
Popular network pruning algorithms reduce redundant information by optimizing hand-crafted models, which may lead to suboptimal performance and long filter-selection times. We introduce adaptive exemplar filters to simplify the algorithm design, resulting in an automatic and efficient pruning approach called EPruner. Inspired by the face recognition community, we run the message-passing algorithm Affinity Propagation on the weight matrices to obtain an adaptive number of exemplars, which then act as the preserved filters. EPruner breaks the dependence on training data in determining the “important” filters and admits a CPU implementation that completes in seconds, an order of magnitude faster than GPU-based state-of-the-art methods. Moreover, we show that the weights of the exemplars provide a better initialization for fine-tuning. On VGGNet-16, EPruner achieves a 76.34% FLOPs reduction by removing 88.80% of the parameters, with a 0.06% accuracy improvement on CIFAR-10. On ResNet-152, EPruner achieves a 65.12% FLOPs reduction by removing 64.18% of the parameters, with only a 0.71% top-5 accuracy loss on ILSVRC-2012. Our code is available at https://github.com/lmbxmu/EPruner. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
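The exemplar-selection step the abstract describes rests on standard Affinity Propagation message passing. Below is a minimal NumPy sketch of that algorithm applied to a synthetic stand-in for a layer's flattened filter weights; EPruner itself works on real pretrained weights and the cluster sizes, similarity choice, and damping here are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def affinity_propagation(S, damping=0.7, iters=200):
    """Minimal affinity propagation (message passing) on a similarity
    matrix S; returns indices of the points chosen as exemplars."""
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibility messages
    A = np.zeros((n, n))  # availability messages
    for _ in range(iters):
        # r(i,k) <- s(i,k) - max_{k' != k} (a(i,k') + s(i,k'))
        M = A + S
        idx = np.argmax(M, axis=1)
        first = M[np.arange(n), idx]
        M[np.arange(n), idx] = -np.inf
        second = M.max(axis=1)
        Rnew = S - first[:, None]
        Rnew[np.arange(n), idx] = S[np.arange(n), idx] - second
        R = damping * R + (1 - damping) * Rnew
        # a(i,k) <- min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        col = Rp.sum(axis=0)
        Anew = np.minimum(0.0, col[None, :] - Rp)
        np.fill_diagonal(Anew, col - Rp.diagonal())
        A = damping * A + (1 - damping) * Anew
    return np.flatnonzero(R.diagonal() + A.diagonal() > 0)

# Toy "convolution layer": 12 flattened filters drawn around 3 prototypes.
rng = np.random.default_rng(0)
protos = 3.0 * rng.standard_normal((3, 8))
W = np.vstack([p + 0.1 * rng.standard_normal((4, 8)) for p in protos])

# Similarity = negative squared distance; preference (diagonal) = median.
S = -np.sum((W[:, None, :] - W[None, :, :]) ** 2, axis=-1)
off = S[~np.eye(12, dtype=bool)]
np.fill_diagonal(S, np.median(off))

keep = affinity_propagation(S)  # indices of exemplar filters to preserve
```

Because the number of exemplars emerges from the messages rather than being fixed in advance, the pruning ratio adapts to the redundancy in the layer, which is the property the abstract emphasizes.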
246. Consistent Meta-Regularization for Better Meta-Knowledge in Few-Shot Learning.
- Author
-
Tian, Pinzhuo, Li, Wenbin, and Gao, Yang
- Subjects
- *
MACHINE learning, *DEEP learning, *TECHNOLOGICAL innovations - Abstract
Recently, meta-learning has provided a powerful paradigm for the few-shot learning problem. However, existing meta-learning approaches ignore the prior fact that, in each few-shot task, good meta-knowledge should alleviate the inconsistency between training and test data caused by the extremely limited data. Moreover, properly exploiting this prior leads to an efficient way to improve the meta-learning model. Under this circumstance, we consider the data inconsistency from the distribution perspective, which makes it convenient to bring in the prior fact, and propose a new consistent meta-regularization (Con-MetaReg) that helps the meta-learning model learn how to reduce the data-distribution discrepancy between training and test data. In this way, the ability of meta-knowledge to keep the training and test data consistent is enhanced, and the performance of the meta-learning model can be further improved. Extensive analyses and experiments demonstrate that our method indeed improves the performance of different meta-learning models on few-shot regression, classification, and fine-grained classification. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
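The abstract does not spell out the exact form of the Con-MetaReg penalty, but one common way to quantify the support/query distribution discrepancy it targets is the maximum mean discrepancy (MMD). The sketch below uses an RBF-kernel MMD between support-set and query-set embeddings purely as an illustrative stand-in; the embedding dimensions, shot counts, and kernel bandwidth are assumptions.

```python
import numpy as np

def mmd2(a, b, gamma=0.03):
    """Biased estimate of squared MMD with an RBF kernel; it shrinks
    toward zero when the two samples come from the same distribution."""
    def k(u, v):
        d2 = np.sum((u[:, None, :] - v[None, :, :]) ** 2, axis=-1)
        return np.exp(-gamma * d2)
    return k(a, a).mean() + k(b, b).mean() - 2.0 * k(a, b).mean()

rng = np.random.default_rng(0)
support = rng.standard_normal((5, 16))             # 5-shot support embeddings
query_same = rng.standard_normal((15, 16))         # query set, same distribution
query_shift = rng.standard_normal((15, 16)) + 2.0  # query set, shifted distribution

penalty_same = mmd2(support, query_same)
penalty_shift = mmd2(support, query_shift)
# A consistency-regularized task objective would read, schematically:
#   task_loss + lam * mmd2(features(support), features(query))
```

The penalty is small when support and query features match and grows under a distribution shift, so minimizing it alongside the task loss pushes the learned meta-knowledge toward train/test consistency within each task.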
247. Elastic Net Nonparallel Hyperplane Support Vector Machine and Its Geometrical Rationality.
- Author
-
Qi, Kai and Yang, Hu
- Subjects
- *
QUADRATIC programming, *PETRI nets, *SUPPORT vector machines, *HYPERPLANES - Abstract
The twin support vector machine (TWSVM), which constructs two nonparallel classifying hyperplanes, is widely applied in various fields. However, TWSVM solves two quadratic programming problems (QPPs) separately, so the final classifiers lack consistency and sufficient prediction accuracy. Moreover, because it considers only the 1-norm penalty for the slack variables, TWSVM is not well defined from a geometrical point of view. In this article, we propose a novel elastic net nonparallel hyperplane support vector machine (ENNHSVM), which adopts an elastic net penalty for the slack variables and constructs the two nonparallel separating hyperplanes simultaneously. We further discuss the properties of ENNHSVM theoretically and derive an upper bound on the violation tolerance to better characterize the relative violations of training samples within the same class. In particular, we design a safe screening rule for ENNHSVM to speed up the computation. Finally, we compare the performance of ENNHSVM on both synthetic and benchmark datasets with the Lagrangian SVM, the twin parametric-margin SVM, the elastic net SVM, the TWSVM, and the nonparallel hyperplane SVM. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
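The core idea, an elastic net penalty on the slack variables of a nonparallel-hyperplane objective, can be illustrated on one of the two hyperplanes. The sketch below is a TWSVM-style single-plane problem solved by plain gradient descent rather than the paper's QP formulation, and the data, weights c1/c2, and learning rate are all illustrative assumptions.

```python
import numpy as np

def fit_nonparallel_plane(XA, XB, c1=1.0, c2=1.0, lr=0.05, steps=3000):
    """Fit one nonparallel hyperplane (w, b): keep class-A points close
    to the plane while pushing class-B points at least a unit margin to
    the negative side. Each slack xi gets an elastic net penalty
    c1*xi + c2*xi**2, optimized with (sub)gradient descent."""
    w, b = np.zeros(XA.shape[1]), 0.0
    for _ in range(steps):
        fa = XA @ w + b                           # class-A projections (want ~0)
        xi = np.maximum(0.0, 1.0 + (XB @ w + b))  # margin violations of class B
        act = xi > 0
        coef = c1 + 2.0 * c2 * xi[act]            # elastic net subgradient on slacks
        gw = XA.T @ fa / len(XA) + coef @ XB[act] / len(XB)
        gb = fa.mean() + coef.sum() / len(XB)
        w, b = w - lr * gw, b - lr * gb
    return w, b

rng = np.random.default_rng(0)
XA = 0.3 * rng.standard_normal((20, 2))        # class A near the origin
XB = 0.3 * rng.standard_normal((20, 2)) + 3.0  # class B around (3, 3)
w, b = fit_nonparallel_plane(XA, XB)
```

The quadratic slack term keeps the penalty differentiable for active violations while the linear term preserves the margin pressure, which is the geometrical motivation the abstract gives for replacing the pure 1-norm penalty.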
248. A Sensitivity-Based Data Augmentation Framework for Model Predictive Control Policy Approximation.
- Author
-
Krishnamoorthy, Dinesh
- Subjects
- *
DATA augmentation, *SUPERVISED learning, *PREDICTION models, *APPROXIMATION algorithms, *DEEP learning - Abstract
Approximating a model predictive control (MPC) policy using expert-based supervised learning requires labeled training datasets sampled from the MPC policy. These are typically obtained by sampling the feasible state space and evaluating the control law offline, solving the numerical optimization problem for each sample. Although the resulting approximate policy is cheap to evaluate online, generating a large set of training samples to learn the MPC policy can be time-consuming and prohibitively expensive; this is one of the fundamental bottlenecks limiting the design and implementation of MPC policy approximation. This technical article addresses this challenge and proposes a novel sensitivity-based data augmentation scheme for direct policy approximation. The proposed approach exploits the parametric sensitivities to cheaply generate additional training samples in the neighborhood of the existing ones. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
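The augmentation mechanism is a first-order Taylor expansion of the optimal policy around each expensive offline solve. The toy below uses an unconstrained linear-quadratic stand-in where the optimal policy is linear, u*(x) = Kx, so the sensitivity du*/dx equals K exactly and the augmented labels are exact; in a real MPC setting the sensitivity comes from the NLP solution and the expansion is only locally valid. All names and numbers here are illustrative.

```python
import numpy as np

# Toy "MPC": unconstrained LQ problem, so the optimal policy is linear,
# u*(x) = K x, and the parametric sensitivity du*/dx equals K exactly.
K = np.array([[-0.8, -0.3],
              [ 0.1, -0.5]])

def solve_mpc(x):
    """Stand-in for the expensive offline optimal-control solve at state x."""
    return K @ x

def sensitivity(x):
    """Stand-in for the NLP sensitivity du*/dx at the solution (constant here)."""
    return K

rng = np.random.default_rng(0)
anchors = rng.uniform(-1, 1, size=(10, 2))       # states where we pay for a solve
dataset = []
for x in anchors:
    u, dudx = solve_mpc(x), sensitivity(x)
    dataset.append((x, u))                       # exact sample
    for _ in range(5):                           # 5 cheap augmented samples
        dx = rng.uniform(-0.1, 0.1, size=2)      # perturb within a trust region
        dataset.append((x + dx, u + dudx @ dx))  # first-order Taylor label

errors = [np.linalg.norm(u - solve_mpc(x)) for x, u in dataset]
```

Each offline solve yields six training pairs instead of one, which is the cost reduction the abstract targets; with constraints or nonlinearity, the perturbation radius would have to stay within the region where the sensitivity is valid.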
249. Unsupervised Deep Background Matting Using Deep Matte Prior.
- Author
-
Xu, Yong, Liu, Baoling, Quan, Yuhui, and Ji, Hui
- Subjects
- *
DEEP learning, *SUPERVISED learning, *CONVOLUTIONAL neural networks, *VIDEO editing - Abstract
Background matting is a recently developed image matting approach with applications to image and video editing. It refers to estimating both the alpha matte and the foreground from a pair of images captured with and without the foreground objects. Recent work has applied deep learning to background matting with very promising performance. However, existing deep models are supervised and require a large dataset with ground-truth alpha mattes for training. To avoid the cost of data collection and possible bias in the training data, this paper proposes a dataset-free, unsupervised deep-learning approach to background matting. Observing that the local smoothness of the alpha matte can be well characterized by an untrained network prior, called the deep matte prior, we model the foreground and the alpha matte using the priors encoded by two generative convolutional neural networks. To avoid possible overfitting during unsupervised learning, a two-stage learning scheme is developed that combines projection-based training and Bayesian post-refinement. An alpha-matte-driven initialization scheme is also developed for a performance boost. Even without any external training data, the proposed approach is competitive with recent supervised learning-based methods in our experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
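The problem rests on the compositing equation I = alpha*F + (1 - alpha)*B, with B known from the background-only capture. The sketch below additionally assumes the foreground F is known, which makes alpha recoverable in closed form per pixel; the paper's actual method estimates F and alpha jointly with two untrained generator networks, so this is only an illustration of the underlying model, on synthetic grayscale data.

```python
import numpy as np

def composite(F, B, alpha):
    """Image compositing equation: I = alpha*F + (1 - alpha)*B."""
    return alpha * F + (1.0 - alpha) * B

def recover_alpha(I, B, F, eps=1e-6):
    """Closed-form per-pixel alpha when both the background shot B and
    (for this toy only) the foreground F are known:
    alpha = (I - B) / (F - B)."""
    return np.clip((I - B) / (F - B + eps), 0.0, 1.0)

rng = np.random.default_rng(0)
B = rng.uniform(0.0, 0.4, size=(32, 32))  # background-only capture
F = rng.uniform(0.6, 1.0, size=(32, 32))  # foreground colors
alpha_true = np.zeros((32, 32))
alpha_true[8:24, 8:24] = 1.0              # opaque square
alpha_true[8:24, 8] = 0.5                 # semi-transparent edge column
I = composite(F, B, alpha_true)
alpha_est = recover_alpha(I, B, F)
```

With F unknown, each pixel gives one equation in two unknowns, which is why priors, here the deep matte prior, are needed to make the estimation well-posed.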
250. A Continual Learning Survey: Defying Forgetting in Classification Tasks.
- Author
-
De Lange, Matthias, Aljundi, Rahaf, Masana, Marc, Parisot, Sarah, Jia, Xu, Leonardis, Ales, Slabaugh, Gregory, and Tuytelaars, Tinne
- Subjects
- *
TASKS, *ARTIFICIAL neural networks - Abstract
Artificial neural networks thrive at solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, and attempts to extend this knowledge without targeting the original task result in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task-incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions are: (1) a taxonomy and extensive overview of the state of the art; (2) a novel framework for continually determining the stability-plasticity trade-off of the continual learner; (3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods; and (4) baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks: Tiny ImageNet, the large-scale unbalanced iNaturalist, and a sequence of recognition datasets. We study the influence of model capacity, weight decay, dropout regularization, and the order in which tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
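Among the method families such surveys compare, rehearsal (replay) methods keep a small memory of past-task examples and mix them into each new-task mini-batch. A minimal sketch of one standard building block, a reservoir-sampled replay buffer, follows; the capacity and task sizes are arbitrary toy choices, not settings from the survey.

```python
import random

class ReservoirReplayBuffer:
    """Fixed-size replay memory filled by reservoir sampling, so every
    example seen so far has an equal chance of being stored -- a common
    building block of rehearsal-based continual learning."""
    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            j = self.rng.randrange(self.seen)   # classic reservoir step
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, k):
        """Draw replayed examples to mix into a new-task mini-batch."""
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))

buf = ReservoirReplayBuffer(capacity=50)
for task in range(3):            # three sequential "tasks"
    for i in range(1000):
        buf.add((task, i))       # stream examples task by task
tasks_stored = {t for t, _ in buf.buffer}
```

Because the reservoir keeps a uniform sample over everything seen, old tasks stay represented in the memory even after long new-task streams, which is what counteracts forgetting in this family of methods.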