408 results for "Training data"
Search Results
2. Artistic Essence of Generative Adversarial Networks: Analyzing Training Data's Impact on Performance.
- Author
-
Pal, Kuldeep, Chaudhuri, Rapti, Deb, Suman, and Saha, Ashim
- Subjects
GENERATIVE adversarial networks ,DEEP learning ,ARTIFICIAL intelligence - Abstract
Generative adversarial networks (GANs) are powerful deep learning models for synthesizing realistic data. However, their performance critically depends on curating optimal training data. This research conducts a comprehensive study analyzing the impact of sample size, class balance, and heterogeneity in training datasets on the quality of GAN image and text generation. Through extensive experiments on CIFAR-10, it is demonstrated that insufficient samples, imbalanced classes, and a lack of diversity cause degraded sample quality, poor coherence, and mode collapse. The analysis conducted in this research work provides unique insights into the data-GAN interplay. Models trained on balanced subsets with adequate samples per class produce superior Inception Scores and BLEU scores while avoiding limited variety in outputs. The techniques presented enable the development of more generalizable and creative GANs. This work is the first of its kind to rigorously evaluate the role of data characteristics such as size, balance, and heterogeneity in stabilizing GAN training and improving output fidelity across modalities. The data-centric findings will be valuable for researchers curating optimal datasets that can unlock GANs' full potential for diverse, realistic generation, with wide applications in graphics, vision, language, and beyond. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
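The class-balanced curation that the abstract above argues for can be illustrated with a short helper that draws an equal number of samples per class. This is an illustrative sketch only, not the authors' code; the function name, the `per_class` parameter, and the toy label list are my own.

```python
import random
from collections import defaultdict

def balanced_subset(labels, per_class, seed=0):
    """Return indices forming a class-balanced training subset.

    Draws exactly `per_class` indices for every distinct label,
    mirroring the "balanced subsets with adequate samples per
    class" setting described in the abstract.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    subset = []
    for y, idxs in sorted(by_class.items()):
        if len(idxs) < per_class:
            raise ValueError(f"class {y} has only {len(idxs)} samples")
        subset.extend(rng.sample(idxs, per_class))
    return subset

# Toy imbalanced label list: 8 samples of class 0, 3 of class 1.
labels = [0] * 8 + [1] * 3
idx = balanced_subset(labels, per_class=3)
counts = {y: sum(1 for i in idx if labels[i] == y) for y in (0, 1)}
```

On a real dataset such as CIFAR-10, `labels` would be the per-image class labels and `per_class` the sample budget per class.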
3. Detection of Bacterial Spot Disease in Bell Pepper Plant Using YOLOv3.
- Author
-
Mahesh, Therese Yamuna and Mathew, Midhun P.
- Subjects
- *
BACTERIAL diseases , *OBJECT recognition (Computer vision) , *PLANT diseases , *FARM produce , *DEEP learning , *BELL pepper - Abstract
In countries like India, plant diseases are a major concern in the agricultural sector. Crop loss due to disease leads to a reduction in the quality and quantity of agricultural products, and in turn to economic losses. Hence, timely monitoring of plants is necessary, but monitoring diseases in large fields is a difficult task. To overcome this problem, effective management strategies should be adopted to control plant diseases. This can be done by acquiring data for disease identification and automating the recognition of diseases. Deep learning, applying the principles of object detection, is of great use in this area. In this paper, we use YOLOv3 (You Only Look Once) to identify plant diseases based on the symptoms seen on the leaves. The advantage of using YOLOv3 is that multiple diseases can be detected in the image of a single leaf. An important feature of YOLOv3 is that it can detect the small disease spots seen on leaves. Here, we concentrate on the bacterial spot disease of the bell pepper plant. The identification results show a mean average precision of 90%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. Map of Land Cover Agreement: Ensambling Existing Datasets for Large-Scale Training Data Provision.
- Author
-
Bratic, Gorica, Oxoli, Daniele, and Brovelli, Maria Antonia
- Subjects
- *
DEEP learning , *LAND cover , *IMAGE recognition (Computer vision) , *MACHINE learning , *MAPS , *SUSTAINABLE development - Abstract
Land cover information plays a critical role in supporting sustainable development and informed decision-making. Recent advancements in satellite data accessibility, computing power, and satellite technologies have boosted large-extent high-resolution land cover mapping. However, retrieving a sufficient amount of reliable training data for the production of such land cover maps is typically a demanding task, especially using modern deep learning classification techniques that require larger training sample sizes compared to traditional machine learning methods. In view of the above, this study developed a new benchmark dataset called the Map of Land Cover Agreement (MOLCA). MOLCA was created by integrating multiple existing high-resolution land cover datasets through a consensus-based approach. Covering Sub-Saharan Africa, the Amazon, and Siberia, this dataset encompasses approximately 117 billion 10m pixels across three macro-regions. The MOLCA legend aligns with most of the global high-resolution datasets and consists of nine distinct land cover classes. Noteworthy advantages of MOLCA include a higher number of pixels as well as coverage for typically underrepresented regions in terms of training data availability. With an estimated overall accuracy of 96%, MOLCA holds great potential as a valuable resource for the production of future high-resolution land cover maps. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
5. GANs-Based Intracoronary Optical Coherence Tomography Image Augmentation for Improved Plaques Characterization Using Deep Neural Networks
- Author
-
Haroon Zafar, Junaid Zafar, and Faisal Sharif
- Subjects
optical coherence tomography ,data augmentation ,generative adversarial networks ,deep learning ,coronary plaques ,training data ,Optics. Light ,QC350-467 ,Applied optics. Photonics ,TA1501-1820 - Abstract
Data augmentation using generative adversarial networks (GANs) is vital in the creation of new instances for imaging-modality tasks to improve deep learning classification. In this study, conditional generative adversarial networks (cGANs) were used for the first time on a dataset of OCT (Optical Coherence Tomography) images of coronary artery plaques for synthetic data creation, and were further validated using a deep learning architecture. A new OCT image dataset of 51 patients, annotated by three experts, was created. We used cGANs to synthetically populate the coronary artery plaque dataset by factors of 5×, 10×, 50× and 100× from a limited original dataset to enhance its volume and diversity. The loss functions for the generator and the discriminator were set up to generate perfect aliases. The augmented OCT dataset was then used in the training phase of the AlexNet architecture. We used cGANs to create synthetic images and examined the impact of the ratio of real to synthetic data on classification accuracy. We showed through experiments that augmenting real images with synthetic images by a factor of 50× during training improved the test accuracy of the classification architecture for label prediction by 15.8%. Further, we performed training-time assessments against the number of iterations to identify the optimum time efficiency. Automated plaque detection was found to be in conformity with clinical results using our proposed class-conditioning GAN architecture.
- Published
- 2023
- Full Text
- View/download PDF
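The class conditioning that the cGAN abstract above relies on is commonly implemented by concatenating a one-hot encoding of the class label to the generator's noise input, so the generator can be asked for samples of a specific class. A minimal numpy sketch, illustrative only; the paper's actual architecture, noise dimension, and plaque class labels are not reproduced here.

```python
import numpy as np

def conditioned_noise(batch, z_dim, labels, n_classes, seed=0):
    """Build a class-conditioned generator input, cGAN-style:
    a standard-normal noise vector with the one-hot label
    appended, giving shape (batch, z_dim + n_classes)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((batch, z_dim))        # noise part
    onehot = np.eye(n_classes)[labels]             # label part
    return np.concatenate([z, onehot], axis=1)

# Ask a hypothetical generator for 4 samples of 3 classes.
x = conditioned_noise(4, 16, [0, 1, 2, 0], 3)
```

A generator network would consume `x` directly; the discriminator is conditioned on the same one-hot labels so that both sides of the game see the class.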
6. Bayesian Convolutional Neural Networks for Limited Data Hyperspectral Remote Sensing Image Classification.
- Author
-
Joshaghani, Mohammad, Davari, Amirabbas, Hatamian, Faezeh Nejati, Maier, Andreas, and Riess, Christian
- Abstract
Hyperspectral remote sensing (HSRS) images have high dimensionality, and labeling HSRS data is expensive and therefore limited to small numbers of pixels. This makes it challenging to use deep neural networks for HSRS image classification; in extreme cases, deep neural networks are even outperformed by traditional models. In this work, we propose Bayesian convolutional neural networks (BCNNs) as a potential alternative to convolutional neural networks (CNNs). BCNNs benefit from Bayesian learning, which is more robust against overfitting and inherently provides a measure of uncertainty. We show in experiments on the Pavia Centre, Salinas, and Botswana datasets that a BCNN outperforms a similarly constructed non-Bayesian CNN, an off-the-shelf random forest (RF), and a state-of-the-art Bayesian neural network (BNN). We also show that the BCNN is more robust against overfitting than the CNN. Furthermore, the BCNN exhibits a remarkably larger capacity for model compression, which makes it a better candidate in hardware-constrained settings. Finally, we show that the BCNN's uncertainty measure can effectively identify misclassified samples. This useful property can be used to detect mislabeled data or to reject predictions with low confidence. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
8. Information Losses in Neural Classifiers From Sampling
- Author
-
Foggo, Brandon, Yu, Nanpeng, Shi, Jie, and Gao, Yuanqi
- Subjects
Information and Computing Sciences ,Machine Learning ,Neural networks ,Machine learning ,Training ,Random variables ,Training data ,Probability distribution ,Learning systems ,Deep learning ,information theory ,large deviations theory ,mutual information ,statistical learning theory ,cs.LG ,stat.ML ,Artificial Intelligence & Image Processing ,Artificial intelligence - Abstract
This article considers information losses arising from the finite data sets used in training neural classifiers. It proves a relationship expressing such losses as the product of the expected total variation of the estimated neural model and the information about the feature space contained in that model's hidden representation. It then bounds this expected total variation as a function of the size of randomly sampled data sets in a fairly general setting, without introducing any additional dependence on model complexity. It ultimately obtains bounds on information losses that are less sensitive to input compression and in general much smaller than existing bounds. The article then uses these bounds to explain some recent experimental findings of information compression in neural networks that cannot be explained by previous work. Finally, it shows that not only are these bounds much smaller than existing ones, but they also correspond well with experiments.
- Published
- 2020
9. Semantic Adversarial Deep Learning
- Author
-
Seshia, Sanjit A, Jha, Somesh, and Dreossi, Tommaso
- Subjects
Semantics ,Machine learning algorithms ,Neural networks ,Training data ,Deep learning ,Adversarial machine learning - Abstract
Adversarial examples have emerged as a key threat for machine-learning-based systems, especially those that employ deep neural networks. Unlike a large body of research in this area, this Keynote article accounts for the semantics, context, and specifications of the complete system with machine learning components in resource-constrained environments. - Muhammad Shafique, Technische Universität Wien.
- Published
- 2020
10. Synthesizing Point Cloud Data Set for Historical Dome Systems
- Author
-
Güneş, Mustafa Cem, Mertan, Alican, Sahin, Yusuf H., Unal, Gozde, Özkar, Mine, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Gerber, David, editor, Pantazis, Evangelos, editor, Bogosian, Biayna, editor, Nahmad, Alicia, editor, and Miltiadis, Constantinos, editor
- Published
- 2022
- Full Text
- View/download PDF
11. MERASTC: Micro-Expression Recognition Using Effective Feature Encodings and 2D Convolutional Neural Network.
- Author
-
Gupta, Puneet
- Abstract
Facial micro-expression (ME) can disclose genuine and concealed human feelings. This makes MEs extensively useful in real-world applications pertaining to affective computing and psychology. Unfortunately, they are induced by subtle facial movements for a short duration of time, which makes ME recognition a highly challenging problem even for human beings. In automatic ME recognition, the well-known features encode either incomplete or redundant information, and there is a lack of sufficient training data. The proposed method, Micro-Expression Recognition by Analysing Spatial and Temporal Characteristics (MERASTC), mitigates these issues to improve ME recognition. It compactly encodes the subtle deformations using action units (AUs), landmarks, gaze, and appearance features of all the video frames while preserving most of the relevant ME information. Furthermore, it improves efficacy by introducing a novel neutral-face normalization for ME and initiating the utilization of gaze features in deep learning-based ME recognition. The features are provided to a 2D convolutional neural network that jointly analyses the spatial and temporal behavior for correct ME classification. Experimental results on publicly available datasets indicate that the proposed method exhibits better performance than the well-known methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
12. Contrastive-ACE: Domain Generalization Through Alignment of Causal Mechanisms.
- Author
-
Wang, Yunqi, Liu, Furui, Chen, Zhitang, Wu, Yik-Chung, Hao, Jianye, Chen, Guangyong, and Heng, Pheng-Ann
- Subjects
- *
GENERALIZATION - Abstract
Domain generalization aims to learn, from multiple source domains, knowledge that is invariant across different distributions while remaining semantically meaningful for downstream tasks, so as to improve the model's generalization ability on unseen target domains. The fundamental objective is to understand the underlying "invariance" behind these observational distributions, and such invariance has been shown to have a close connection to causality. While many existing approaches make use of the property that causal features are invariant across domains, we consider the invariance of the average causal effect of the features on the labels. This invariance regularizes our training approach, in which interventions are performed on features to enforce stability of the classifier's causal prediction across domains. Our work thus sheds some light on the domain generalization problem by introducing invariance of the mechanisms into the learning process. Experiments on several benchmark datasets demonstrate the performance of the proposed method against state-of-the-art methods. The code is available at: https://github.com/lithostark/Contrastive-ACE. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
13. Strict rule-based automatic training data extraction using Mobile laser scanning in urban area.
- Author
-
Ma, Zhenyu, Oude Elberink, Sander, Lin, Yaping, Xu, Panpan, Xiang, Binbin, Koch, Barbara, and Weinacker, Holger
- Subjects
- *
DATA extraction , *DEEP learning , *CITIES & towns , *POINT cloud , *LASERS , *TRAINING manuals - Abstract
To reduce the cost of manually annotating training data for supervised classifiers, we propose an automated approach to extract training data for urban objects in six classes: buildings, fences, man-made poles, vegetation, vehicles, and low objects. In this study, two segmentation algorithms are first implemented to generate meaningful objects from the non-ground point cloud. Then, we generate valid strict rules to label partial RANSAC (Random Sample Consensus) planes and meaningful objects as training data. The strict rules are built upon semantic knowledge formed from geometric, eigenvalue, RANSAC-plane, multidimensional-slice, and relative-location features. The accuracy of the strict rule-based (SRB) training data is higher than 98.5% for buildings, man-made poles, vegetation, and vehicles. The accuracy for low objects and fences reaches 97.10% and 94.99%, respectively. Finally, we compared the performance of the KPConv and PointNet++ networks trained on SRB and manually labeled training data to evaluate the effectiveness of our training data. The KPConv overall accuracies using manually labeled and SRB training data are 91.5% and 86.8% on the Paris dataset, and 95.6% and 92.0% on the Freiburg dataset, respectively. The experiments demonstrate that automatically labeled training data can achieve accuracy similar to manual labels when coupled with the two deep learning networks. Therefore, SRB training data extraction can effectively deal with the problem of training data scarcity and provide significant advancements in urban point cloud classification, where manual labeling of training data remains a crucial challenge. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
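Among the feature families the abstract above builds its strict labeling rules on are eigenvalue features of a point's local neighborhood: from the sorted eigenvalues of the 3D covariance matrix one derives the standard linearity, planarity, and sphericity descriptors. The sketch below is illustrative of that feature family only, not the authors' exact rule set.

```python
import numpy as np

def eigen_features(points):
    """Eigenvalue-based shape descriptors of a 3D point neighborhood.

    With eigenvalues l1 >= l2 >= l3 of the covariance matrix:
      linearity  = (l1 - l2) / l1   (high for pole-like shapes)
      planarity  = (l2 - l3) / l1   (high for facade/ground patches)
      sphericity =  l3 / l1         (high for volumetric scatter)
    """
    pts = np.asarray(points, dtype=float)
    cov = np.cov(pts.T)                      # 3x3 covariance
    l3, l2, l1 = np.linalg.eigvalsh(cov)     # eigvalsh: ascending order
    return (l1 - l2) / l1, (l2 - l3) / l1, l3 / l1

# A flat, plane-like neighborhood should score high on planarity.
rng = np.random.default_rng(0)
plane = rng.uniform(-1, 1, size=(200, 3))
plane[:, 2] *= 0.01                          # squash onto z ~ 0
lin, pla, sph = eigen_features(plane)
```

A rule such as "label as building facade if planarity exceeds a threshold" would then be one ingredient of an SRB-style labeling scheme.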
14. Development of System for Collecting User-specified Training Data for Autonomous Driving Based on Virtual Road Environment.
- Author
-
Min-Soo Kim and In-Sung Jang
- Abstract
Deep learning technologies that use road images to recognize autonomous driving environments have been actively developed. Such deep-learning-based autonomous driving technologies need a large amount of training data that can represent various road, traffic, and weather environments. However, there have been many difficulties, in terms of time and cost, in collecting training data that represent various road environments. Therefore, in this study, we build a virtual road environment and develop a system for collecting training data based on that virtual environment. To build a virtual environment identical to the real world, we convert and use two kinds of existing geospatial data: high-definition 3D buildings and high-definition roads. We also develop a system for collecting training data that runs in the virtual environment. The implementation results of the proposed system show that it is possible to build a virtual environment identical to the real world and to collect specific training data quickly, and at any time, from the virtual environment with various user-specified settings. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
15. Low-Light Image and Video Enhancement Using Deep Learning: A Survey.
- Author
-
Li, Chongyi, Guo, Chunle, Han, Linghao, Jiang, Jun, Cheng, Ming-Ming, Gu, Jinwei, and Loy, Chen Change
- Subjects
- *
DEEP learning , *IMAGE intensifiers , *CELL phones , *CAMERA phones , *IMAGE enhancement (Imaging systems) , *COMPUTATIONAL photography - Abstract
Low-light image enhancement (LLIE) aims at improving the perception or interpretability of an image captured in an environment with poor illumination. Recent advances in this area are dominated by deep learning-based solutions, where many learning strategies, network structures, loss functions, training data, etc. have been employed. In this paper, we provide a comprehensive survey to cover various aspects ranging from algorithm taxonomy to unsolved open issues. To examine the generalization of existing methods, we propose a low-light image and video dataset, in which the images and videos are taken by different mobile phones’ cameras under diverse illumination conditions. Besides, for the first time, we provide a unified online platform that covers many popular LLIE methods, of which the results can be produced through a user-friendly web interface. In addition to qualitative and quantitative evaluation of existing methods on publicly available and our proposed datasets, we also validate their performance in face detection in the dark. This survey together with the proposed dataset and online platform could serve as a reference source for future study and promote the development of this research field. The proposed platform and dataset as well as the collected methods, datasets, and evaluation metrics are publicly available and will be regularly updated. Project page: https://www.mmlab-ntu.com/project/lliv_survey/index.html. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
16. Unsupervised Learning of Local Equivariant Descriptors for Point Clouds.
- Author
-
Marcon, Marlon, Spezialetti, Riccardo, Salti, Samuele, Silva, Luciano, and Stefano, Luigi Di
- Subjects
- *
POINT cloud , *COMPUTER vision , *DATA augmentation , *COMPUTER architecture - Abstract
Correspondences between 3D keypoints generated by matching local descriptors are a key step in 3D computer vision and graphic applications. Learned descriptors are rapidly evolving and outperforming the classical handcrafted approaches in the field. Yet, to learn effective representations they require supervision through labeled data, which are cumbersome and time-consuming to obtain. Unsupervised alternatives exist, but they lag in performance. Moreover, invariance to viewpoint changes is attained either by relying on data augmentation, which is prone to degrading upon generalization on unseen datasets, or by learning from handcrafted representations of the input which are already rotation invariant but whose effectiveness at training time may significantly affect the learned descriptor. We show how learning an equivariant 3D local descriptor instead of an invariant one can overcome both issues. LEAD (Local EquivAriant Descriptor) combines Spherical CNNs to learn an equivariant representation together with plane-folding decoders to learn without supervision. Through extensive experiments on standard surface registration datasets, we show how our proposal outperforms existing unsupervised methods by a large margin and achieves competitive results against the supervised approaches, especially in the practically very relevant scenario of transfer learning. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
17. Using Simulated Training Data of Voxel-Level Generative Models to Improve 3D Neuron Reconstruction.
- Author
-
Liu, Chao, Wang, Deli, Zhang, Han, Wu, Wei, Sun, Wenzhi, Zhao, Ting, and Zheng, Nenggan
- Subjects
- *
IMAGE segmentation , *NEURONS , *THREE-dimensional imaging , *DEEP learning , *IMAGE reconstruction - Abstract
Reconstructing neuron morphologies from fluorescence microscope images plays a critical role in neuroscience studies. It relies on image segmentation to produce initial masks, either for further processing or as final results representing neuronal morphologies. This has been a challenging step due to the variation and complexity of noisy intensity patterns in neuron images acquired from microscopes. Whereas progress in deep learning has brought the goal of accurate segmentation much closer to reality, creating training data for producing powerful neural networks is often laborious. To overcome the difficulty of obtaining a vast number of annotated data, we propose a novel strategy of using two-stage generative models to simulate training data with voxel-level labels. Trained on unlabeled data by optimizing a novel objective function that preserves predefined labels, the models are able to synthesize realistic 3D images with underlying voxel labels. We show that these synthetic images can train segmentation networks to obtain even better performance than manually labeled data. To demonstrate an immediate impact of our work, we further show that segmentation results produced by networks trained on synthetic data can be used to improve existing neuron reconstruction methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
18. Lower Bounds on the Generalization Error of Nonlinear Learning Models.
- Author
-
Seroussi, Inbar and Zeitouni, Ofer
- Subjects
- *
GENERALIZATION , *RANDOM matrices , *ARTIFICIAL neural networks , *COMPLEXITY (Philosophy) - Abstract
We study in this paper lower bounds for the generalization error of models derived from multi-layer neural networks, in the regime where the size of the layers is commensurate with the number of samples in the training data. We derive explicit generalization lower bounds for general biased estimators in the case of two-layer networks. For a linear activation function, the bound is asymptotically tight. In the nonlinear case, we provide a comparison of our bounds with an empirical study of the stochastic gradient descent algorithm. In addition, we derive bounds for unbiased estimators, which show that the latter have unacceptable performance for truly nonlinear networks. The analysis uses elements from the theory of large random matrices. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
19. Self-Supervised Low-Light Image Enhancement Using Discrepant Untrained Network Priors.
- Author
-
Liang, Jinxiu, Xu, Yong, Quan, Yuhui, Shi, Boxin, and Ji, Hui
- Subjects
- *
DEEP learning , *IMAGE intensifiers , *ARTIFICIAL neural networks - Abstract
This paper proposes a deep learning method for low-light image enhancement, which exploits the generation capability of Neural Networks (NNs) while requiring no training samples except the input image itself. Based on the Retinex decomposition model, the reflectance and illumination of a low-light image are parameterized by two untrained NNs. The ambiguity between the two layers is resolved by the discrepancy between the two NNs in terms of architecture and capacity, while the complex noise with spatially-varying characteristics is handled by an illumination-adaptive self-supervised denoising module. The enhancement is done by jointly optimizing the Retinex decomposition and the illumination adjustment. Extensive experiments show that the proposed method not only outperforms existing non-learning-based and unsupervised-learning-based methods, but also competes favorably with some supervised-learning-based methods in extreme low-light conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
20. DPODv2: Dense Correspondence-Based 6 DoF Pose Estimation.
- Author
-
Shugurov, Ivan, Zakharov, Sergey, and Ilic, Slobodan
- Subjects
- *
DEEP learning , *POSE estimation (Computer vision) , *DETECTORS , *MONOCULARS - Abstract
We propose a three-stage 6 DoF object detection method called DPODv2 (Dense Pose Object Detector) that relies on dense correspondences. We combine a 2D object detector with a dense correspondence estimation network and a multi-view pose refinement method to estimate a full 6 DoF pose. Unlike other deep learning methods that are typically restricted to monocular RGB images, we propose a unified deep learning network that allows different imaging modalities to be used (RGB or depth). Moreover, we propose a novel pose refinement method based on differentiable rendering. The main concept is to compare predicted and rendered correspondences in multiple views to obtain a pose that is consistent with the predicted correspondences in all views. Our proposed method is evaluated rigorously on different data modalities and types of training data in a controlled setup. The main conclusion is that RGB excels in correspondence estimation, while depth contributes to pose accuracy if good 3D-3D correspondences are available. Naturally, their combination achieves the overall best performance. We perform an extensive evaluation and an ablation study to analyze and validate the results on several challenging datasets. DPODv2 achieves excellent results on all of them while remaining fast and scalable, independent of the data modality and the type of training data used. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
21. Improving Deep Metric Learning by Divide and Conquer.
- Author
-
Sanakoyeu, Artsiom, Ma, Pingchuan, Tschernezki, Vadim, and Ommer, Bjorn
- Subjects
- *
DEEP learning , *COMPUTER vision , *IMAGE retrieval , *APPLICATION software , *SUBSPACES (Mathematics) , *VISUAL cryptography - Abstract
Deep metric learning (DML) is a cornerstone of many computer vision applications. It aims at learning a mapping from the input domain to an embedding space, where semantically similar objects are located nearby and dissimilar objects far from one another. The target similarity on the training data is defined by the user in the form of ground-truth class labels. However, while the embedding space learns to mimic the user-provided similarity on the training data, it should also generalize to novel categories not seen during training. Besides user-provided ground-truth training labels, many additional visual factors (such as viewpoint changes or shape peculiarities) exist and imply different notions of similarity between objects, affecting generalization on images unseen during training. However, existing approaches usually learn a single embedding space directly on all available training data, struggle to encode all the different types of relationships, and do not generalize well. We propose to build a more expressive representation by jointly splitting the embedding space and the data hierarchically into smaller sub-parts. We successively focus on smaller subsets of the training data, reducing its variance and learning a different embedding subspace for each data subset. Moreover, the subspaces are learned jointly to cover not only the intricacies, but the breadth of the data as well. Only after that do we build the final embedding from the subspaces in the conquering stage. The proposed algorithm acts as a transparent wrapper that can be placed around arbitrary existing DML methods. Our approach significantly improves upon the state of the art on image retrieval, clustering, and re-identification tasks evaluated using the CUB200-2011, CARS196, Stanford Online Products, In-shop Clothes, and PKU VehicleID datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
22. Physics-Based Noise Modeling for Extreme Low-Light Photography.
- Author
-
Wei, Kaixuan, Fu, Ying, Zheng, Yinqiang, and Yang, Jiaolong
- Subjects
- *
IMAGE denoising , *DIGITAL electronics , *CONVOLUTIONAL neural networks , *DIGITAL cameras , *MACHINE learning , *NOISE , *DEEP learning , *DISTRIBUTION (Probability theory) , *PHYSICAL distribution of goods - Abstract
Enhancing visibility in extreme low-light environments is a challenging task. Under nearly lightless conditions, existing image denoising methods can easily break down due to the significantly low SNR. In this paper, we systematically study the noise statistics in the imaging pipeline of CMOS photosensors and formulate a comprehensive noise model that can accurately characterize real noise structures. Our novel model considers the noise sources caused by digital camera electronics, which are largely overlooked by existing methods yet have a significant influence on raw measurements in the dark. It provides a way to decouple the intricate noise structure into different statistical distributions with physical interpretations. Moreover, our noise model can be used to synthesize realistic training data for learning-based low-light denoising algorithms. In this regard, although promising results have been shown recently with deep convolutional neural networks, their success heavily depends on abundant noisy-clean image pairs for training, which are tremendously difficult to obtain in practice. Generalizing the trained models to images from new devices is also problematic. Extensive experiments on multiple low-light denoising datasets, including a newly collected one in this work covering various devices, show that a deep neural network trained with our proposed noise formation model can reach surprisingly high accuracy. The results are on par with, or sometimes even outperform, training with paired real data, opening a new door to real-world extreme low-light photography. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
23. Towards a training data model for artificial intelligence in earth observation.
- Author
-
Yue, Peng, Shangguan, Boyi, Hu, Lei, Jiang, Liangcun, Zhang, Chenxiao, Cao, Zhipeng, and Pan, Yinyin
- Subjects
- *
ARTIFICIAL intelligence , *SPATIAL data infrastructures , *DATA modeling , *ONLINE education , *DEEP learning - Abstract
Artificial Intelligence Machine Learning (AI/ML), in particular Deep Learning (DL), is reorienting and transforming Earth Observation (EO). A consistent data model for delivery of training data will support the FAIR data principles (findable, accessible, interoperable, reusable) and enable Web-based use of training data in a spatial data infrastructure (SDI). Existing training datasets, including open source benchmark datasets, are usually packaged into public or personal repositories and lack discoverability and accessibility. Moreover, there is no unified method to describe the training data. Here we propose a training data model for AI in EO to allow documentation, storage, and sharing of geospatial training data in a distributed infrastructure. We present design rationales, information models, and an encoding method. Several scenarios illustrate the intended uses and benefits for EO DL applications in an open Web environment. The relationship with Open Geospatial Consortium (OGC) standards is also discussed, as is the impact on an AI-ready SDI. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
24. The Irrelevance of the Turing Test in Current Deep Learning
- Author
-
Ondrej Hriadel
- Subjects
deep learning ,artificial intelligence ,turing test ,intelligence ,learning ,training data ,language ,Philosophy (General) ,B1-5802 - Abstract
The role of artificial intelligence in the Turing test is to imitate human beings to such an extent that people will not realize it is a machine. With the rise of deep learning (a subcategory of AI), the situation is changing rapidly, as the new systems do not focus on imitating human intelligence but emphasize thorough solutions to specific issues. The main difference between predefined AI and deep learning (DL) is that DL systems are self-learning and have verifiable results. Firstly, we analyse the application of the Turing test in the Loebner Prize, where the primary emphasis is on aspects of human intelligence – learning, reasoning and understanding. Secondly, the Turing test considers only general intelligence, which is questionable: if DL does not possess this form of intelligence, by this reasoning we should consider it unintelligent. However, is such an understanding correct? The third and last aspect questions whether the Turing test is beneficial for an AI designed for specific tasks, because the results bring no new data or conclusions.
- Published
- 2021
- Full Text
- View/download PDF
25. Fully-Automated Spike Detection and Dipole Analysis of Epileptic MEG Using Deep Learning.
- Author
-
Hirano, Ryoji, Emura, Takuto, Nakata, Otoichi, Nakashima, Toshiharu, Asai, Miyako, Kagitani-Shimono, Kuriko, Kishima, Haruhiko, and Hirata, Masayuki
- Subjects
- *
DEEP learning , *ARTIFICIAL intelligence , *EPILEPTIFORM discharges , *PEOPLE with epilepsy , *IMAGE processing , *MAGNETOENCEPHALOGRAPHY - Abstract
Magnetoencephalography (MEG) is a useful tool for clinically evaluating the localization of interictal spikes. Neurophysiologists visually identify spikes from the MEG waveforms and estimate the equivalent current dipoles (ECD). However, these analyses are presently performed manually by neurophysiologists and are time-consuming. Another problem is that spike identification from MEG waveforms largely depends on neurophysiologists’ skills and experience. These problems make clinical MEG examination poorly cost-effective. To overcome them, we fully automated spike identification and ECD estimation using a deep learning approach: fully automated AI-based MEG interictal epileptiform discharge identification and ECD estimation (FAMED). We applied a semantic segmentation method, an image processing technique, to identify the appropriate times between spike onset and peak and to select appropriate sensors for ECD estimation. FAMED was trained and evaluated using clinical MEG data acquired from 375 patients. FAMED training was performed in two stages: in the first stage, a classification network was learned, and in the second stage, a segmentation network that extended the classification network was learned. The classification network had a mean AUC of 0.9868 (10-fold patient-wise cross-validation); the sensitivity and specificity were 0.7952 and 0.9971, respectively. The median distance between the ECDs estimated by the neurophysiologists and those estimated using FAMED was 0.63 cm. Thus, the performance of FAMED is comparable to that of neurophysiologists, and it can contribute to the efficiency and consistency of MEG ECD analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
26. A Cross-Domain Federated Learning Framework for Wireless Human Sensing.
- Author
-
Zhang, Kaixuan, Liu, Xiulong, Xie, Xin, Zhang, Jiuwu, Niu, Bingxin, and Li, Keqiu
- Subjects
- *
HUMAN activity recognition , *DEEP learning , *SUPERVISED learning , *HUMAN-computer interaction , *WIRELESS sensor networks - Abstract
In this article, we study the problem of wireless human sensing, which refers to human activity recognition (HAR). HAR based on wireless signals plays an important role in security, human-computer interaction, and healthcare in the 5G era. Most state-of-the-art human activity recognition applications rely on deep learning approaches, which require a large amount of training data to achieve good performance. However, wireless signal data is difficult to collect and label, and it also carries private information, making it challenging to construct large-scale datasets. The recent advances in federated learning provide a chance to aggregate a wide range of users to collaboratively train a HAR model using decentralized datasets under data-preserving constraints. However, since a wireless signal is easily interrupted by the environment, the data across all participants is non-IID, thus decreasing the performance of an aggregated model. Additionally, due to the resource-constrained nature of edge devices, training the HAR model on an end user usually takes too long, resulting in straggler problems in federated learning training. In this article, we propose a cross-domain federated learning framework (CDFL) to address the lack of labeled wireless data. A transfer learning approach is proposed to simulate wireless data by converting widely available image datasets, solving the distribution mismatch problem by domain adaptation. Additionally, a customized federated learning approach is proposed to reduce the computational overhead of local model training. Using a case study of ultrasonic signal-based gesture recognition, we demonstrate the effectiveness of the proposed framework. Our method achieves over 90 percent accuracy on a 5-category task without real data, and 88 percent accuracy on a 10-category task when the user collects only one piece of data. [ABSTRACT FROM AUTHOR]
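The aggregation step of such a federated framework can be sketched as FedAvg-style weighted averaging of client models (the paper's CDFL additionally performs transfer learning and domain adaptation, which this sketch omits; the client tensors below are toy stand-ins):

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Aggregate per-client model parameters into a global model, weighting
    each client by its local dataset size (the classic FedAvg rule)."""
    total = float(sum(client_sizes))
    agg = [np.zeros_like(w, dtype=float) for w in client_weights[0]]
    for weights, n in zip(client_weights, client_sizes):
        for layer, w in enumerate(weights):
            agg[layer] += (n / total) * w
    return agg

# Two clients, each holding one parameter tensor; client 0 has twice the data.
clients = [[np.array([1.0, 1.0])], [np.array([4.0, 4.0])]]
global_model = federated_average(clients, client_sizes=[200, 100])
# global_model[0] is weighted 2/3 toward client 0: [2.0, 2.0]
```

Because each client only ships parameters, never raw signals, the private wireless data stays on the device.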
- Published
- 2022
- Full Text
- View/download PDF
27. Unsupervised Decomposition and Correction Network for Low-Light Image Enhancement.
- Author
-
Jiang, Qiuping, Mao, Yudong, Cong, Runmin, Ren, Wenqi, Huang, Chao, and Shao, Feng
- Abstract
Vision-based intelligent driving assistance systems and transportation systems can be improved by enhancing the visibility of scenes captured in extremely challenging conditions. In particular, many low-light image enhancement (LIE) algorithms have been proposed to facilitate such applications in low-light conditions. While deep learning-based methods have achieved substantial success in this field, most of them require paired training data, which is difficult to collect. This paper advocates a novel Unsupervised Decomposition and Correction Network (UDCN) for LIE without depending on paired data for training. Inspired by the Retinex model, our method first decomposes images into illumination and reflectance components with an image decomposition network (IDN). Then, the decomposed illumination is processed by an illumination correction network (ICN) and fused with the reflectance to generate a primary enhanced result. In contrast with fully supervised learning approaches, UDCN is unsupervised and is trained only with low-light images and corresponding histogram-equalized (HE) counterparts (which can be derived from the low-light image itself) as input. Both the decomposition and correction networks are optimized under the guidance of hybrid no-reference quality-aware losses and inter-consistency constraints between the low-light image and its HE counterpart. In addition, we also utilize an unsupervised noise removal network (NRN) to remove the noise previously hidden in the darkness, further improving the primary result. Qualitative and quantitative comparison results are reported to demonstrate the efficacy of UDCN and its superiority over several representative alternatives in the literature. The results and code will be made publicly available at https://github.com/myd945/UDCN. [ABSTRACT FROM AUTHOR]
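The HE counterpart that serves as the unsupervised training signal can be derived from the low-light input itself; a minimal sketch for an 8-bit grayscale image (the decomposition and correction networks themselves are omitted):

```python
import numpy as np

def histogram_equalize(img):
    """Globally histogram-equalize an 8-bit grayscale image. In UDCN-style
    training, this HE counterpart, derived from the low-light input itself,
    serves as a free pseudo-reference in place of paired ground truth."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                        # first non-empty bin
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255),
                  0, 255).astype(np.uint8)           # map old levels to new
    return lut[img]

low_light = (np.arange(64, dtype=np.uint8) // 4).reshape(8, 8)  # values 0..15
he_counterpart = histogram_equalize(low_light)       # stretched to span 0..255
```

The pair `(low_light, he_counterpart)` costs nothing to produce, which is exactly what removes the need for collected paired data.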
- Published
- 2022
- Full Text
- View/download PDF
28. Crowdsourcing-based application to solve the problem of insufficient training data in deep learning-based classification of satellite images.
- Author
-
Saralioglu, Ekrem and Gungor, Oguz
- Subjects
- *
REMOTE-sensing images , *PROBLEM solving , *REMOTE sensing , *CROWDSOURCING , *CLASSIFICATION - Abstract
In order to solve the insufficient training data problem in remote sensing, a web platform was created so that registered users can generate labeled data for various classes in a dynamic structure. Users were asked to select representative pixel groups for the forest, hazelnut, shadow, soil, tea, and building classes with the polygon tool, and then to assign a class label to each created polygon, aided by a help document displaying descriptive information on the locations, colors, textures, and distributions of the classes in the image. Crowdsourcing was also used to test the accuracy of the labeled data thus produced. The created data set was overlaid with the original WV-2 image, and the correctness of the polygon labels was additionally verified visually. Finally, the WV-2 image, consisting of 40 patches, was classified with a CNN, achieving an average accuracy of over 95%. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
29. Localized Statistical Shape Models for Large-Scale Problems With Few Training Data.
- Author
-
Wilms, Matthias, Ehrhardt, Jan, and Forkert, Nils D.
- Subjects
- *
DEEP learning , *STATISTICAL models , *LOCALIZATION (Mathematics) , *MULTILEVEL models , *DATA augmentation , *BIG data , *COVARIANCE matrices - Abstract
Objective: Statistical shape models have been successfully used in numerous biomedical image analysis applications where prior shape information is helpful, such as organ segmentation or data augmentation when training deep learning models. However, training such models requires large data sets, which are often not available; hence, shape models frequently fail to represent local details of unseen shapes. This work introduces a kernel-based method to alleviate this problem via so-called model localization. It is specifically designed to be used in large-scale shape modeling scenarios like deep learning data augmentation and fits seamlessly into the classical shape modeling framework. Method: Relying on recent advances in multi-level shape model localization via distance-based covariance matrix manipulations and Grassmannian-based level fusion, this work proposes a novel and computationally efficient kernel-based localization technique. Moreover, a novel way to improve the specificity of such models via normalizing flow-based density estimation is presented. Results: The method is evaluated on the publicly available JSRT/SCR chest X-ray and IXI brain data sets. The results confirm the effectiveness of the kernelized formulation and also highlight the models’ improved specificity when utilizing the proposed density estimation method. Conclusion: This work shows that flexible and specific shape models from few training samples can be generated in a computationally efficient way by combining ideas from kernel theory and normalizing flows. Significance: The proposed method, together with its publicly available implementation, allows shape models directly usable for applications like data augmentation to be built from few training samples. [ABSTRACT FROM AUTHOR]
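The classical (global) statistical shape model that this localization technique builds on can be sketched with PCA: a mean shape plus the leading eigenmodes of the training covariance. The training shapes below are random stand-ins for real landmark data:

```python
import numpy as np

def fit_shape_model(shapes, var_kept=0.95):
    """Fit a classical PCA statistical shape model: the mean shape plus the
    leading eigenmodes of the training covariance, keeping enough modes to
    explain `var_kept` of the variance."""
    X = shapes.reshape(len(shapes), -1)              # (n_samples, n_points * dim)
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    var = S ** 2 / (len(shapes) - 1)                 # per-mode variance
    k = int(np.searchsorted(np.cumsum(var) / var.sum(), var_kept)) + 1
    return mean, Vt[:k], var[:k]

def synthesize(mean, modes, var, coeffs):
    """Generate a shape from mode coefficients given in standard-deviation units."""
    return mean + (np.asarray(coeffs) * np.sqrt(var)) @ modes

rng = np.random.default_rng(0)
shapes = rng.normal(size=(20, 30, 2))                # 20 stand-in shapes, 30 2-D points
mean, modes, var = fit_shape_model(shapes)
mean_shape = synthesize(mean, modes, var, np.zeros(len(var)))  # zero coeffs -> mean
```

With only 20 training samples the covariance is rank-deficient, which is precisely the few-training-data failure mode the paper's kernel-based localization addresses.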
- Published
- 2022
- Full Text
- View/download PDF
30. Network Delay Measurement with Machine Learning: From Lab to Real-World Deployment.
- Author
-
Mohammed, Shady A., Shirmohammadi, Shervin, and Alchalabi, Alaa Eddin
- Abstract
Artificial Intelligence (AI) continues to impact all facets of technology including Instrumentation and Measurement (I&M) with much effort spent on developing I&M systems assisted by machine learning (ML), especially deep learning [1]. While these ML-assisted I&M systems show promising results in a lab environment, there is always the question of how well they will perform in the real world. In fact, concerns about the real-world performance of ML is not exclusive to I&M but an inherent property of ML in general, because ML is data driven and its performance will change if the data distribution changes in the real world. In this article, we present a case study of developing in the lab an ML-assisted I&M system, specifically a network delay predictor, and deploying it in the real world, achieving 93% accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
31. Phishing Detection Leveraging Machine Learning and Deep Learning: A Review.
- Author
-
Divakaran, Dinil Mon and Oest, Adam
- Abstract
Phishing attacks trick victims into disclosing sensitive information. To counter them, we explore machine learning and deep learning models leveraging large-scale data. We discuss models built on different kinds of data and present multiple deployment options to detect phishing attacks. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
32. Dielectric Breast Phantoms by Generative Adversarial Network.
- Author
-
Shao, Wenyi and Zhou, Beibei
- Subjects
- *
GENERATIVE adversarial networks , *BREAST , *MICROWAVE imaging , *DIELECTRICS , *BREAST imaging , *MACHINE learning - Abstract
In order to conduct research on machine learning (ML)-based microwave breast imaging (MBI), a large number of digital dielectric breast phantoms that can serve as training data (ground truth) are required but are difficult to obtain in practice. Although a few dielectric breast phantoms have been developed for research purposes, their number and diversity are limited and far from adequate for developing a robust ML algorithm for MBI. This article presents a neural network method to generate 2-D virtual breast phantoms that are similar to real ones, which can be used to develop ML-based MBI in the future. The generated phantoms are similar to, but distinct from, those used in training. Each phantom consists of several images, each representing the distribution of a dielectric parameter in the breast map. A statistical analysis was performed over 10 000 generated phantoms to investigate the performance of the generative network. With the generative network, one may generate an unlimited number of breast images with more variations, making ML-based MBI easier to deploy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
33. Deep Learning Seismic Inversion Based on Prestack Waveform Datasets.
- Author
-
Zhang, Jian, Sun, Hui, Zhang, Gan, and Zhao, Xiaoyan
- Subjects
- *
DEEP learning , *THEORY of wave motion , *INVERSE problems , *ANALYTICAL solutions , *TRAINING needs , *TRANSMISSION of sound - Abstract
Prediction of elastic parameters (e.g., P- and S-wave velocity, and density) from observed seismic data is one of the most common means of reservoir characterization. Recently, deep learning (DL), as a data-driven approach, has been attracting increasing interest in seismic inversion. DL is proven to have the potential to learn complex systems and solve inverse problems efficiently. One of the key components of DL is the training dataset, and an effective training dataset is a prerequisite for the success of DL-based methods. In seismic inversion, the training dataset needs to be artificially expanded due to the limited number of actual training data pairs. Traditional approaches of using the exact Zoeppritz equation (EZE) or its approximations for training dataset construction have limitations, principally the single-interface assumption and the neglect of wave propagation effects. Alternatively, the analytical solution of the 1-D wave equation (i.e., the reflectivity method [RM]) can simulate the full wave, including transmission losses and internal multiples, and can be executed in a target-oriented manner. Inspired by this, we develop a data-driven elastic parameter prediction method based on waveform formulation. The method uses RM to construct the training dataset, which both compensates for the inadequate training data in data-driven seismic inversion and improves the accuracy of the inversion results. We apply the method to a synthetic model as well as field data. The results are compared with model-driven methods (EZE and RM) and a data-driven method based on EZE, and the proposed method outperforms all three. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
34. Attribute Graph Neural Networks for Strict Cold Start Recommendation.
- Author
-
Qian, Tieyun, Liang, Yile, Li, Qing, and Xiong, Hui
- Subjects
- *
RECOMMENDER systems , *MATRIX decomposition , *DEEP learning , *LOGIC circuits , *GRAPH algorithms , *SOCIAL networks , *FACTOR structure - Abstract
Rating prediction is a classic problem underlying recommender systems. It is traditionally tackled with matrix factorization. Recently, deep learning based methods, especially graph neural networks, have made impressive progress on this problem. Despite their effectiveness, existing methods focus on modeling the user-item interaction graph. The inherent drawback of such methods is that their performance is bound by the density of the interactions, which are, however, usually highly sparse. More importantly, for a strict cold start user/item that neither appears in the training data nor has any interactions in the test stage, such methods are unable to learn the preference embedding of the user/item since there is no link to this user/item in the graph. In this work, we develop a novel framework, Attribute Graph Neural Networks (AGNN), by exploiting the attribute graph rather than the commonly used interaction graph. This leads to the capability of learning embeddings for strict cold start users/items. Our AGNN can produce the preference embedding for a strict cold user/item by learning on the distribution of attributes with an extended variational auto-encoder (eVAE) structure. Moreover, we propose a new graph neural network variant, i.e., gated-GNN, to effectively aggregate various attributes of different modalities in a neighborhood. Empirical results on three real-world datasets demonstrate that our model yields significant improvements for strict cold start recommendations and outperforms or matches the state-of-the-art performance in the warm start scenario. [ABSTRACT FROM AUTHOR]
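The matrix-factorization baseline mentioned above can be sketched in a few lines of SGD on hypothetical toy ratings. Note that a user or item absent from `ratings` never receives a gradient update, which is exactly the strict cold-start gap AGNN targets:

```python
import numpy as np

def train_mf(ratings, n_users, n_items, k=8, lr=0.05, reg=0.05, epochs=500, seed=0):
    """Classic matrix factorization for rating prediction, trained with SGD:
    each rating is approximated by the dot product of a user and an item
    embedding, with L2 regularization on both."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))   # user factors
    Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

# Hypothetical toy ratings: (user, item, rating).
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 1, 2.0)]
P, Q = train_mf(ratings, n_users=2, n_items=2)
```

After training, `P[u] @ Q[i]` predicts a rating; an unseen user row stays at its random initialization, so its predictions are meaningless without side information such as attributes.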
- Published
- 2022
- Full Text
- View/download PDF
35. Vehicular Trajectory Classification and Traffic Anomaly Detection in Videos Using a Hybrid CNN-VAE Architecture.
- Author
-
Kumaran Santhosh, Kelathodi, Dogra, Debi Prosad, Roy, Partha Pratim, and Mitra, Adway
- Abstract
Visual surveillance has become indispensable in the evolution of Intelligent Transportation Systems (ITS). Video object trajectories are key to many visual surveillance applications. Classifying varying-length time series data, such as video object trajectories, with conventional neural networks can be challenging. In this paper, we propose trajectory classification and anomaly detection using a hybrid Convolutional Neural Network (CNN) and Variational Autoencoder (VAE) architecture. First, we introduce a high-level feature representation for varying-length object trajectories using color gradients. In the next stage, a semi-supervised way to annotate moving object trajectories extracted using the Temporally Incremental Gravitational Model (TIGM) is used for class labeling. For training, anomalous trajectories are identified using t-Distributed Stochastic Neighbor Embedding (t-SNE). Finally, a hybrid CNN-VAE architecture is proposed for trajectory classification and anomaly detection. The results obtained using publicly available surveillance video datasets reveal that the proposed method can successfully identify traffic anomalies such as violations in lane driving, sudden speed variations, abrupt termination of vehicle movement, and vehicles moving in wrong directions. The accuracy of trajectory classification improves by a margin of 1-6% against popular neural network-based classifiers across various datasets using the proposed high-level features. The gradient representation also improves the anomaly detection accuracy significantly (30-35%). Code and dataset can be found at https://github.com/santhoshkelathodi/CNN-VAE. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
36. Variational Few-Shot Learning for Microservice-Oriented Intrusion Detection in Distributed Industrial IoT.
- Author
-
Liang, Wei, Hu, Yiyong, Zhou, Xiaokang, Pan, Yi, and I-Kai Wang, Kevin
- Abstract
Along with the popularity of Internet of Things (IoT) techniques with several computational paradigms, such as cloud and edge computing, microservice has been viewed as a promising architecture for large-scale application design and deployment. Due to the limited computing ability of edge devices in distributed IoT, only a small scale of data can be used for model training. In addition, most machine-learning-based intrusion detection methods are insufficient when dealing with imbalanced datasets under limited computing resources. In this article, we propose an optimized intra/inter-class-structure-based variational few-shot learning (OICS-VFSL) model to overcome a specific out-of-distribution problem in imbalanced learning, and to improve microservice-oriented intrusion detection in distributed IoT systems. Following a newly designed VFSL framework, an intra/inter-class optimization scheme is developed using reconstructed feature embeddings, in which the intra-class distance is optimized based on the approximation during a variational Bayesian process, while the inter-class distance is optimized based on the maximization of similarities during a feature concatenation process. An intelligent intrusion detection algorithm is then introduced to improve the multiclass classification via a nonlinear neural network. Evaluation experiments are conducted using two public datasets to demonstrate the effectiveness of our proposed model, especially in detecting novel attacks with extremely imbalanced data, compared with four baseline methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
37. Airport and Ship Target Detection on Satellite Images Based on YOLO V3 Network
- Author
-
Ying, Ren, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zhang, Junjie James, Series Editor, Wang, Liheng, editor, Wu, Yirong, editor, and Gong, Jianya, editor
- Published
- 2020
- Full Text
- View/download PDF
38. Map of Land Cover Agreement: Ensembling Existing Datasets for Large-Scale Training Data Provision
- Author
-
Gorica Bratic, Daniele Oxoli, and Maria Antonia Brovelli
- Subjects
training data ,high-resolution land cover ,global land cover ,machine learning ,deep learning ,satellite image classification ,Science - Abstract
Land cover information plays a critical role in supporting sustainable development and informed decision-making. Recent advancements in satellite data accessibility, computing power, and satellite technologies have boosted large-extent high-resolution land cover mapping. However, retrieving a sufficient amount of reliable training data for the production of such land cover maps is typically a demanding task, especially using modern deep learning classification techniques that require larger training sample sizes compared to traditional machine learning methods. In view of the above, this study developed a new benchmark dataset called the Map of Land Cover Agreement (MOLCA). MOLCA was created by integrating multiple existing high-resolution land cover datasets through a consensus-based approach. Covering Sub-Saharan Africa, the Amazon, and Siberia, this dataset encompasses approximately 117 billion 10m pixels across three macro-regions. The MOLCA legend aligns with most of the global high-resolution datasets and consists of nine distinct land cover classes. Noteworthy advantages of MOLCA include a higher number of pixels as well as coverage for typically underrepresented regions in terms of training data availability. With an estimated overall accuracy of 96%, MOLCA holds great potential as a valuable resource for the production of future high-resolution land cover maps.
- Published
- 2023
- Full Text
- View/download PDF
39. Stochastic Mirror Descent on Overparameterized Nonlinear Models.
- Author
-
Azizan, Navid, Lale, Sahin, and Hassibi, Babak
- Subjects
- *
MACHINE learning , *MIRRORS , *LEARNING problems , *DEEP learning - Abstract
Most modern learning problems are highly overparameterized, i.e., have many more model parameters than the number of training data points. As a result, the training loss may have infinitely many global minima (parameter vectors that perfectly “interpolate” the training data). It is therefore imperative to understand which interpolating solutions we converge to, how they depend on the initialization and learning algorithm, and whether they yield different test errors. In this article, we study these questions for the family of stochastic mirror descent (SMD) algorithms, of which stochastic gradient descent (SGD) is a special case. Recently, it has been shown that for overparameterized linear models, SMD converges to the closest global minimum to the initialization point, where closeness is in terms of the Bregman divergence corresponding to the potential function of the mirror descent. With appropriate initialization, this yields convergence to the minimum-potential interpolating solution, a phenomenon referred to as implicit regularization. On the theory side, we show that for sufficiently-overparameterized nonlinear models, SMD with a (small enough) fixed step size converges to a global minimum that is “very close” (in Bregman divergence) to the minimum-potential interpolating solution, thus attaining approximate implicit regularization. On the empirical side, our experiments on the MNIST and CIFAR-10 datasets consistently confirm that the above phenomenon occurs in practical scenarios. They further indicate a clear difference in the generalization performances of different SMD algorithms: experiments on the CIFAR-10 dataset with different regularizers, $\ell _{1}$ to encourage sparsity, $\ell _{2}$ (SGD) to encourage small Euclidean norm, and $\ell _{\infty }$ to discourage large components, surprisingly show that the $\ell _{\infty }$ norm consistently yields better generalization performance than SGD, which in turn generalizes better than the $\ell _{1}$ norm. 
[ABSTRACT FROM AUTHOR]
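A single SMD step with an $\ell_p$-norm potential can be sketched as a gradient step in the mirror (dual) space followed by the inverse mirror map; plain SGD is recovered at p = 2. This is a generic illustration of the algorithm family studied in the paper, not its experimental setup:

```python
import numpy as np

def smd_step(w, grad, lr, p=3):
    """One stochastic mirror descent step with potential psi(w) = ||w||_p^p / p.
    The mirror map is grad-psi(w) = sign(w) * |w|^(p-1); its inverse maps the
    dual-space update back to parameter space. p = 2 recovers plain SGD."""
    z = np.sign(w) * np.abs(w) ** (p - 1) - lr * grad   # step in dual space
    return np.sign(z) * np.abs(z) ** (1.0 / (p - 1))    # inverse mirror map

w = np.array([1.0, -2.0, 0.5])
grad = np.array([0.3, -0.1, 0.2])
w_sgd = smd_step(w, grad, lr=0.1, p=2)   # identical to w - 0.1 * grad
w_smd = smd_step(w, grad, lr=0.1, p=3)   # biases toward small l3-norm solutions
```

Different choices of the potential steer the iterates toward different interpolating minima, which is the implicit regularization effect the abstract describes.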
- Published
- 2022
- Full Text
- View/download PDF
40. Consistent Meta-Regularization for Better Meta-Knowledge in Few-Shot Learning.
- Author
-
Tian, Pinzhuo, Li, Wenbin, and Gao, Yang
- Subjects
- *
MACHINE learning , *DEEP learning , *TECHNOLOGICAL innovations - Abstract
Recently, meta-learning has provided a powerful paradigm for dealing with the few-shot learning problem. However, existing meta-learning approaches ignore the prior fact that good meta-knowledge should alleviate the data inconsistency between training and test data caused by the extremely limited data in each few-shot learning task. Moreover, properly exploiting this prior understanding of meta-knowledge leads us to design an efficient method to improve the meta-learning model. Under this circumstance, we consider the data inconsistency from the distribution perspective, making it convenient to bring in the prior fact, and propose a new consistent meta-regularization (Con-MetaReg) to help the meta-learning model learn how to reduce the data-distribution discrepancy between the training and test data. In this way, the ability of meta-knowledge to keep the training and test data consistent is enhanced, and the performance of the meta-learning model can be further improved. Extensive analyses and experiments demonstrate that our method can indeed improve the performance of different meta-learning models in few-shot regression, classification, and fine-grained classification. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
41. A Sensitivity-Based Data Augmentation Framework for Model Predictive Control Policy Approximation.
- Author
-
Krishnamoorthy, Dinesh
- Subjects
- *
DATA augmentation , *SUPERVISED learning , *PREDICTION models , *APPROXIMATION algorithms , *DEEP learning - Abstract
Approximating a model predictive control (MPC) policy using expert-based supervised learning techniques requires labeled training datasets sampled from the MPC policy. These are typically obtained by sampling the feasible state space and evaluating the control law by solving the numerical optimization problem offline for each sample. Although the resulting approximate policy can be cheaply evaluated online, generating large training sets to learn the MPC policy can be time-consuming and prohibitively expensive. This is one of the fundamental bottlenecks that limit the design and implementation of MPC policy approximation. This technical article addresses this challenge by proposing a novel sensitivity-based data augmentation scheme for direct policy approximation. The proposed approach exploits the parametric sensitivities to cheaply generate additional training samples in the neighborhood of the existing samples. [ABSTRACT FROM AUTHOR]
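The core idea can be sketched as a first-order Taylor expansion of the optimal input around one solved sample, using the solver-provided parametric sensitivity. The Jacobian below is an illustrative placeholder, not output from a real MPC solver:

```python
import numpy as np

def augment_mpc_samples(x0, u0, jac, radius=0.05, n_new=10, seed=None):
    """Given one expensive offline MPC solve (state x0, optimal input u0) and
    the parametric sensitivity du*/dx at that point, generate additional
    (state, input) training pairs via a first-order Taylor expansion."""
    rng = np.random.default_rng(seed)
    dx = rng.uniform(-radius, radius, size=(n_new, len(x0)))
    xs = x0 + dx
    us = u0 + dx @ jac.T          # u*(x) ~ u0 + (du*/dx)(x - x0)
    return xs, us

x0 = np.array([0.5, -0.2])
u0 = np.array([1.0])
jac = np.array([[-2.0, 0.5]])     # illustrative sensitivity, not from a real solver
xs, us = augment_mpc_samples(x0, u0, jac, seed=0)
```

Each expensive optimization thus yields `n_new + 1` labeled pairs instead of one, at the cost of a matrix-vector product per extra sample.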
- Published
- 2022
- Full Text
- View/download PDF
42. Unsupervised Deep Background Matting Using Deep Matte Prior.
- Author
-
Xu, Yong, Liu, Baoling, Quan, Yuhui, and Ji, Hui
- Subjects
- *
DEEP learning , *SUPERVISED learning , *CONVOLUTIONAL neural networks , *VIDEO editing - Abstract
Background matting is a recently developed image matting approach with applications to image and video editing. It refers to estimating both the alpha matte and the foreground from a pair of images with and without foreground objects. Recent work has applied deep learning to background matting with very promising performance. However, existing deep models are supervised and require a large dataset with ground-truth alpha mattes for training. To avoid the cost of data collection and possible bias in training data, this paper proposes a dataset-free, unsupervised deep learning-based approach for background matting. Observing that the local smoothness of the alpha matte can be well characterized by an untrained network prior, called the deep matte prior, we model the foreground and alpha matte using the priors encoded by two generative convolutional neural networks. To avoid possible overfitting during unsupervised learning, a two-stage learning scheme is developed, consisting of projection-based training and Bayesian post-refinement. An alpha-matte-driven initialization scheme is also developed for a performance boost. Even without external training data, the proposed approach provides competitive performance to recent supervised learning-based methods in the experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
43. Regularizing Deep Networks With Semantic Data Augmentation.
- Author
-
Wang, Yulin, Huang, Gao, Song, Shiji, Pan, Xuran, Xia, Yitong, and Wu, Cheng
- Subjects
- *
DATA augmentation , *SUPERVISED learning , *SEMANTICS , *DEEP learning - Abstract
Data augmentation is widely known as a simple yet surprisingly effective technique for regularizing deep networks. Conventional data augmentation schemes, e.g., flipping, translation or rotation, are low-level, data-independent and class-agnostic operations, leading to limited diversity for augmented samples. To this end, we propose a novel semantic data augmentation algorithm to complement traditional approaches. The proposed method is inspired by the intriguing property that deep networks are effective in learning linearized features, i.e., certain directions in the deep feature space correspond to meaningful semantic transformations, e.g., changing the background or view angle of an object. Based on this observation, translating training samples along many such directions in the feature space can effectively augment the dataset for greater diversity. To implement this idea, we first introduce a sampling-based method to obtain semantically meaningful directions efficiently. Then, an upper bound of the expected cross-entropy (CE) loss on the augmented training set is derived by letting the number of augmented samples go to infinity, yielding a highly efficient algorithm. In fact, we show that the proposed implicit semantic data augmentation (ISDA) algorithm amounts to minimizing a novel robust CE loss, which adds minimal extra computational cost to a normal training procedure. In addition to supervised learning, ISDA can be applied to semi-supervised learning tasks under the consistency regularization framework, where it amounts to minimizing the upper bound of the expected KL-divergence between the augmented features and the original features. Despite being simple, ISDA consistently improves the generalization performance of popular deep models (e.g., ResNets and DenseNets) on a variety of datasets, including CIFAR-10, CIFAR-100, SVHN, ImageNet, and Cityscapes.
Code for reproducing our results is available at https://github.com/blackfeather-wang/ISDA-for-Deep-Networks. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
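The robust CE loss mentioned in the ISDA abstract can be sketched for a single sample as a cross-entropy over logits whose non-target entries are inflated by a class-conditional covariance term; the simplified, bias-free form below follows the general shape the abstract describes, but the exact loss is in the paper, so treat this as an illustrative approximation.

```python
import numpy as np

def isda_loss(features, W, y, Sigma, lam):
    """Simplified ISDA-style robust cross-entropy for one sample.
    features: (d,), W: (C, d) classifier weights, y: true class index,
    Sigma: (d, d) class-conditional feature covariance, lam: strength.
    Each logit j is inflated by 0.5 * lam * (w_j - w_y)^T Sigma (w_j - w_y),
    which accounts for infinitely many implicit semantic augmentations."""
    logits = W @ features
    C = W.shape[0]
    adjusted = np.empty(C)
    for j in range(C):
        v = W[j] - W[y]
        adjusted[j] = logits[j] + 0.5 * lam * v @ Sigma @ v
    # numerically stable cross-entropy on the adjusted logits
    shifted = adjusted - adjusted.max()
    return -(shifted[y] - np.log(np.exp(shifted).sum()))
```

With `lam = 0` this reduces to the ordinary cross-entropy, which matches the abstract's claim that ISDA adds minimal cost to normal training.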
44. Uncertainty Quantification for Deep Learning in Ultrasonic Crack Characterization.
- Author
-
Pyle, Richard J., Hughes, Robert R., Ali, Amine Ait Si, and Wilcox, Paul D.
- Subjects
- *
DEEP learning , *CONVOLUTIONAL neural networks , *BUILDING inspection , *SURFACE defects , *NONDESTRUCTIVE testing - Abstract
Deep learning for nondestructive evaluation (NDE) has received a lot of attention in recent years for its potential to provide human-level data analysis. However, little research has been done into quantifying the uncertainty of its predictions. Uncertainty quantification (UQ) is essential for qualifying NDE inspections and building trust in their predictions. This article therefore aims to demonstrate how UQ can best be achieved for deep learning in the context of crack sizing for inline pipe inspection. A convolutional neural network architecture is used to size surface-breaking defects from plane wave imaging (PWI) images with two modern UQ methods: deep ensembles and Monte Carlo dropout. The network is trained using PWI images of surface-breaking defects simulated with a hybrid finite element/ray-based model. Successful UQ is judged by calibration and anomaly detection, which refer to whether in-domain model error is proportional to uncertainty and whether data from outside the training domain is assigned high uncertainty. Calibration is tested using simulated and experimental images of surface-breaking cracks, while anomaly detection is tested using experimental side-drilled holes and simulated embedded cracks. Monte Carlo dropout demonstrates poor uncertainty quantification, with little separation between in- and out-of-distribution data and a weak linear fit (R = 0.84) between experimental root-mean-square error and uncertainty. Deep ensembles improve upon Monte Carlo dropout in both calibration (R = 0.95) and anomaly detection. Adding spectral normalization and residual connections to deep ensembles slightly improves calibration (R = 0.98) and significantly improves the reliability of assigning high uncertainty to out-of-distribution samples. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
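The deep-ensemble UQ compared in the abstract above reduces, at prediction time, to aggregating member outputs: the mean is the point estimate and the spread is the epistemic uncertainty. The minimal sketch below assumes `models` is any list of trained predictors (Monte Carlo dropout fits the same shape if each "member" is the same network with dropout left active).

```python
import numpy as np

def ensemble_predict(models, x):
    """Deep-ensemble style uncertainty quantification: run every member
    on x, report the mean prediction and the standard deviation across
    members as the uncertainty. `models` is a list of callables x -> y."""
    preds = np.array([m(x) for m in models])
    return preds.mean(axis=0), preds.std(axis=0)
```

Calibration, as the abstract uses the term, then asks whether this reported standard deviation grows in proportion to the actual prediction error on held-out data.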
45. HQ2CL: A High-Quality Class Center Learning System for Deep Face Recognition.
- Author
-
Lv, Xianwei, Yu, Chen, Jin, Hai, and Liu, Kai
- Subjects
- *
DEEP learning , *FACE perception , *INSTRUCTIONAL systems - Abstract
Benefiting from the proposal of margin-based loss functions, face recognition has achieved significant improvements in recent years. These losses aim to increase the margin between different identities to enhance discriminability. Ideally, the class centers of different identities are far from each other, and face samples are compact around their corresponding class center. Hence, it is vital to produce a high-quality class center. However, the class center is determined by the distribution of the training set: when low-quality samples are in the majority, the class center is drawn close to samples carrying little identity information, which impairs the discriminability of the learned model on unseen samples. In this work, we propose a High-Quality Class Center Learning system (HQ2CL), an effective system that guides the class center toward high-quality samples to preserve discriminability. Specifically, HQ2CL introduces a quality-aware scale and margin layer for the identification loss and constructs a new high-quality center loss. We implement the proposed system without additional computational burden and present experimental evaluations on different face benchmarks. The results show the superiority of HQ2CL over state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
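One plausible reading of the "quality-aware scale and margin layer" above is an ArcFace-style additive angular margin whose strength is modulated by a per-sample quality score; the exact modulation in HQ2CL is not given in the abstract, so the function `quality_aware_logit` and its scaling rules below are hypothetical.

```python
import numpy as np

def quality_aware_logit(cos_theta, quality, base_margin=0.5, base_scale=64.0):
    """Hypothetical quality-aware margin layer: high-quality samples
    (quality -> 1) receive the full additive angular margin and scale,
    while low-quality samples (quality -> 0) get a softer penalty, so
    noisy faces pull the class center less strongly."""
    m = base_margin * quality               # shrink the margin for poor samples
    s = base_scale * (0.5 + 0.5 * quality)  # and shrink the scale too
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    return s * np.cos(theta + m)
```

At `quality = 0` the margin vanishes and the layer degrades to a plain scaled cosine logit, which is the intended behavior: unreliable samples should not enforce a tight margin on the class center.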
46. Data Acquisition and Preparation for Dual-Reference Deep Learning of Image Super-Resolution.
- Author
-
Guo, Yanhui, Wu, Xiaolin, and Shu, Xiao
- Subjects
- *
DEEP learning , *HIGH resolution imaging , *ACQUISITION of data , *PIXELS , *SUPERVISED learning , *ELECTRONIC data processing - Abstract
The performance of deep learning based image super-resolution (SR) methods depends on how accurately the paired low- and high-resolution images used for training characterize the sampling process of real cameras. Low- and high-resolution (LR∼HR) image pairs synthesized by degradation models (e.g., bicubic downsampling) deviate from those in reality; thus synthetically trained DCNN SR models work disappointingly when applied to real-world images. To address this issue, we propose a novel data acquisition process that shoots a large set of LR∼HR image pairs using real cameras. The images are displayed on an ultra-high-quality screen and captured at different resolutions. The resulting LR∼HR image pairs can be aligned at very high sub-pixel precision by a novel spatial-frequency dual-domain registration method, and hence they provide more appropriate training data for the learning task of super-resolution. Moreover, the captured HR image and the original digital image offer dual references to strengthen supervised learning. Experimental results show that training a super-resolution DCNN on our LR∼HR dataset achieves higher image quality than training it on other datasets in the literature. Moreover, the proposed screen-capturing data collection process can be automated; it can be carried out for any target camera with ease and at low cost, offering a practical way of tailoring the training of a DCNN SR model separately to each given camera. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
47. MetaLabelNet: Learning to Generate Soft-Labels From Noisy-Labels.
- Author
-
Algan, Gorkem and Ulusoy, Ilkay
- Subjects
- *
ARTIFICIAL neural networks , *MACHINE learning , *MULTILAYER perceptrons , *WIDE gap semiconductors - Abstract
Real-world datasets commonly have noisy labels, which negatively affects the performance of deep neural networks (DNNs). To address this problem, we propose a label-noise-robust learning algorithm in which the base classifier is trained on soft-labels produced according to a meta-objective. In each iteration, before conventional training, the meta-training loop updates the soft-labels so that the resulting gradient updates on the base classifier yield minimum loss on the meta-data. Soft-labels are generated from extracted features of data instances, and the mapping function is learned by a single-layer perceptron (SLP) network called MetaLabelNet. The base classifier is then trained using these generated soft-labels. These iterations are repeated for each batch of training data. Our algorithm uses a small amount of clean data as meta-data, which can be obtained effortlessly in many cases. We perform extensive experiments on benchmark datasets with both synthetic and real-world noise. Results show that our approach outperforms existing baselines. The source code of the proposed model is available at https://github.com/gorkemalgan/MetaLabelNet. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
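The soft-label generation step described above (a single-layer perceptron mapping instance features to soft-labels) can be sketched as follows. The meta-update of the SLP weights on clean meta-data is the heart of MetaLabelNet and is omitted here; `W` and `b` are simply given, so this shows only the forward mapping.

```python
import numpy as np

def softmax(z):
    """Row-wise numerically stable softmax."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def generate_soft_labels(features, W, b):
    """SLP mapping from instance features (N, d) to soft-labels (N, C),
    in the spirit of MetaLabelNet; in the full algorithm W and b are
    optimized so that training on these soft-labels minimizes the loss
    on a small clean meta-set."""
    return softmax(features @ W + b)
```

Because the outputs are proper probability distributions rather than one-hot targets, a sample with a corrupted hard label can still contribute a mostly correct training signal.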
48. Spot-Adaptive Knowledge Distillation.
- Author
-
Song, Jie, Chen, Ying, Ye, Jingwen, and Song, Mingli
- Subjects
- *
ARTIFICIAL neural networks , *DISTILLATION - Abstract
Knowledge distillation (KD) has become a well established paradigm for compressing deep neural networks. The typical way of conducting knowledge distillation is to train the student network under the supervision of the teacher network to harness the knowledge at one or multiple spots (i.e., layers) in the teacher network. The distillation spots, once specified, will not change for all the training samples, throughout the whole distillation process. In this work, we argue that distillation spots should be adaptive to training samples and distillation epochs. We thus propose a new distillation strategy, termed spot-adaptive KD (SAKD), to adaptively determine the distillation spots in the teacher network per sample, at every training iteration during the whole distillation period. As SAKD actually focuses on “where to distill” instead of “what to distill” that is widely investigated by most existing works, it can be seamlessly integrated into existing distillation methods to further improve their performance. Extensive experiments with 10 state-of-the-art distillers are conducted to demonstrate the effectiveness of SAKD for improving their distillation performance, under both homogeneous and heterogeneous distillation settings. Code is available at https://github.com/zju-vipa/spot-adaptive-pytorch. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
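The "where to distill" question raised in the SAKD abstract can be made concrete with a toy per-sample spot selector. In the paper the spot-selection policy is learned; the hand-crafted largest-gap rule below is only an assumed stand-in to illustrate what a per-sample, per-iteration choice of distillation layers looks like.

```python
import numpy as np

def select_distillation_spots(student_feats, teacher_feats, k=1):
    """Toy spot-adaptive rule: for one sample, distill only at the k
    candidate layers where student and teacher features currently differ
    most. student_feats/teacher_feats: lists of same-shape arrays, one
    per candidate spot. Returns the chosen spot indices, sorted."""
    gaps = [np.mean((s - t) ** 2) for s, t in zip(student_feats, teacher_feats)]
    return sorted(np.argsort(gaps)[-k:].tolist())
```

Because the gaps are recomputed per sample and per iteration, the chosen spots drift over training, which is exactly the adaptivity the abstract argues fixed-spot distillation lacks.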
49. Collaborative Refining for Person Re-Identification With Label Noise.
- Author
-
Ye, Mang, Li, He, Du, Bo, Shen, Jianbing, Shao, Ling, and Hoi, Steven C. H.
- Subjects
- *
IMAGE recognition (Computer vision) , *NOISE , *ARCHITECTURAL design , *PERFORMANCE standards , *NOISE measurement , *STATIC random access memory - Abstract
Existing person re-identification (Re-ID) methods usually rely heavily on large-scale thoroughly annotated training data. However, label noise is unavoidable due to inaccurate person detection results or annotation errors in real scenes. It is extremely challenging to learn a robust Re-ID model with label noise since each identity has very limited annotated training samples. To avoid fitting to the noisy labels, we propose to learn a prefatory model using a large learning rate at the early stage with a self-label refining strategy, in which the labels and network are jointly optimized. To further enhance the robustness, we introduce an online co-refining (CORE) framework with dynamic mutual learning, where networks and label predictions are online optimized collaboratively by distilling the knowledge from other peer networks. Moreover, it also reduces the negative impact of noisy labels using a favorable selective consistency strategy. CORE has two primary advantages: it is robust to different noise types and unknown noise ratios; it can be easily trained without much additional effort on the architecture design. Extensive experiments on Re-ID and image classification demonstrate that CORE outperforms its counterparts by a large margin under both practical and simulated noise settings. Notably, it also improves the state-of-the-art unsupervised Re-ID performance under standard settings. Code is available at https://github.com/mangye16/ReID-Label-Noise. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
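The "favorable selective consistency strategy" mentioned in the CORE abstract can be sketched as a filter that trusts a sample only when two peer networks agree confidently; the abstract does not give the actual rule, so the agreement-plus-confidence criterion and the threshold `tau` below are assumptions.

```python
import numpy as np

def selective_consistency_mask(probs_a, probs_b, tau=0.8):
    """Plausible selective-consistency filter for co-refining: keep a
    sample for label refinement only when the two peer networks predict
    the same class and both are confident. probs_a, probs_b: (N, C)
    softmax outputs of the peers. Returns a boolean keep-mask of shape (N,)."""
    pred_a, pred_b = probs_a.argmax(axis=1), probs_b.argmax(axis=1)
    conf = np.minimum(probs_a.max(axis=1), probs_b.max(axis=1))
    return (pred_a == pred_b) & (conf >= tau)
```

Samples that fail the filter are the likely noisy-label cases, which matches the abstract's goal of reducing the negative impact of noisy labels during mutual learning.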
50. CartoonLossGAN: Learning Surface and Coloring of Images for Cartoonization.
- Author
-
Dong, Yongsheng, Tan, Wei, Tao, Dacheng, Zheng, Lintao, and Li, Xuelong
- Subjects
- *
GENERATIVE adversarial networks , *ARTISTIC style , *IMAGE color analysis , *IMAGE processing - Abstract
Cartoonization, a special type of artistic style transfer, is a difficult image processing task. Existing artistic style transfer methods cannot generate satisfactory cartoon-style images because artistic-style images often have delicate strokes and rich hierarchical color changes, whereas cartoon-style images have smooth surfaces, sharp edges, and no obvious color changes. To this end, we propose a cartoon-loss-based generative adversarial network (CartoonLossGAN) for cartoonization. In particular, we first reuse the encoder part of the discriminator to build a compact GAN-based cartoonization architecture. We then propose a novel cartoon loss function for this architecture: it imitates the process of sketching to learn the smooth surface of a cartoon image, and imitates the coloring process to learn its coloring. Furthermore, we propose an initialization strategy, used when reusing the discriminator, that makes training our model easier and more stable. Extensive experimental results demonstrate that the proposed CartoonLossGAN generates convincing cartoon-style images and outperforms four representative methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF