Search Results (1,652 results)
2. Special issue on selected and extended papers from the 2015 International Conference on Intelligence Science and Big Data Engineering (IScIDE 2015).
- Author: Shan, Shiguang, Chang, Hong, Cai, Deng, and Deng, Cheng
- Subjects: IMAGE processing; COMPUTER vision; MACHINE learning
- Published: 2017
3. Special issue: Advances in artificial neural networks, machine learning and computational intelligence. Selected papers from the 23rd European Symposium on Artificial Neural Networks (ESANN 2015).
- Author: Aiolli, Fabio, Bunte, Kerstin, Hérault, Romain, and Kanevski, Mikhail
- Subjects: ARTIFICIAL neural networks; MACHINE learning; COMPUTATIONAL intelligence; CONFERENCES & conventions; ARTIFICIAL intelligence
- Published: 2016
4. Joint coupled representation and homogeneous reconstruction for multi-resolution small sample face recognition.
- Author: Fan, Xiaojin, Liao, Mengmeng, Xue, Jingfeng, Wu, Hao, Jin, Lei, Zhao, Jian, and Zhu, Liehuang
- Subjects: MACHINE learning; FRACTIONAL programming; FACE perception; SHOOTING equipment; LEARNING
- Abstract:
• This paper proposes a novel multivariate dictionary learning framework.
• A coherence enhancement term improves the coherent representation of the coding coefficients under different resolutions.
• A multivariate dictionary optimization method solves dictionaries involving the calculation of the fractional norm.
• The proposed method achieves state-of-the-art performance on several benchmark datasets.

Off-the-shelf dictionary learning algorithms have achieved satisfactory results in small-sample face recognition applications. However, the achieved results depend on facial images captured at a single resolution. In practice, images of the same target are captured at different resolutions because of differences in shooting equipment and shooting distance. These images of the same category at different resolutions pose a great challenge to such algorithms. In this paper, we propose Joint Coupled Representation and Homogeneous Reconstruction (JCRHR) for multi-resolution small-sample face recognition. In JCRHR, an analysis dictionary is introduced and combined with the synthetic dictionary for coupled representation learning, which better reveals the relationship between coding coefficients and samples. In addition, a coherence enhancement term is proposed to improve the coherent representation of the coding coefficients at different resolutions, which facilitates the reconstruction of a sample by its homogeneous atoms. Moreover, each sample at each resolution is assigned a different coding coefficient in the multi-dictionary learning process, so that the learned dictionary is more in line with the actual situation. Furthermore, a regularization term based on the fractional norm is introduced into the coupled dictionary learning to remove redundant information from the dictionary, reducing its negative impact. Comprehensive results demonstrate that the proposed JCRHR method achieves better results than state-of-the-art methods on several small-sample face databases. [ABSTRACT FROM AUTHOR]
- Published: 2023
5. A survey for solving mixed integer programming via machine learning.
- Author: Zhang, Jiayi, Liu, Chang, Li, Xijun, Zhen, Hui-Ling, Yuan, Mingxuan, Li, Yawen, and Yan, Junchi
- Subjects: MACHINE learning; INTEGER programming; COMBINATORIAL optimization; HEURISTIC algorithms; NP-hard problems; MACHINE theory; PROBLEM solving
- Abstract:
Machine learning (ML) has been recently introduced to solving optimization problems, especially for combinatorial optimization (CO) tasks. In this paper, we survey the trend of leveraging ML to solve the mixed-integer programming problem (MIP). Theoretically, MIP is an NP-hard problem, and most CO problems can be formulated as MIP. Like other CO problems, the human-designed heuristic algorithms for MIP rely on good initial solutions and cost a lot of computational resources. Therefore, researchers consider applying machine learning methods to solve MIP since ML-enhanced approaches can provide the solution based on the typical patterns from the training data. Specifically, we first introduce the formulation and preliminaries of MIP and representative traditional solvers. Then, we show the integration of machine learning and MIP with detailed discussions on related learning-based methods, which can be further classified into exact and heuristic algorithms. Finally, we propose the outlook for learning-based MIP solvers, the direction toward more combinatorial optimization problems beyond MIP, and the mutual embrace of traditional solvers and ML components. We maintain a list of papers that utilize machine learning technologies to solve combinatorial optimization problems, which is available at https://github.com/Thinklab-SJTU/awesome-ml4co. [ABSTRACT FROM AUTHOR]
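For reference, the mixed-integer programming problem surveyed above is conventionally written in the following canonical form (a generic sketch; the symbols follow common textbook convention rather than this paper's notation):

```latex
\begin{aligned}
\min_{x}\quad & c^{\top} x \\
\text{s.t.}\quad & A x \le b, \\
& x_j \in \mathbb{Z} \quad \text{for } j \in \mathcal{I}, \\
& x_j \ge 0 \quad \text{for all } j,
\end{aligned}
```

Dropping the integrality constraints on the index set $\mathcal{I}$ gives the LP relaxation that branch-and-bound solvers, and many of the learning-based methods surveyed, build on.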
- Published: 2023
6. Deep learning for Covid-19 forecasting: State-of-the-art review.
- Author: Kamalov, Firuz, Rajab, Khairan, Cherukuri, Aswani Kumar, Elnagar, Ashraf, and Safaraliev, Murodbek
- Subjects: DEEP learning; COVID-19; COVID-19 pandemic; FORECASTING; MACHINE learning; QUALITY control
- Abstract:
• The paper fills a gap in the literature by reviewing and analyzing current studies that apply deep learning to Covid-19 forecasting.
• The initial search identified 152 studies, of which 53 passed quality control.
• The existing literature is categorized using a model-based taxonomy.
• Each model is described along with its performance evaluation.
• Recommendations for future improvements are provided.

The Covid-19 pandemic has galvanized scientists to apply machine learning methods to help combat the crisis. Despite the significant amount of research, there exists no comprehensive survey devoted specifically to examining deep learning methods for Covid-19 forecasting. In this paper, we fill the gap in the literature by reviewing and analyzing the current studies that use deep learning for Covid-19 forecasting. In our review, all published papers and preprints discoverable through Google Scholar, for the period from Apr 1, 2020 to Feb 20, 2022, which describe deep learning approaches to forecasting Covid-19 were considered. Our search identified 152 studies, of which 53 passed the initial quality screening and were included in our survey. We propose a model-based taxonomy to categorize the literature. We describe each model and highlight its performance. Finally, the deficiencies of the existing approaches are identified and the necessary improvements for future research are elucidated. The study provides a gateway for researchers who are interested in forecasting Covid-19 using deep learning. [ABSTRACT FROM AUTHOR]
- Published: 2022
7. Robust learning of Huber loss under weak conditional moment.
- Author: Huang, Shouyou
- Subjects: STATISTICAL learning; MACHINE learning
- Abstract:
In this paper, we study the performance of robust learning with Huber loss. As an alternative to traditional empirical risk minimization schemes, Huber regression has been extensively used in machine learning. A new comparison theorem is established in the paper, which characterizes the gap between the excess generalization error and the prediction error. In addition, we refine the error bounds from the perspective of statistical learning theory and improve the convergence rates in the presence of heavy-tailed noise. It is worth mentioning that a new moment condition $\mathbb{E}[|Y|^{1+\epsilon} \mid X = x] \in L^{2}_{\rho_X}$ is employed in the analysis of error bounds and learning rates from a theoretical viewpoint. [ABSTRACT FROM AUTHOR]
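To make the loss concrete, here is a minimal NumPy sketch of Huber-loss regression fitted by gradient descent on data with a single heavy-tailed outlier (the threshold name `delta`, the toy data, and the learning rate are illustrative assumptions, not from the paper):

```python
import numpy as np

def huber_loss(r, delta=1.0):
    """Quadratic for small residuals, linear for large ones."""
    small = np.abs(r) <= delta
    return np.where(small, 0.5 * r**2, delta * (np.abs(r) - 0.5 * delta))

def huber_grad(r, delta=1.0):
    """Derivative of the Huber loss with respect to the residual r."""
    return np.clip(r, -delta, delta)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 50)
y = 2.0 * x + 0.1 * rng.normal(size=50)
y[0] += 20.0  # a single gross outlier

w = 0.0  # scalar weight for the model y ≈ w * x
for _ in range(500):
    residuals = w * x - y
    w -= 0.1 * np.mean(huber_grad(residuals) * x)  # gradient step on the mean Huber loss
```

Because the gradient of a large residual is clipped at `delta`, the outlier exerts only a bounded pull and `w` stays close to the true slope of 2, whereas a squared loss would be dragged away; this bounded-influence property is what makes Huber regression robust under heavy-tailed noise.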
- Published: 2022
8. Neural-based fixed-time composite learning control for multiagent systems with intermittent faults.
- Author: Zheng, Xiaohong, Ren, Hongru, Zhou, Qi, and Wang, Xinzhong
- Subjects: MACHINE learning; FAULT-tolerant control systems; CLOSED loop systems; NONLINEAR functions; NONLINEAR systems
- Abstract:
In this paper, a distributed fixed-time composite learning control problem is addressed for nonlinear multiagent systems (MASs) subject to intermittent actuator faults. First, a distributed estimator is constructed for followers that are unable to communicate directly with the leader. Then, instead of using the traditional adaptive neural network (NN) algorithm, a predictor-based composite learning technique is proposed, which incorporates the prediction error into the NN update law to enhance the estimation accuracy of the unknown nonlinearity. Furthermore, an adaptive fault-tolerant control compensation mechanism is developed for intermittent faults that may occur indefinitely and frequently. To guarantee that all signals of the closed-loop system are bounded in fixed time, a nonsingular fixed-time fault-tolerant controller in the form of a quadratic function is established. Finally, simulation results confirm the effectiveness of the presented algorithm.

• This paper presents a singularity-free fixed-time NN algorithm for nonlinear MASs, and a composite learning algorithm is established to improve the approximation accuracy of nonlinear functions by introducing a prediction error into the NN update law.
• For followers without access to the leader, a local estimator is utilized to estimate the leader information. The present control method therefore avoids the emergence of coupling terms between agents during the controller design.
• This paper considers intermittent actuator faults that may occur indefinitely and frequently, posing significant challenges. [ABSTRACT FROM AUTHOR]
- Published: 2024
9. A review of research on reinforcement learning algorithms for multi-agents.
- Author: Hu, Kai, Li, Mingyang, Song, Zhiqiang, Xu, Keer, Xia, Qingfeng, Sun, Ning, Zhou, Peng, and Xia, Min
- Subjects: MACHINE learning; REWARD (Psychology); ARTIFICIAL intelligence; LITERATURE reviews; MULTIAGENT systems; REINFORCEMENT learning
- Abstract:
In recent years, multi-agent reinforcement learning (MARL) techniques have been widely used and have evolved in the field of artificial intelligence. However, traditional reinforcement learning methods have limitations such as long training times, large sample-data requirements, and highly delayed rewards. Therefore, this paper systematically and specifically studies MARL algorithms. Firstly, this paper uses CiteSpace software to visually analyze the existing literature on multi-agent reinforcement learning and briefly indicates the research hotspots and key research directions in this field. Secondly, the applications of traditional reinforcement learning algorithms under two task objects, namely single-agent and multi-agent systems (MAS), are described in detail. Then, the paper highlights the diverse applications, challenges, and corresponding solutions of MARL algorithmic techniques in the field of MAS. Finally, the paper points out future research directions based on the existing limitations of the algorithms. Through this paper, readers will gain a systematic and in-depth understanding of MARL algorithms and how they can be utilized to better address the various challenges posed by MAS. [ABSTRACT FROM AUTHOR]
- Published: 2024
10. Exploiting indirect linear correlation for label distribution learning.
- Author: Yu, Peiqiu and Jia, Xiuyi
- Subjects: MACHINE learning; MATRIX decomposition; LABEL design; ALGORITHMS; HYPOTHESIS
- Abstract:
Label distribution learning represents the relevance of labels to samples using description degrees, which provide richer semantic information and thus find wider applications. Exploiting label correlations is an effective approach to narrowing down the hypothesis space of label distribution learning models. Existing works that utilize low-rank assumptions or label linear dependence to mine correlations assume that a label can be linearly expressed by other labels. However, this assumption holds only when there are linear dependency relationships between labels, so the label correlation obtained by such methods is subject to a certain distortion. To address this issue, this paper assumes that labels can be linearly represented by the same set of bases, and the correlation between labels is represented by sharing these common bases. Specifically, the paper employs matrix factorization to extract bases that can represent all labels, and then designs a label distribution learning algorithm based on the property that the ground-truth label distribution and the predicted label distribution share the same set of bases. The effectiveness of the algorithm is verified through experimental validation. Generally speaking, the algorithm presented in this paper achieves optimal performance in 73.15% of the cases, with the best average ranking. In the two-tailed t-test, the algorithm in this paper exhibits statistical superiority compared to all comparison algorithms. [ABSTRACT FROM AUTHOR]
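The shared-bases assumption described above can be illustrated with a plain low-rank factorization (a NumPy sketch on synthetic data; the basis count `k` and the use of a truncated SVD are illustrative simplifications, not the paper's actual algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 100, 6, 2            # n samples, m labels, k shared bases (k < m)

# Build a toy ground-truth label distribution matrix whose labels all lie
# in the span of the same k bases, then normalize rows into distributions.
B = rng.random((k, m))          # shared label bases
C = rng.random((n, k))          # per-sample coefficients
D = C @ B
D /= D.sum(axis=1, keepdims=True)   # rows are description-degree distributions

# Recover k shared bases via truncated SVD: every label column of the
# reconstruction is a linear combination of the same k right singular vectors.
U, S, Vt = np.linalg.svd(D, full_matrices=False)
D_hat = (U[:, :k] * S[:k]) @ Vt[:k]

max_err = np.abs(D - D_hat).max()
```

Because row normalization only rescales each row, `D` still has rank at most `k`, so the rank-`k` reconstruction is exact up to floating-point error; real label distributions are only approximately low-rank, which is the distortion the paper's coupled learning accounts for.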
- Published: 2024
11. Efficient hyperspectral image segmentation for biosecurity scanning using knowledge distillation from multi-head teacher.
- Author: Phan, Minh Hieu, Phung, Son Lam, Luu, Khoa, and Bouzerdoum, Abdesselam
- Subjects: MACHINE learning; BIOSECURITY; IMAGE segmentation; TEACHERS' assistants; TEACHERS; PERFORMANCE standards
- Abstract:
Foreign species can damage the environment and the economy of a country. To automatically monitor biosecurity threats at country borders, this paper investigates compact deep networks for accurate, real-time object segmentation in hyperspectral images. To this end, knowledge distillation (KD) approaches compress the model by distilling the knowledge of a large teacher network into a compact student network. However, when the student is over-compressed, the performance of standard KD methods degrades significantly due to the large capacity gap between the teacher and the student. This gap can be addressed by adding medium-sized teacher assistants, but training them incurs significant computation and hence is impractical. To address this problem, this paper proposes a new framework called Knowledge Distillation from Multi-head Teacher (KDM), which distills the knowledge of a multi-head teacher into the student. By encapsulating multiple teachers in a single network, our proposed KDM assists the learning of a very compact student and significantly reduces the training time. We also introduce Bio-HSI, a new large benchmark hyperspectral image dataset of 3,125 high-resolution images with dense segmentation ground truth. This new, large dataset can be expected to advance research on deep models for hyperspectral image segmentation. Evaluated on this dataset, the student trained via our KDM has 762 times fewer parameters than the state-of-the-art segmentation model (i.e., HRNet), while achieving competitive accuracy. [ABSTRACT FROM AUTHOR]
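The generic distillation objective that KD frameworks such as the one above build on can be sketched as a weighted sum of a hard-label cross-entropy and a temperature-softened teacher-matching KL term (a minimal NumPy sketch; the temperature `T` and weight `alpha` are conventional hyperparameter names, and the KDM multi-head specifics are not modeled):

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Standard KD loss: cross-entropy on hard labels plus KL divergence
    between temperature-softened teacher and student distributions."""
    p_s = softmax(student_logits, T)
    p_t = softmax(teacher_logits, T)
    probs = softmax(student_logits)
    ce = -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))
    kl = np.mean(np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1))
    return alpha * ce + (1 - alpha) * (T * T) * kl  # T^2 rescales soft-term gradients

teacher = np.array([[5.0, 1.0, 0.0]])
loss_match = kd_loss(teacher, teacher, labels=np.array([0]))      # student mimics teacher
loss_off = kd_loss(np.array([[0.0, 5.0, 1.0]]), teacher, labels=np.array([0]))
```

A student whose logits match the teacher's gets a near-zero loss, while a confidently wrong student is penalized by both terms; over-compressed students fail because they cannot drive this mismatch down, which is the capacity-gap problem KDM targets.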
- Published: 2022
12. A deep reinforcement learning based method for real-time path planning and dynamic obstacle avoidance.
- Author: Chen, Pengzhan, Pei, Jiean, Lu, Weiqing, and Li, Mingzhen
- Subjects: REWARD (Psychology); MACHINE learning; REINFORCEMENT learning; STATISTICAL sampling
- Abstract:
In a dynamic environment, moving obstacles make path planning for a manipulator very difficult. Therefore, this paper proposes a path-planning method with dynamic obstacle avoidance for the manipulator based on the soft actor-critic (SAC) deep reinforcement learning algorithm. To avoid moving obstacles in the environment and plan in real time, we design a comprehensive reward function combining dynamic obstacle avoidance and target approach. To address the low sample utilization caused by random sampling, prioritized experience replay (PER) is employed to change the weights of samples and thereby improve sampling efficiency. In addition, we carry out simulation experiments and report the results, which show that this method can effectively avoid moving obstacles in the environment and complete the planning task with a high success rate. [ABSTRACT FROM AUTHOR]
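The PER mechanism mentioned above can be sketched as a buffer that samples transitions with probability proportional to a power of their TD error and corrects the resulting bias with importance-sampling weights (a minimal sketch; the class name and the exponents `alpha`/`beta` follow the usual PER convention, not details from this paper):

```python
import numpy as np

class PrioritizedReplay:
    """Minimal proportional prioritized experience replay buffer."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.prio = [], []

    def add(self, transition, td_error):
        # Priority grows with the magnitude of the TD error.
        self.data.append(transition)
        self.prio.append((abs(td_error) + 1e-6) ** self.alpha)
        if len(self.data) > self.capacity:
            self.data.pop(0)
            self.prio.pop(0)

    def sample(self, batch_size, beta=0.4, rng=None):
        if rng is None:
            rng = np.random.default_rng()
        p = np.asarray(self.prio)
        p = p / p.sum()
        idx = rng.choice(len(self.data), size=batch_size, p=p)
        # Importance-sampling weights correct the non-uniform sampling bias.
        w = (len(self.data) * p[idx]) ** (-beta)
        return [self.data[i] for i in idx], w / w.max(), idx

buf = PrioritizedReplay(capacity=100)
for i in range(10):
    buf.add(("state", "action", float(i)), td_error=float(i))
batch, weights, idx = buf.sample(32, rng=np.random.default_rng(0))
```

High-error transitions dominate the sampled batch, which is how PER raises sample efficiency over the uniform replay used in vanilla SAC.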
- Published: 2022
13. Perspective view of autonomous control in unknown environment: Dual control for exploitation and exploration vs reinforcement learning.
- Author: Chen, Wen-Hua
- Subjects: REINFORCEMENT learning; MACHINE learning; PROBLEM solving; DECISION making; ACTIVE learning
- Abstract:
This paper overviews and discusses the relationship between Reinforcement Learning (RL) and the recently developed Dual Control for Exploitation and Exploration (DCEE). It is argued that there are two related but quite distinct approaches, namely control and machine learning, to tackling the intractability arising in optimal decision-making/control problems. In the control approach, the original problems (of an infinite horizon) are approximated by finite-horizon problems and solved online by taking advantage of the availability of computing power. In the machine learning approach, the optimal solutions are approximated through iterations, or (offline) training through trials when models are not available. When dealing with unknown environments, DCEE, as a technique developed from the control approach, could potentially solve similar problems to RL while offering a number of advantages, most notably coping with uncertainty in environments/tasks, high efficiency in learning through balancing exploitation and exploration, and potential for establishing formal properties such as stability. The links between DCEE and other relevant methods such as dual control, Model Predictive Control and, particularly, Active Inference in neuroscience are discussed. The latter provides a strong biological endorsement for DCEE. The methods and discussions are illustrated by autonomous source search using a robot. It is concluded that DCEE provides a promising, complementary approach to RL, and that more research is required to develop it into a generic theory and fully realise its potential. The relationships revealed in this paper provide insights into these relevant methods and facilitate cross-fertilisation between control, machine learning and neuroscience for developing autonomous control under uncertain environments. [ABSTRACT FROM AUTHOR]
- Published: 2022
14. EvoSTGAT: Evolving spatiotemporal graph attention networks for pedestrian trajectory prediction.
- Author: Tang, Haowen, Wei, Ping, Li, Jiapeng, and Zheng, Nanning
- Subjects: PEDESTRIANS; VIDEO surveillance; SOCIAL influence; INFORMATION modeling; MACHINE learning; SOCIAL interaction
- Abstract:
• This paper addresses the problem of pedestrian trajectory prediction, which plays a significant role in many applications such as human-robot cooperation and video surveillance.
• It proposes an evolving spatiotemporal graph attention network model, which employs a time-varying graph convolution to extract time-varying features and an evolving attention mechanism to describe the recursive temporal interactions.
• It improves pedestrian trajectory prediction results and proves the strength of the model with ablation studies.

Predicting pedestrian trajectories is an essential task in many applications. While previous studies based on graphs seek to model spatiotemporal information among pedestrian interactions, most of them neglect the recursive and continuous relations between neighboring time points. In this paper, we propose an evolving spatiotemporal graph attention network to predict the future trajectories of pedestrians. This model considers the evolving relations of social interactions between contiguous time points and uses coordinates. The interaction is modeled by an evolving and dynamic attention mechanism. The social influence of each pedestrian in the current frame evolves from that of the last frame and is utilized to generate the social influence of the next frame. The proposed model was tested on two challenging datasets and the experimental results prove the strength of the model. [ABSTRACT FROM AUTHOR]
- Published: 2022
15. Analysis methods of coronary artery intravascular images: A review.
- Author: Huang, Chenxi, Wang, Jian, Xie, Qiang, and Zhang, Yu-Dong
- Subjects: DEEP learning; CORONARY arteries; HEART disease diagnosis; INTRAVASCULAR ultrasonography; OPTICAL coherence tomography; IMAGE analysis; MACHINE learning
- Abstract:
Coronary artery disease is among the diseases from which humans suffer most. Intravascular coronary arterial image analysis consists of denoising, segmentation, detection, and three-dimensional reconstruction, and is of significant value for the auxiliary diagnosis and treatment of coronary artery disease. Intravascular ultrasound (IVUS) and intravascular optical coherence tomography (IVOCT) are the two most commonly applied intravascular coronary arterial imaging techniques. Based on these fundamental imaging techniques, in recent years many advanced technologies, from traditional machine learning algorithms to deep learning methods, have been employed in the analysis of intravascular coronary arterial images and have made huge progress in this field. In this survey, we reviewed more than one hundred papers published in top journals or conferences such as Neural Networks and MICCAI. These papers proposed approaches or schemes for intravascular coronary arterial image analysis, including lumen border segmentation, atherosclerotic plaque characterization, media-adventitia segmentation, stent strut detection, and three-dimensional reconstruction. Our survey begins by introducing coronary artery intravascular imaging techniques, essential neural networks, and deep learning, and then presents an across-the-board review of the methods, applications, and trends of intravascular image analysis. This survey is more comprehensive than other articles, not only in its scope and number of references but also in discussing the future direction of this field. Compared to other review papers in this field, this article can assist beginners in constructing a basic knowledge frame of coronary artery intravascular image analysis methods and brings state-of-the-art progress in this field to fellow researchers. We hope this paper benefits both beginners in coronary arterial image analysis and experienced researchers. [ABSTRACT FROM AUTHOR]
- Published: 2022
16. A review of regression and classification techniques for analysis of common and rare variants and gene-environmental factors.
- Author: Miller, Anthony, Panneerselvam, John, and Liu, Lu
- Subjects: GENOME-wide association studies; GENOTYPE-environment interaction; MACHINE learning; TYPE 1 diabetes; GENOMICS
- Abstract:
Statistical techniques incorporated with machine-learning algorithms, in unison with gene-environment interaction, are giving unparalleled understanding of complex diseases. Accurate analysis and intricate capturing of common, rare, and low-MAF (Minor Allele Frequency) variants, alongside gene-environment interaction, is pivotal for reliable and accurate classification of complex diseases. Various complex diseases, including Type 1 and Type 2 diabetes alongside the vastly under-researched LADA (Latent Autoimmune Diabetes in Adults), require further investigation alongside significant machine learning research to gain a deeper understanding of the disease complexities. Despite existing efforts, an ideal combination of statistical techniques with optimal machine-learning algorithms that can accurately capture and model the gene-environment interaction is lacking. Exploring future methods while simultaneously exploiting modern-day computational methods in genomic analysis, this paper investigates both the future and present interaction of statistical analysis techniques, machine-learning algorithms and ensembles with gene-environmental factors. In this context, this paper firstly presents a conceptual understanding of genomic conventions; secondly, it surveys potential future machine learning algorithms alongside an extensive analysis of a range of classification, regression and ensemble techniques, exhibiting their imperative relationships and roles in investigating and classifying common and rare variants and a wide array of gene-environmental factors; and thirdly, the utilisation of statistical techniques in Genome-Wide Association Studies is scrutinised whilst analysing common, rare and low-MAF variants. As an important contribution, this paper identifies efficient machine-learning algorithms alongside ensemble models and future potential analysis techniques, and exhibits their inherent characteristics that can enhance the reliability and accuracy of gene-environment classification analysis. [ABSTRACT FROM AUTHOR]
- Published: 2022
17. Noise/fault aware regularization for incremental learning in extreme learning machines.
- Author: Wong, Hiu-Tung, Leung, Ho-Chun, Leung, Chi-Sing, and Wong, Eric
- Subjects: MACHINE learning; NOISE; NONLINEAR regression; FAULT tolerance (Engineering); NONLINEAR equations; FAULT-tolerant computing; FAULT diagnosis
- Abstract:
• This paper develops a noise/fault aware training objective for incremental ELM.
• This paper uses two representative algorithms to develop two noise-aware ELMs.
• The two proposed algorithms are much better than existing ones.
• The multiple-set concept can further enhance the performance.
• We can make other non-noise-tolerant algorithms noise tolerant.

This paper investigates noise/fault tolerant incremental algorithms for the extreme learning machine (ELM) concept. Existing incremental ELM algorithms can be classified into two approaches: non-recomputation and recomputation. This paper first formulates a noise/fault aware objective function for nonlinear regression problems. Instead of developing noise/fault aware algorithms for the two computational approaches one by one, this paper uses two representative incremental algorithms, namely incremental ELM (I-ELM) and error-minimized ELM (EM-ELM), to develop two noise/fault aware incremental algorithms. The proposed algorithms are called generalized I-ELM (GI-ELM) and generalized EM-ELM (GEM-ELM). The GI-ELM adds k hidden nodes to the existing network at each incremental step without recomputing the existing weights. For a fair comparison, we consider a modified version of I-ELM as a comparison algorithm. The simulations demonstrate that the noise/fault tolerance of the proposed GI-ELM is better than that of the modified I-ELM. In the GEM-ELM, k hidden nodes are added to the existing network at each incremental step, while all output weights are recomputed based on a recursive formula. We also consider a modified version of EM-ELM as a comparison algorithm. The simulations demonstrate that the noise/fault tolerance of the proposed GEM-ELM is better than that of the modified EM-ELM. Moreover, we demonstrate that the multiple-set concept can further enhance the performance of the two proposed algorithms. Following our research results, one can make some non-noise/fault-tolerant incremental algorithms noise/fault tolerant. [ABSTRACT FROM AUTHOR]
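The plain (non-noise-aware) I-ELM step that GI-ELM generalizes can be sketched as adding one random hidden node at a time and fitting its output weight to the current residual in closed form (a single-output NumPy sketch under illustrative settings; the paper's noise/fault-aware objective modifies this update):

```python
import numpy as np

def i_elm(X, y, n_nodes=300, rng=None):
    """Basic incremental ELM: add one random sigmoid hidden node per step
    and fit its output weight to the residual by least squares."""
    if rng is None:
        rng = np.random.default_rng(0)
    residual = y.astype(float).copy()
    nodes = []
    for _ in range(n_nodes):
        w = rng.normal(size=X.shape[1])          # random input weights
        b = rng.normal()                          # random bias
        h = 1.0 / (1.0 + np.exp(-(X @ w + b)))    # activations of the new node
        beta = (h @ residual) / (h @ h)           # closed-form output weight
        residual -= beta * h                      # projection: residual shrinks
        nodes.append((w, b, beta))
    return nodes, residual

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (200, 1))
y = np.sin(3.0 * X[:, 0])
nodes, residual = i_elm(X, y)
rmse_before = np.sqrt(np.mean(y**2))
rmse_after = np.sqrt(np.mean(residual**2))
```

Each step is a one-dimensional least-squares projection, so the residual norm is non-increasing; the noise/fault-aware variants replace this plain residual objective with one that also penalizes sensitivity to weight noise and node faults.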
- Published: 2022
18. Neurocomputing for internet of things: Object recognition and detection strategy.
- Author: Qureshi, Kashif Naseer, Kaiwartya, Omprakash, Jeon, Gwanggil, and Piccialli, Francesco
- Subjects: OBJECT recognition (Computer vision); INTERNET of things; ARTIFICIAL intelligence; SMART devices; MACHINE learning
- Abstract:
Modern integrated technologies have changed traditional systems by using more advanced machine learning and artificial intelligence methods, new-generation standards, and smart, intelligent devices. New integrated networks like the Internet of Things (IoT) and 5G standards offer various benefits and services. However, these networks suffer from multiple object detection, localization, and classification issues. Convolutional Neural Networks (CNN) and their variants have been adopted for object detection, classification, and localization in IoT networks to create autonomous devices that make decisions and perform tasks without human intervention, and they are helpful for learning in-depth features. Motivated by these facts, this paper investigates existing object detection and recognition techniques using CNN models in IoT networks. After a detailed comparison, this paper presents a Convolutional Neural Network for 5G-Enabled Internet of Things Network (CNN-5GIoT) model for moving and static objects in IoT networks. The proposed model is evaluated against existing models to check the accuracy of real-time tracking. The proposed model is more efficient for real-time object detection and recognition than conventional methods. [ABSTRACT FROM AUTHOR]
- Published: 2022
19. Simulation-based evaluation of model-free reinforcement learning algorithms for quadcopter attitude control and trajectory tracking.
- Author: Caffyn Yuste, Pablo, Iglesias Martínez, José Antonio, and Sanchis de Miguel, María Araceli
- Subjects: MACHINE learning; REINFORCEMENT learning; SELECTION (Plant breeding); ALGORITHMS; SEEDS
- Abstract:
General-use quadcopters have been under development for over a decade, but many of their potential applications are still under evaluation and have not yet been adopted in many of the areas that could benefit from their use. While the current generation of quadcopters uses a mature set of control algorithms, the next steps, especially as autonomous features are developed, should involve a more complex learning capability to be able to adapt to unknown circumstances in a safe and reliable way. This paper provides baseline quadcopter control models learnt using eight general reinforcement learning (RL) algorithms in a simulated environment, with the object of establishing a reference performance, both in terms of precision and generation cost, for a simple set of trajectories. Each algorithm uses a tailored set of hyperparameters while, additionally, the influence of random seeds is also studied. While not all algorithms converge within the allocated computing budget, the more complex ones are able to provide stable and precise control models. This paper recommends the use of the TD3 algorithm as a reference for comparison with new RL algorithms. Additional guidance for future work is provided based on the weaknesses identified in the learning process, especially regarding the strong dependence of agent performance on random seeds.

• Quadcopter control models learnt using model-free reinforcement learning algorithms.
• Trained and tested in a simulated environment using OpenAI gym and pybullet.
• Trained using small displacements, tested on trajectories unknown to the agent.
• TD3 algorithm shows the most consistent performance.
• All algorithms show strong dependence on the random seed selection. [ABSTRACT FROM AUTHOR]
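The recommended TD3 algorithm differs from its predecessor DDPG mainly through twin critics and target-policy smoothing; its critic-target computation can be sketched as follows (a minimal NumPy sketch; the function and parameter names are illustrative, not from the paper's implementation):

```python
import numpy as np

def td3_target(q1, q2, next_state, next_action_det, reward, gamma=0.99,
               noise_std=0.2, noise_clip=0.5, act_limit=1.0, rng=None):
    """TD3 critic target: perturb the target policy's action with clipped
    Gaussian noise, then take the minimum of the twin critics."""
    if rng is None:
        rng = np.random.default_rng(0)
    noise = np.clip(rng.normal(0.0, noise_std, size=next_action_det.shape),
                    -noise_clip, noise_clip)
    a = np.clip(next_action_det + noise, -act_limit, act_limit)
    q_min = np.minimum(q1(next_state, a), q2(next_state, a))  # guards against overestimation
    return reward + gamma * q_min

# Toy critics that disagree: taking the minimum keeps the pessimistic estimate.
q1 = lambda s, a: 1.0
q2 = lambda s, a: 3.0
target = td3_target(q1, q2, next_state=None,
                    next_action_det=np.zeros(2), reward=0.5)
```

Using the smaller of two independently trained critics counters the value overestimation that destabilizes DDPG, which is one reason TD3 tends to give the consistent performance reported above.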
- Published: 2024
20. Toward cross-subject and cross-session generalization in EEG-based emotion recognition: Systematic review, taxonomy, and methods.
- Author: Apicella, Andrea, Arpaia, Pasquale, D'Errico, Giovanni, Marocco, Davide, Mastrati, Giovanna, Moccaldi, Nicola, and Prevete, Roberto
- Subjects: EMOTION recognition; FEATURE extraction; GENERALIZATION; DATABASE searching; ELECTROENCEPHALOGRAPHY
- Abstract:
This paper presents a systematic review of machine-learning strategies for improving generalization in electroencephalography-based emotion classification, with a particular focus on cross-subject and cross-session generalization. In this context, the non-stationarity of electroencephalographic (EEG) signals is a critical issue and can lead to the Dataset Shift problem. Several architectures and methods have been proposed to address this issue, mainly based on transfer learning. In this review, 449 papers were retrieved from the Scopus, IEEE Xplore and PubMed databases through a search query focusing on modern machine learning techniques for generalization in EEG-based emotion assessment. Among these papers, 79 were found eligible based on their relevance to the problem. Studies lacking a specific cross-subject or cross-session validation strategy, or making use of other biosignals as support, were excluded. On the basis of an analysis of the selected papers, a taxonomy of the studies employing Machine Learning (ML) methods is proposed, together with a brief discussion of the different ML approaches involved. The studies reporting the best results in terms of average classification accuracy were identified, suggesting that transfer learning methods perform better than other approaches. A discussion is provided on the impact of (i) the emotion theoretical models and (ii) psychological screening of the experimental sample on classifier performance. [Display omitted] • The non-stationarity of EEG signals can lead to the Dataset Shift problem. • Transfer learning methods improve generalizability in EEG-based emotion classification. • Adaptive feature extraction, also in combination with transfer learning, is promising for generalization. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
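The cross-subject validation strategy the review treats as an eligibility criterion is commonly implemented as leave-one-subject-out evaluation, so that no trials from the test subject leak into training. A small scikit-learn sketch on synthetic stand-in data (real EEG features and labels would replace `X` and `y`; the subject-specific offset mimics inter-subject non-stationarity):

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# Synthetic stand-in for EEG features: 6 subjects x 30 trials, 8 features,
# with a subject-specific offset mimicking inter-subject non-stationarity.
subjects = np.repeat(np.arange(6), 30)
X = rng.normal(size=(180, 8)) + subjects[:, None] * 0.5
y = rng.integers(0, 2, size=180)   # binary emotion label (random here)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
# Leave-one-subject-out: each fold tests on one entirely unseen subject.
scores = cross_val_score(clf, X, y, cv=LeaveOneGroupOut(), groups=subjects)
```

A within-subject (trial-shuffled) split would typically report much higher, and misleading, accuracy on the same data.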
21. Differentially private stochastic gradient descent with low-noise.
- Author
-
Wang, Puyu, Lei, Yunwen, Ying, Yiming, and Zhou, Ding-Xuan
- Subjects
- *
MACHINE learning , *CONVEX sets , *PRIVACY - Abstract
Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection. This paper addresses the practical and theoretical importance of developing privacy-preserving machine learning algorithms that ensure good performance while preserving privacy. We focus on the privacy and utility (measured by excess risk bounds) of differentially private stochastic gradient descent (SGD) algorithms in the setting of stochastic convex optimization. Specifically, we examine the pointwise problem in the low-noise setting, for which we derive sharper excess risk bounds for the differentially private SGD algorithm. In the pairwise learning setting, we propose a simple differentially private SGD algorithm based on gradient perturbation. Furthermore, we develop novel utility bounds for the proposed algorithm, proving that it achieves optimal excess risk rates even for non-smooth losses. Notably, we establish fast learning rates for privacy-preserving pairwise learning under the low-noise condition, the first result of its kind. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
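The gradient-perturbation mechanism this line of work builds on clips each per-sample gradient and adds calibrated Gaussian noise before the update. A toy least-squares sketch (the noise scale here is purely illustrative; a real deployment would calibrate `sigma` from the target privacy budget (ε, δ)):

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(w, X, y, lr=0.1, clip=1.0, sigma=0.1):
    """One gradient-perturbation step for a least-squares loss: per-sample
    gradients are L2-clipped to `clip`, averaged, and Gaussian noise of
    scale sigma * clip / n is added before the update."""
    n = len(y)
    residual = X @ w - y
    grads = residual[:, None] * X                     # per-sample gradients
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads / np.maximum(1.0, norms / clip)     # clip each gradient
    g = grads.mean(axis=0)
    g = g + rng.normal(0.0, sigma * clip / n, size=g.shape)  # perturb
    return w - lr * g

X = rng.normal(size=(200, 5))
w_true = np.arange(5.0)
y = X @ w_true

def loss(w):
    return np.mean((X @ w - y) ** 2)

w = np.zeros(5)
for _ in range(300):
    w = dp_sgd_step(w, X, y)
```

Clipping bounds each sample's influence on the update, which is what makes the added Gaussian noise sufficient for differential privacy.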
22. Progressive expansion: Cost-efficient medical image analysis model with reversed once-for-all network training paradigm.
- Author
-
Lim, Shin Wei, Chan, Chee Seng, Mohd Faizal, Erma Rahayu, and Ewe, Kok Howg
- Subjects
- *
COMPUTER-assisted image analysis (Medicine) , *IMAGE analysis , *DIAGNOSTIC imaging , *ARTIFICIAL intelligence , *IMAGE segmentation , *HIPPOCAMPUS (Brain) - Abstract
Low computational cost artificial intelligence (AI) models are vital in promoting the accessibility of real-time medical services in underdeveloped areas. The recent Once-For-All (OFA) network can directly produce a set of sub-network designs (without retraining) using the Progressive Shrinking (PS) algorithm; however, this method suffers from clear inefficiencies in training resources and time. In this paper, we propose a new OFA training algorithm, namely Progressive Expansion (ProX), to train medical image analysis models. It is the reverse of the PS paradigm: we train the OFA network from the minimum configuration and gradually expand the training to support larger configurations. Empirical results show that the proposed paradigm can reduce training time by up to 68%, while still producing sub-networks with similar or better accuracy compared to those trained with OFA-PS on the ROCT (classification), BRATS and Hippocampus (3D segmentation) public medical datasets. The code implementation for this paper is accessible at: https://github.com/shin-wl/ProX-OFA. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
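The reversed paradigm can be illustrated at toy scale: start training at the smallest width, then widen the network while keeping the trained weights, so every intermediate width is already a usable sub-network. A NumPy sketch with a one-hidden-layer regressor standing in for the OFA supernet (this is a conceptual analogue, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy progressive expansion: train at the smallest width, then expand
# (keeping the trained weights) and keep training.
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])

def train(W, b, v, epochs=300, lr=0.05):
    for _ in range(epochs):
        H = np.tanh(X @ W + b)
        err = H @ v - y
        dH = (err[:, None] * v) * (1 - H ** 2)   # backprop through tanh
        v -= lr * H.T @ err / len(y)
        W -= lr * X.T @ dH / len(y)
        b -= lr * dH.mean(axis=0)
    return W, b, v

W = b = v = None
losses = []
for width in [4, 8, 16]:
    if W is None:                # minimum configuration
        W, b, v = rng.normal(size=(1, width)), np.zeros(width), np.zeros(width)
    else:                        # expand with fresh, initially inactive units
        grow = width - W.shape[1]
        W = np.hstack([W, rng.normal(size=(1, grow))])
        b = np.concatenate([b, np.zeros(grow)])
        v = np.concatenate([v, np.zeros(grow)])
    W, b, v = train(W, b, v)
    losses.append(np.mean((np.tanh(X @ W + b) @ v - y) ** 2))
```

New units enter with zero output weights, so expansion never disturbs what the smaller configuration has already learned.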
23. A comprehensive survey and taxonomy on privacy-preserving deep learning.
- Author
-
Tran, Anh-Tu, Luong, The-Dung, and Huynh, Van-Nam
- Subjects
- *
DEEP learning , *NATURAL language processing , *DATA privacy , *IMAGE recognition (Computer vision) , *MACHINE learning , *AUTOMATIC speech recognition - Abstract
Deep learning (DL) has been shown to be very effective in many application domains of machine learning (ML), including image classification, voice recognition, natural language processing, and bioinformatics. The success of DL techniques is directly related to the availability of large amounts of training data. However, in many cases the data are sensitive and should be protected to preserve user privacy. Privacy-preserving deep learning (PPDL) has thus become a very active research field aiming to ensure that the training and use of DL models are productive without exposing or leaking information about the data. This paper aims to provide a comprehensive survey of PPDL. We concentrate on the risks that affect data privacy in DL and conduct a detailed investigation into the models that ensure privacy. Finally, we propose a set of evaluation criteria, detailing the advantages and disadvantages of the solutions. Based on the analyzed strengths and weaknesses, the paper highlights some important research problems and application cases that have not yet been studied, pointing to open research directions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. What-Where-When Attention Network for video-based person re-identification.
- Author
-
Zhang, Chenrui, Chen, Ping, Lei, Tao, Wu, Yangxu, and Meng, Hongying
- Subjects
- *
VIDEO surveillance , *MACHINE learning , *TIME-varying networks - Abstract
Video-based person re-identification plays a critical role in intelligent video surveillance by learning temporal correlations from consecutive video frames. Most existing methods aim to handle the challenging variations of pose, occlusion, background and so on by using attention mechanisms. Almost all of them focus on occlusion and learn occlusion-invariant video representations by discarding the occluded areas or frames, even though the remaining areas in those frames contain useful spatial information and temporal cues. To overcome these drawbacks, this paper proposes a comprehensive attention mechanism covering what, where, and when to pay attention in discriminative spatial-temporal feature learning, namely the What-Where-When Attention Network (W3AN). Concretely, W3AN designs a spatial attention module to focus on pedestrian identity and salient attributes via an importance-estimating layer (what and where), and a temporal attention module to calculate frame-level importance (when), which is embedded into a graph attention network to exploit temporal attention features rather than computing a weighted average feature over video frames as existing methods do. Moreover, experiments on three widely recognized datasets demonstrate the effectiveness of the proposed W3AN model, and the discussion of its major modules elaborates the contributions of this paper. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
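The frame-level temporal attention described in this abstract replaces a plain average over frames with learned importance weights. A minimal sketch of that aggregation step (the scoring vector `w` is a toy stand-in; W3AN itself embeds the frame scores in a graph attention network):

```python
import numpy as np

def temporal_attention(frames, w):
    """Frame-level attention: score each frame's feature vector, softmax the
    scores over time, and aggregate with the weights instead of averaging."""
    scores = frames @ w
    a = np.exp(scores - scores.max())   # numerically stable softmax
    a = a / a.sum()
    return a, a @ frames

rng = np.random.default_rng(0)
frames = rng.normal(size=(8, 32))       # 8 frames, 32-D feature each
w = rng.normal(size=32)                 # toy scoring vector
alpha, video_feat = temporal_attention(frames, w)
```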
25. A method of traffic police detection based on attention mechanism in natural scene.
- Author
-
Zheng, Ying, Bao, Hong, Meng, Chaochao, and Ma, Nan
- Subjects
- *
TRAFFIC police , *TRAFFIC monitoring , *POLICE training , *DRIVERLESS cars , *FRAMES (Social sciences) , *MACHINE learning - Abstract
The complex and varied urban road conditions, especially at intersections, have always been a difficult and pivotal component in the study of driverless technology. In China, driverless cars must understand the gestures of traffic police. To identify traffic police gestures at an intersection, the key step is to detect the traffic police themselves. At present, research on traffic police detection is still in its infancy: existing methods suffer from slow detection speed and other real-time problems, and there is no standardized traffic police dataset either. To address the real-time problems, this paper introduces the attention mechanism and proposes a new real-time traffic police detection method based on it. The proposed method is robust and can quickly complete the target detection task. To address the dataset problem, this paper analyzes and discloses similar published datasets. In addition, this paper collects 24,530 items of video data on actual roads, and extracts and saves 12,000 pictures containing traffic police at a frame rate of 30 FPS as a training and validation dataset for traffic police detection. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
26. Convex formulation for multi-task L1-, L2-, and LS-SVMs.
- Author
-
Ruiz, Carlos, Alaíz, Carlos M., and Dorronsoro, José R.
- Subjects
- *
ERROR functions , *SUPPORT vector machines , *MACHINE learning - Abstract
Quite often a machine learning problem lends itself to being split into several well-defined subproblems, or tasks. The goal of Multi-Task Learning (MTL) is to leverage the joint learning of the problem from two different perspectives: on the one hand, a single overall model, and on the other hand, task-specific models. In this way, the solution found by MTL may be better than either the common or the task-specific models alone. Starting with the work of Evgeniou et al., support vector machines (SVMs) have lent themselves naturally to this approach. This paper proposes a convex formulation of MTL for the L1-, L2- and LS-SVM models that results in dual problems quite similar to the single-task ones but with multi-task kernels; in turn, this makes it possible to train the convex MTL models using standard solvers. As an alternative approach, the direct optimal combination of already trained common and task-specific models can also be considered. In this paper, a procedure to compute the optimal combining parameter with respect to four different error functions is derived. As shown experimentally, the proposed convex MTL approach generally performs better than the alternative optimal convex combination, and both are better than the straight use of either common or task-specific models. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
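A standard multi-task kernel in the Evgeniou et al. line of work that the abstract cites multiplies a base kernel by (μ + 1[s = t]), so the shared part acts across all tasks and the indicator adds a task-specific part. A minimal sketch with a precomputed kernel in scikit-learn (the synthetic two-task data and the choice μ = 1 are illustrative, not the paper's setup):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Two related binary tasks: a shared separating direction plus a
# task-specific tilt.
def make_task(shift, n=60):
    X = rng.normal(size=(n, 2))
    y = np.sign(X[:, 0] + shift * X[:, 1])
    return X, y

X1, y1 = make_task(+0.3)
X2, y2 = make_task(-0.3)
X = np.vstack([X1, X2])
y = np.concatenate([y1, y2])
task = np.array([0] * 60 + [1] * 60)

def mtl_kernel(Xa, ta, Xb, tb, mu=1.0):
    """Multi-task kernel (mu + 1[s == t]) * <x, z>: mu weighs the shared
    part, the indicator adds the task-specific part."""
    same = (ta[:, None] == tb[None, :]).astype(float)
    return (mu + same) * (Xa @ Xb.T)

K = mtl_kernel(X, task, X, task)
clf = SVC(kernel="precomputed").fit(K, y)
acc = clf.score(K, y)
```

This is exactly the property the abstract exploits: the dual problem stays a standard SVM problem, only the kernel matrix changes.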
27. Uncertainty quantification in extreme learning machine: Analytical developments, variance estimates and confidence intervals.
- Author
-
Guignard, Fabian, Amato, Federico, and Kanevski, Mikhail
- Subjects
- *
MACHINE learning , *UNCERTAINTY , *PYTHON programming language , *HETEROSCEDASTICITY , *CONFIDENCE intervals - Abstract
• Analytical developments support the understanding of ELM variability. • Special attention is paid to the impact of input weight randomness. • Variance estimates for the homoskedastic, heteroskedastic and regularized cases are given. • The possibility of constructing accurate confidence intervals is discussed. • A new Python library allows the computation of the proposed estimates. Uncertainty quantification is crucial to assess the prediction quality of a machine learning model. In the case of Extreme Learning Machines (ELM), most methods proposed in the literature make strong assumptions on the data, ignore the randomness of input weights, or neglect the bias contribution in confidence interval estimation. This paper presents novel estimates that overcome these constraints and improve the understanding of ELM variability. Analytical derivations are provided under general assumptions, supporting the identification and interpretation of the contributions of different variability sources. Under both homoskedasticity and heteroskedasticity, several variance estimates are proposed, investigated, and numerically tested, showing their effectiveness in replicating the expected variance behaviours. Finally, the feasibility of confidence interval estimation is discussed with a critical eye, raising ELM users' awareness of some of its pitfalls. The paper is accompanied by a scikit-learn compatible Python library enabling efficient computation of all estimates discussed herein. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
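The input-weight randomness the bullet points highlight can be probed empirically by refitting the ELM with fresh random input weights and measuring the spread of the predictions. A NumPy sketch (a Monte Carlo baseline over weight draws, not the paper's analytical estimators):

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_predict(X_train, y_train, X_test, n_hidden=50, reg=1e-3, rng=rng):
    """One ELM fit: random input weights, tanh hidden layer, ridge-solved
    output weights (the only trained part)."""
    d = X_train.shape[1]
    W = rng.normal(size=(d, n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X_train @ W + b)
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y_train)
    return np.tanh(X_test @ W + b) @ beta

X = rng.uniform(-1, 1, size=(100, 1))
y = np.sin(3 * X[:, 0])
X_test = np.linspace(-1, 1, 20)[:, None]

# Empirical variance from input-weight randomness: refit with fresh random
# weights and look at the spread of test predictions.
preds = np.stack([elm_predict(X, y, X_test) for _ in range(30)])
var_hat = preds.var(axis=0)
```

The paper's point is precisely that this source of variance is often ignored when ELM confidence intervals are built.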
28. Intrusion detection approach based on optimised artificial neural network.
- Author
-
Choraś, Michał and Pawlicki, Marek
- Subjects
- *
ARTIFICIAL neural networks , *MACHINE learning , *BIOLOGICALLY inspired computing - Abstract
Intrusion detection, the ability to detect malware and other attacks, is a crucial aspect of ensuring cybersecurity, as is the ability to identify the myriad of attack types. Artificial Neural Networks (like other bio-inspired machine learning approaches) are an established and proven method for accurate classification. ANNs are extremely versatile: different setups can achieve significantly different classification results. The main objective and contribution of this paper is an evaluation of how hyperparameters influence the final classification result. A wide range of ANN setups is compared. We performed our experiments on two benchmark datasets, namely NSL-KDD and CICIDS2017. The most effective arrangement achieves a multi-class classification accuracy of 99.909% on an established benchmark dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
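The hyperparameter sensitivity this paper evaluates can be explored with a plain grid search over ANN setups. A scikit-learn sketch on synthetic stand-in data (NSL-KDD and CICIDS2017 must be obtained separately, and the grid here is deliberately tiny):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for an intrusion-detection dataset.
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Compare ANN setups: hidden layer sizes and activation functions.
grid = GridSearchCV(
    MLPClassifier(max_iter=500, random_state=0),
    param_grid={"hidden_layer_sizes": [(16,), (64,), (64, 32)],
                "activation": ["relu", "tanh"]},
    cv=3)
grid.fit(X_tr, y_tr)
best_acc = grid.best_estimator_.score(X_te, y_te)
```

On real intrusion data the grid would span many more setups, which is where the significantly different classification results the abstract mentions come from.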
29. Predicting energy cost of public buildings by artificial neural networks, CART, and random forest.
- Author
-
Zekić-Sušac, Marijana, Has, Adela, and Knežević, Marinela
- Subjects
- *
ARTIFICIAL neural networks , *RANDOM forest algorithms , *PUBLIC buildings , *CONSTRUCTION cost estimates , *REGRESSION trees , *MACHINE learning , *BUILDING repair - Abstract
• ANN, CART, and RF regression trees have shown potential in modeling energy cost. • Three different variable selection strategies were tested and compared. • Machine learning combined with the Boruta method produced the highest prediction accuracy. • The model extracted heating and occupational data as the most important predictors. • The created model could be used to assess the concept of smart buildings and cities. The paper deals with modeling the cost of energy consumed in public buildings by leveraging three machine learning methods: artificial neural networks, CART, and random forest regression trees. Energy consumption is one of the major issues in global and national policies; scientific efforts to create prediction models of energy consumption and cost are therefore highly important. One of the largest energy consumers in every state is its public sector, consisting of educational, health, public administration, military, and other types of public buildings. Recent technologies based on sensor networks and Big Data platforms enable the collection of large amounts of data that can be used to analyze energy consumption and cost. Real data from the Croatian public sector are used in this paper, including a large number of constructional, energetic, occupational, climate and other attributes. Algorithms for data pre-processing and for modeling with parameter optimization are suggested. Three strategies were tested: (1) with all available variables, (2) with filter-based variable selection, and (3) with wrapper-based variable selection integrating the Boruta algorithm and random forest. Prediction models of energy cost are created using two approaches: (a) comparative usage of artificial neural networks and two types of regression trees, CART and random forest, and (b) integration of RF-Boruta variable selection with machine learning methods for prediction. 
A cross-validation procedure was used to optimize the artificial neural network and regression tree topology, as well as to select the most appropriate activation function. Along with creating a prediction model, the aim of the paper was also to extract the relevant predictors of energy cost in public buildings, which are important in planning the construction or renovation of buildings. The results show that the second approach, which integrates machine learning with the Boruta method and uses the random forest algorithm for both variable reduction and prediction modeling, produced higher prediction accuracy than the individual usage of the three machine learning methods. Such findings confirm the potential of the hybrid machine learning methods suggested in previous research, here in favor of the random forest method over CART and artificial neural networks. Regarding variable selection, the model extracted heating and occupational data as the most important, followed by constructional, cooling, electricity, and lighting attributes. The model could be implemented in public building information systems and their IoT networks within the concept of smart buildings and smart cities. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
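The wrapper-based strategy (3) combines Boruta with random forest; Boruta's core trick is to compare each real feature's importance against shuffled "shadow" copies of the features. A single-iteration sketch (the full Boruta procedure iterates with statistical tests, and the feature counts here are synthetic):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# 5 informative + 5 noise features standing in for building attributes.
X, y = make_regression(n_samples=300, n_features=10, n_informative=5,
                       noise=5.0, random_state=0)

# Boruta's core idea: append shuffled "shadow" copies of every feature and
# keep only features whose importance beats the best shadow.
shadows = rng.permuted(X, axis=0)          # shuffle each column independently
X_aug = np.hstack([X, shadows])
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_aug, y)
imp = rf.feature_importances_
threshold = imp[10:].max()                 # best shadow importance
selected = np.where(imp[:10] > threshold)[0]
```

A feature that cannot beat its own randomized shadow carries no more information than noise, which is why the method doubles as a principled variable filter before prediction modeling.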
30. Label-guided Attention Distillation for lane segmentation.
- Author
-
Liu, Zhikang and Zhu, Lanyun
- Subjects
- *
MACHINE learning , *CONVOLUTIONAL neural networks , *TEACHER training - Abstract
[Display omitted] • We propose a novel distillation approach, LGAD, which uses structure hints in lane labels to guide the attention of lane segmentation networks to better capture long-range textural information. • We carefully investigate the inner mechanism of LGAD, the considerations in choosing among different layer-mimicking paths, and the optimization strategies. • Extensive experiments on two benchmark datasets demonstrate the effectiveness of the proposed LGAD in boosting the performance of lane segmentation networks. Contemporary segmentation methods are usually based on deep fully convolutional networks (FCNs). However, layer-by-layer convolutions with a growing receptive field are not good at capturing long-range contexts such as lane markers in the scene. In this paper, we address this issue by designing a distillation method that exploits label structure when training the segmentation network. The intuition is that the ground-truth lane annotations themselves exhibit internal structure. We broadcast these structure hints throughout a teacher network, i.e., we train a teacher network that consumes a lane label map as input and attempts to replicate it as output. Then, the attention maps of the teacher network are adopted as supervisors of the student segmentation network. The teacher network, with label structure information embedded, knows distinctly where the convolutional layers should pay visual attention. The proposed method is named Label-guided Attention Distillation (LGAD). It turns out that the student network learns significantly better with LGAD than when learning alone. As the teacher network is discarded after training, our method does not increase inference time. Note that LGAD can be easily incorporated into any lane segmentation network. To validate the effectiveness of the proposed LGAD method, extensive experiments have been conducted on two popular lane detection benchmarks: TuSimple and CULane. 
The results show consistent improvement across a variety of convolutional neural network architectures. Specifically, we demonstrate the accuracy boost of LGAD on the lightweight model ENet; the resulting ENet-LGAD surpasses existing lane segmentation algorithms. The main contributions of this paper are the newly proposed distillation training strategy (LGAD) and a solid experimental investigation of its inner mechanism. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
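The distillation signal in this line of work is typically a mimicking loss between normalized spatial attention maps of student and teacher. A minimal sketch of one common activation-based form, the channel-wise sum of squared activations (the exact map LGAD uses may differ; this only illustrates the mechanism):

```python
import numpy as np

def attention_map(feat):
    """Spatial attention: channel-wise sum of squared activations,
    L2-normalized over spatial positions."""
    a = (feat ** 2).sum(axis=0)
    return a / (np.linalg.norm(a) + 1e-8)

def distill_loss(student_feat, teacher_feat):
    """Squared error between student and teacher attention maps."""
    return ((attention_map(student_feat) - attention_map(teacher_feat)) ** 2).sum()

rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 16, 16))   # features from the label-fed teacher
student = rng.normal(size=(8, 16, 16))   # features from the image-fed student
loss = distill_loss(student, teacher)
```

During training this loss is added to the ordinary segmentation loss; at inference time only the student remains, so the cost of the teacher vanishes.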
31. Active learning for road lane landmark inventory with V-ELM in highly uncontrolled image capture conditions.
- Author
-
Lopez-Guede, Jose Manuel, Izquierdo, Asier, Estevez, Julian, and Graña, Manuel
- Subjects
- *
ACTIVE learning , *MACHINE learning , *LABOR costs , *INVENTORIES , *RANDOM forest algorithms , *RADARSAT satellites - Abstract
Road landmark inventory is becoming an important data product for the maintenance of transport infrastructures. Several commercial sensors are available which include synchronized optical cameras that allow building 360° panoramic images of the surroundings of the vehicle used for road inspection. This paper is devoted to the analysis of such panorama images, specifically the area that contains the most relevant information. Road lane landmark detection is posed as a two-class classification problem that may be solved by machine learning approaches, such as Random Forest (RF) and ensembles of Extreme Learning Machines (V-ELM). Besides model parameter selection, a central problem is the construction of a labeled training and validation dataset that copes with the highly uncontrolled conditions of image capture. Moreover, human labor costs make image data labeling a very expensive process. This paper proposes an open-ended Active Learning (AL) approach involving a human oracle in the loop who provides the data labeling and can trigger the AL process when detection quality is degraded by a change in imaging conditions. The paper reports encouraging results over a collection of sample images selected from an industrial road landmark inventory operation. As an additional contribution, the paper assesses the ability of AL to overcome some of the issues raised by highly class-imbalanced datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
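The oracle-in-the-loop AL process can be sketched as pool-based uncertainty sampling: fit on the labeled set, have the oracle label the sample the model is least certain about, and repeat. A scikit-learn sketch with a synthetic oracle (logistic regression stands in for the V-ELM ensemble for brevity):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Pool-based active learning: the human oracle labels the pool sample
# the current model is least sure about.
X_pool = rng.normal(size=(500, 2))
y_pool = (X_pool[:, 0] + 0.5 * X_pool[:, 1] > 0).astype(int)  # oracle's labels

# Seed the labeled set with five examples per class.
labeled = list(np.where(y_pool == 1)[0][:5]) + list(np.where(y_pool == 0)[0][:5])

accs = []
for _ in range(20):
    clf = LogisticRegression().fit(X_pool[labeled], y_pool[labeled])
    accs.append(clf.score(X_pool, y_pool))
    proba = clf.predict_proba(X_pool)[:, 1]
    order = np.argsort(np.abs(proba - 0.5))   # closest to 0.5 = least sure
    nxt = next(i for i in order if i not in labeled)
    labeled.append(int(nxt))
```

Labeling the most uncertain samples first concentrates the oracle's expensive effort near the decision boundary, which is also where minority-class samples of an imbalanced dataset tend to get picked.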
32. Android malware detection through machine learning on kernel task structures.
- Author
-
Wang, Xinning and Li, Chong
- Subjects
- *
MACHINE learning , *MALWARE , *MALWARE prevention , *DATA structures , *SMARTPHONES , *DATA warehousing - Abstract
With the advent of smartphones, the popularity of free Android applications has risen rapidly. This has led to malicious Android apps being involuntarily installed, which violate user privacy or conduct attacks. Malware detection on Android platforms is therefore a growing concern, because the undesirable similarity between malicious and benign behavior can lead to slow detection and allow compromises to persist for comparatively long periods of time in infected phones. The first contribution of this paper is a multi-dimensional, kernel-feature-based framework with feature-weight-based detection (WBD), designed to categorize and comprehend the characteristics of Android malware and benign apps. Furthermore, a software agent is implemented for data collection and storage to scan thousands of benign and malicious apps automatically. We examine 112 kernel attributes of the executing task data structure in the Android system and evaluate the detection accuracy on a number of datasets of various dimensions. We find that memory- and signal-related features contribute to more precise classification than schedule-related features and the other descriptors of task states listed in our paper. In particular, memory-related features provide fine-grained classification policies that preserve higher classification precision than signal-related and other features. Furthermore, we study and evaluate 80 additional attributes of the Android kernel task structure observed in newly infected apps, prioritizing the 70 most significant features based on dimensionality reduction to optimize the efficiency of high-dimensional classification. 
Our second contribution is experimental: compared to existing techniques that use a short list of task structure features (16 or 32 features), our method achieves 94%–98% accuracy and a 2%–7% false positive rate, while detecting malware apps with reduced-dimensional features that speed up online malware detection and support offline malware inspection. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
33. Predicting short-term next-active-object through visual attention and hand position.
- Author
-
Jiang, Jingjing, Nan, Zhixiong, Chen, Hui, Chen, Shitao, and Zheng, Nanning
- Subjects
- *
ARTIFICIAL neural networks , *HUMAN-robot interaction , *HAND , *DISTRIBUTION (Probability theory) , *MACHINE learning - Abstract
Human intention prediction is of great significance in many applications, such as human-robot interaction and intelligent rehabilitation robots. This paper studies the problem of short-term next-active-object prediction in egocentric images. The short-term next-active-object is the object that a human is going to interact with in the near future, an embodiment of human intention. Most current methods use object-centered cues, such as the deviation of object appearance change and the unique shape of the egocentric object trajectory, to predict the next-active-object. In this paper, inspired by the fact that human intention is also revealed by human-centered cues, we propose a deep neural network model that integrates cues from visual attention and hand positions to predict the next-active-object. First, probability maps of visual attention and hand positions are constructed, and then the probability distribution of the next-active-object is generated. We experimentally compare our method with several baseline methods on two datasets and confirm its effectiveness. In addition, ablation experiments are conducted, and crucial points concerning the next-active-object are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
34. Meta weight learning via model-agnostic meta-learning.
- Author
-
Xu, Zhixiong, Chen, Xiliang, Tang, Wei, Lai, Jun, and Cao, Lei
- Subjects
- *
MACHINE learning , *REINFORCEMENT learning , *WEIGHT measurement , *WEIGHTS & measures - Abstract
While meta-learning approaches have achieved remarkable success, obtaining a stable and unbiased meta-learner remains a significant challenge, since the initial model of a meta-learner can be too biased towards existing tasks to adapt to new tasks. In order to avoid a biased meta-learner and improve its generalizability, this paper proposes a generic meta-learning method that learns an unbiased meta-learner across a variety of tasks before its initial model is adapted to unseen tasks. Specifically, this paper presents a meta weight learning method for minimizing the inequality of performance across different training tasks. An end-to-end training approach is introduced that allows the proposed algorithm to effectively learn the weights and initialize the network model. In addition, a variety of weight measurement methods are designed to test the effectiveness of different weight learning methods in improving the model-agnostic meta-learning algorithm. Simulation results show that the proposed meta weight learning method not only outperforms state-of-the-art meta-learning algorithms, but is also superior to other manually designed weight measurement methods on discrete and continuous control problems. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
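The idea of weighting tasks to equalize post-adaptation performance can be shown on a toy MAML-style problem: 1-D linear regression tasks, one inner gradient step per task, and task weights proportional to each task's post-adaptation loss. The weighting scheme and the finite-difference meta-gradient here are illustrative stand-ins, not the paper's method:

```python
import numpy as np

def task_data(a, n=20):
    """1-D regression task y = a * x on a fixed grid."""
    x = np.linspace(-1.0, 1.0, n)
    return x, a * x

def post_adapt_loss(w, a, inner_lr=0.1):
    """MSE after one inner gradient step of task adaptation."""
    x, y = task_data(a)
    grad = 2.0 * np.mean((w * x - y) * x)
    w_i = w - inner_lr * grad
    return np.mean((w_i * x - y) ** 2)

slopes = [0.5, 1.0, 2.0]   # a family of training tasks
w = 0.0                    # meta-initialization (a single scalar here)
for _ in range(200):
    losses = np.array([post_adapt_loss(w, a) for a in slopes])
    # Upweight the worst-performing tasks so the meta-initialization
    # is not biased toward the easy ones.
    weights = losses / losses.sum()
    # Finite-difference meta-gradient of the weighted objective.
    eps = 1e-4
    bumped = np.array([post_adapt_loss(w + eps, a) for a in slopes])
    meta_grad = np.sum(weights * (bumped - losses)) / eps
    w -= 0.05 * meta_grad

final_losses = np.array([post_adapt_loss(w, a) for a in slopes])
```

With uniform weights the meta-initialization drifts toward the average task; loss-proportional weights pull it toward the tasks it currently serves worst, which is the performance-inequality idea in miniature.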
35. A survey on vulnerability of federated learning: A learning algorithm perspective.
- Author
-
Xie, Xianghua, Hu, Chen, Ren, Hanchi, and Deng, Jingjing
- Subjects
- *
FEDERATED learning , *MACHINE learning , *DEEP learning , *LITERATURE reviews , *BIBLIOGRAPHY - Abstract
Federated Learning (FL) has emerged as a powerful paradigm for training Machine Learning (ML), and particularly Deep Learning (DL), models on multiple devices or servers while keeping data localized at owners' sites. Without centralizing data, FL holds promise for scenarios where data integrity, privacy, and security are critical. However, this decentralized training process also opens up new avenues for adversaries to launch unique attacks, making it urgent to understand the vulnerabilities and corresponding defense mechanisms from a learning algorithm perspective. This review paper takes a comprehensive look at malicious attacks against FL, categorizing them from new perspectives on attack origins and targets, and providing insights into their methodology and impact. We focus on threat models targeting the learning process of FL systems. Based on the source and target of the attack, we categorize existing threat models into four types: Data to Model (D2M), Model to Data (M2D), Model to Model (M2M), and composite attacks. For each attack type, we discuss the defense strategies proposed, highlighting their effectiveness, assumptions, and potential areas for improvement. Defense strategies have evolved from using a singular metric to exclude malicious clients towards multifaceted approaches that examine client models at various phases. Our research indicates that the to-learn data, the learning gradients, and the learned model at different stages can all be manipulated to initiate malicious attacks that range from undermining model performance, to reconstructing private local data, to inserting backdoors. These threats are also becoming more insidious: while earlier studies typically amplified malicious gradients, recent endeavors subtly alter the least significant weights in local models to bypass defense measures. 
This literature review provides a holistic understanding of the current FL threat landscape and highlights the importance of developing robust, efficient, and privacy-preserving defenses to ensure the safe and trusted adoption of FL in real-world applications. The categorized bibliography can be found at: https://github.com/Rand2AI/Awesome-Vulnerability-of-Federated-Learning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. A survey of the recent trends in deep learning for literature based discovery in the biomedical domain.
- Author
-
Cesario, Eugenio, Comito, Carmela, and Zumpano, Ester
- Subjects
- *
DEEP learning , *NATURAL language processing , *LANGUAGE models , *SCIENTIFIC literature , *SCIENTIFIC knowledge , *MACHINE learning - Abstract
Every day, enormous amounts of biomedical texts discussing various biomedical topics are produced. Revealing strong semantic connections hidden in those unstructured data is essential for many interesting applications, such as knowledge base development for the biomedical domain as well as drug repurposing and drug–disease associations. Literature based discovery (LBD) is a well-known paradigm that refers to the problem of finding new hidden knowledge in scientific literature by connecting pieces of semantically related information belonging to independent documents. This challenging research area has been extensively investigated by the research community, and different proposals adopting natural language processing, text mining, machine learning and, recently, deep learning have been developed. This paper addresses a very focused task: it surveys a collection of research papers published in recent years that have adopted deep learning for literature based discovery as an effective technique to discover new relationships between existing knowledge in the biomedical domain. The study provides an analysis of the key characteristics of each work surveyed, including the LBD application area, the deep learning method used, the type of data analyzed, and the results obtained. Recognizing the significance of Pre-trained Language Models (PLMs), another primary aim of this paper is to offer an extensive overview of the latest developments in pre-trained language models within the field of biomedicine, focusing primarily on how they are applied to downstream tasks associated with literature-based discovery in the biomedical domain. Additionally, the survey highlights the key drawbacks of the current state-of-the-art proposals, as well as the challenges that require further study by the research community. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
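The LBD paradigm surveyed above descends from Swanson's classic ABC model: if term A co-occurs with B and B with C in separate documents, but A and C never co-occur directly, then A–C is a candidate hidden relationship. A minimal co-occurrence sketch (the term sets below are illustrative toy data, not from any surveyed system):

```python
from collections import defaultdict

def abc_candidates(documents):
    """Swanson-style ABC discovery: return term pairs (A, C) that never
    co-occur directly but share at least one intermediate term B."""
    cooc = defaultdict(set)
    for doc in documents:
        for term in doc:
            cooc[term].update(t for t in doc if t != term)
    candidates = set()
    for a in cooc:
        for b in cooc[a]:
            for c in cooc[b]:
                if c != a and c not in cooc[a]:
                    candidates.add(tuple(sorted((a, c))))
    return candidates

# Toy corpus echoing Swanson's fish-oil/Raynaud example
docs = [{"fish oil", "blood viscosity"},
        {"blood viscosity", "Raynaud's disease"}]
links = abc_candidates(docs)
```

Modern systems covered by the survey replace raw co-occurrence with extracted semantic relations or learned (PLM-based) representations, but the underlying open-discovery pattern is the same.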
37. Adaptive Regularized Warped Gradient Descent Enhances Model Generalization and Meta-learning for Few-shot Learning.
- Author
-
Rao, Shuzhen, Huang, Jun, and Tang, Zengming
- Subjects
- *
MACHINE learning - Abstract
Warped Gradient Descent (WarpGrad) is a remarkable meta-learning method for gradient transformation by inserting warp-layers. However, the task-shared initialization provided by WarpGrad is difficult to adapt to each task. Moreover, transforming gradients with meta-learned warp-layers ignores local geometric features and task-specific knowledge, and may incur a significant risk of overfitting caused by the increased number of parameters. In this paper, we propose ARWarpGrad to guarantee better generalization performance with faster convergence by modeling both cross-task and task-specific knowledge. We introduce Initialization Modulation (IM) to meta-learn a task-specific initialization for the task-learner. Furthermore, Mixed Gradient Preprocessing (MGP), which includes Adaptive Learning Rates (ALR) and Gaussian Momentum Dropout (GMD), is put forward to provide a better adaptive optimization direction and step length for task adaptation based on local geometric features. In addition, Memory Regularization (MR) is provided to effectively alleviate the overfitting problem through the use of parameter memory. Ultimately, extensive experiments on three settings demonstrate that ARWarpGrad achieves state-of-the-art performance while accelerating convergence and preventing overfitting. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
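The abstract gives no formulas for MGP/ALR, but the general idea of adaptive, per-parameter gradient preprocessing can be sketched as follows. This is a toy RMSProp-style scheme, purely illustrative of "adaptive learning rates" and not the authors' method:

```python
def preprocess_gradient(grad, state, base_lr=0.1, beta=0.9, eps=1e-8):
    """Toy per-parameter adaptive step: scale each gradient component by a
    running estimate of its own magnitude, so the step length adapts to
    the local geometry of each coordinate."""
    steps = []
    for i, g in enumerate(grad):
        v = beta * state.get(i, 0.0) + (1 - beta) * g * g  # running 2nd moment
        state[i] = v
        steps.append(base_lr * g / ((v ** 0.5) + eps))
    return steps, state

# On the first call the normalised step magnitude is (nearly) scale-free:
steps, state = preprocess_gradient([1.0, -2.0], {})
```

A meta-learner in the WarpGrad family would learn such a transformation end-to-end rather than fix it by hand.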
38. LLP-AAE: Learning from label proportions with adversarial autoencoder.
- Author
-
Wang, Bo, Sun, Yingte, and Tong, Qiang
- Subjects
- *
MACHINE learning , *SUPERVISED learning , *DEEP learning - Abstract
This paper presents an effective weakly supervised learning algorithm, LLP-AAE, that leverages the adversarial autoencoder (AAE) for learning from label proportions (LLP), in which only bag-level proportional information is available. Our LLP-AAE utilizes an autoencoder backbone and performs adversarial training in the latent space to match the aggregated posterior distribution of the hidden coding with the prior distributions. In this way, apart from the reconstruction task, the encoder is also dedicated to producing fake samples in order to deceive the discriminators as much as possible. Ultimately, the encoder is employed as a competent label predictor for unseen data. In addition to the LLP classifier, our model can also achieve controllable sample generation by feeding the decoder with gradually changing latent codes, which proves useful for better LLP performance. We also provide a panoramic explanation of LLP-AAE by regarding the LLP problem as an alternating learning procedure between proportion-based pseudo-label generation and discriminative reconstruction. Experiments on six benchmark image datasets demonstrate the advantages of our method, both in style manipulation with the latent feature representation and in multi-class LLP performance comparable to state-of-the-art models. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
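In LLP, only bag-level proportions supervise training. A common way to exploit them, and one plausible ingredient of the proportion-based pseudo-label view described above, is a bag-level cross-entropy between the known proportions and the mean predicted class distribution. A generic sketch, not the LLP-AAE objective itself:

```python
import math

def proportion_loss(pred_probs, bag_proportions):
    """Bag-level cross-entropy for LLP: compare the mean predicted class
    distribution of a bag against the bag's known class proportions."""
    n_classes = len(bag_proportions)
    mean_pred = [sum(p[c] for p in pred_probs) / len(pred_probs)
                 for c in range(n_classes)]
    return -sum(bag_proportions[c] * math.log(mean_pred[c] + 1e-12)
                for c in range(n_classes))

# A bag whose mean prediction matches the true proportions exactly:
# the loss then equals the entropy of the proportions (its minimum).
preds = [[0.8, 0.2], [0.4, 0.6]]   # two instances, two classes
loss = proportion_loss(preds, [0.6, 0.4])
```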
39. Land use and land cover classification with hyperspectral data: A comprehensive review of methods, challenges and future directions.
- Author
-
Moharram, Mohammed Abdulmajeed and Sundaram, Divya Meena
- Subjects
- *
ZONING , *GENERATIVE adversarial networks , *CONVOLUTIONAL neural networks , *LAND use , *RECURRENT neural networks , *DROUGHT management - Abstract
Recently, many efforts have been concentrated on land use land cover (LULC) classification due to rapid urbanization, environmental pollution, agricultural drought, frequent floods, and climate change. Hyperspectral imaging has attracted attention for this task because it provides informative, discriminative features, such as spectral–spatial features. To this end, this paper presents a comprehensive and systematic review of LULC classification using hyperspectral images, organized around four significant research investigations, which address the following points: (1) the main components of hyperspectral imaging, the modes of hyperspectral imaging with data acquisition methods, and the intrinsic differences between hyperspectral and multispectral images; (2) the role of machine learning in LULC classification, and the standard deep learning methods: Convolutional Neural Network (CNN), Stacked Autoencoder (SAE), Deep Belief Network (DBN), Recurrent Neural Network (RNN), and Generative Adversarial Network (GAN); (3) the standard benchmark hyperspectral datasets and the evaluation criteria; (4) the main challenges of LULC classification with possible solutions for the limited-training-samples issue, promising future directions, and finally recent applications of LULC classification. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
40. Semi-supervised multiple evidence fusion for brain tumor segmentation.
- Author
-
Huang, Ling, Ruan, Su, and Denœux, Thierry
- Subjects
- *
SUPERVISED learning , *DEEP learning , *BRAIN tumors , *MACHINE learning , *DEMPSTER-Shafer theory - Abstract
The performance of deep learning-based methods depends mainly on the availability of large-scale labeled learning data. However, obtaining precisely annotated examples is challenging in the medical domain. Although some semi-supervised deep learning methods have been proposed to train models with fewer labels, only a few studies have focused on the uncertainty caused by the low quality of the images and the lack of annotations. This paper addresses the above issues using Dempster-Shafer theory and deep learning: 1) a semi-supervised learning algorithm is proposed based on an image transformation strategy; 2) a probabilistic deep neural network and an evidential neural network are used in parallel to provide two sources of segmentation evidence; 3) Dempster's rule is used to combine the two pieces of evidence and reach a final segmentation result. Results from a series of experiments on the BraTS2019 brain tumor dataset show that our framework achieves promising results when only some training data are labeled. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
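Dempster's rule, used here to fuse the probabilistic and evidential segmentation outputs, combines two mass functions by multiplying masses on intersecting focal sets and renormalizing away the conflict mass. A small sketch with made-up tumor/background masses (the numbers are illustrative, not from the paper):

```python
def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions over the same
    frame of discernment. Focal sets are frozensets; mass landing on empty
    intersections (conflict) is renormalised away."""
    combined = {}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb
    k = 1.0 - conflict
    return {s: v / k for s, v in combined.items()}

T, B = frozenset({"tumor"}), frozenset({"background"})
m_prob = {T: 0.7, B: 0.3}          # probabilistic branch (Bayesian masses)
m_evid = {T: 0.6, T | B: 0.4}      # evidential branch, with some ignorance
fused = dempster_combine(m_prob, m_evid)
```

Mass assigned to the whole frame (T ∪ B) encodes ignorance, which is how the evidential branch can express uncertainty from low-quality images or missing annotations.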
41. The coming of age of interpretable and explainable machine learning models.
- Author
-
Lisboa, P.J.G., Saralajew, S., Vellido, A., Fernández-Domenech, R., and Villmann, T.
- Subjects
- *
MACHINE learning , *COMING of age , *DATA analysis , *UNIVERSITY research , *JUSTICE administration - Abstract
Machine-learning-based systems are now part of a wide array of real-world applications seamlessly embedded in the social realm. In the wake of this realization, strict legal regulations for these systems are currently being developed, addressing some of the risks they may pose. This is the coming of age of the concepts of interpretability and explainability in machine-learning-based data analysis, which can no longer be seen just as an academic research problem. In this paper, we discuss explainable and interpretable machine learning as post hoc and ante-hoc strategies to address regulatory restrictions and highlight several aspects related to them, including their evaluation and assessment and the legal boundaries of application. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
42. Distributional reinforcement learning with unconstrained monotonic neural networks.
- Author
-
Théate, Thibaut, Wehenkel, Antoine, Bolland, Adrien, Louppe, Gilles, and Ernst, Damien
- Subjects
- *
REINFORCEMENT learning , *DISTRIBUTION (Probability theory) , *MONOTONIC functions , *CONTINUOUS functions , *ARTIFICIAL intelligence , *MUSCLE weakness - Abstract
• Novel distributional RL algorithm based on unconstrained monotonic neural networks. • Monotonicity ensures the validity of the random return probability distribution. • Methodology to learn different representations of the random return distribution. • Empirical comparison of the probability metrics commonly used in distributional RL. • Critical approximation highlighted for the extensively used Wasserstein distance. The distributional reinforcement learning (RL) approach advocates for representing the complete probability distribution of the random return instead of only modelling its expectation. A distributional RL algorithm may be characterised by two main components, namely the representation of the distribution together with its parameterisation and the probability metric defining the loss. The present research work considers the unconstrained monotonic neural network (UMNN) architecture, a universal approximator of continuous monotonic functions which is particularly well suited for modelling different representations of a distribution. This property enables the efficient decoupling of the effect of the function approximator class from that of the probability metric. The research paper firstly introduces a methodology for learning different representations of the random return distribution (PDF, CDF and QF). Secondly, a novel distributional RL algorithm named unconstrained monotonic deep Q-network (UMDQN) is presented. To the authors' knowledge, it is the first distributional RL method supporting the learning of three valid and continuous representations of the random return distribution. Lastly, in light of this new algorithm, an empirical comparison is performed between three probability quasi-metrics, namely the Kullback–Leibler divergence, Cramer distance, and Wasserstein distance. The results highlight the main strengths and weaknesses associated with each probability metric together with an important limitation of the Wasserstein distance.
[ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
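Of the probability metrics compared in the paper, the Cramér distance has a particularly simple empirical form: the L2 distance between the two CDFs. A sketch on sample-based return distributions (the samples and evaluation grid are illustrative, and this discretized estimator is not the paper's loss):

```python
def cramer_distance(samples_p, samples_q, grid):
    """Empirical Cramér distance between two sample-based distributions:
    the L2 distance between their empirical CDFs, approximated by a
    Riemann sum on a fixed evaluation grid."""
    def cdf(samples, x):
        return sum(s <= x for s in samples) / len(samples)
    dx = grid[1] - grid[0]
    return (sum((cdf(samples_p, x) - cdf(samples_q, x)) ** 2
                for x in grid) * dx) ** 0.5

p = [0.0, 0.0, 1.0]                    # returns mostly 0
q = [0.0, 1.0, 1.0]                    # returns mostly 1
grid = [i * 0.01 for i in range(200)]  # covers [0, 2)
d_same = cramer_distance(p, p, grid)
d_diff = cramer_distance(p, q, grid)
```

The CDF is exactly the representation whose monotonicity a UMNN can guarantee by construction, which is what makes the learned distribution valid.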
43. Introducing multi-dimensional hierarchical classification: Characterization, solving strategies and performance measures.
- Author
-
Montenegro, C., Santana, R., and Lozano, J.A.
- Subjects
- *
HIERARCHICAL Bayes model , *LEARNING strategies , *CLASSIFICATION - Abstract
Classification problems where there exist multiple class variables that need to be jointly predicted are known as Multi-dimensional classification problems. If the labels of these class variables are organized as hierarchies, we can take advantage of specific strategies designed for the Hierarchical classification paradigm. In this paper we present the Multi-dimensional hierarchical classification (MDHC) paradigm, a result of the combination of Multi-dimensional and Hierarchical classification paradigms. We propose four MDHC learning strategies which are designed to exploit the particularities of this new paradigm, combining characteristics of Multi-dimensional and Hierarchical classification strategies. Along with these strategies, we present a framework for classifier comparison in which we use a set of performance measures specifically designed for MDHC, and a procedure to create MDHC synthetic scenarios. Using this framework and the performance measures presented, we study how characteristics of the MDHC problems influence the performance of the different MDHC strategies proposed, and compare them to other non-MDHC strategies. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
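One particularity MDHC inherits from hierarchical classification is the consistency constraint: a label's score should never exceed its ancestors', since predicting a child implies predicting its parent. A minimal top-down enforcement sketch (the taxonomy and scores are invented for illustration; the paper's strategies operate per class variable):

```python
def node_depth(n, parent):
    """Depth of node n in the hierarchy given a child -> parent map."""
    d = 0
    while parent.get(n) is not None:
        n = parent[n]
        d += 1
    return d

def enforce_hierarchy(scores, parent):
    """Top-down consistency: cap each node's score by its parent's score,
    processing parents before children."""
    capped = dict(scores)
    for n in sorted(scores, key=lambda n: node_depth(n, parent)):
        p = parent.get(n)
        if p is not None:
            capped[n] = min(capped[n], capped[p])
    return capped

parent = {"root": None, "animal": "root", "dog": "animal", "cat": "animal"}
scores = {"root": 1.0, "animal": 0.4, "dog": 0.9, "cat": 0.1}
consistent = enforce_hierarchy(scores, parent)
```

In the multi-dimensional setting, such a constraint would apply within each class hierarchy while the joint strategy models dependencies across the class variables.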
44. An active memristor based rate-coded spiking neural network.
- Author
-
Amin Fida, Aabid, Khanday, Farooq A., and Mittal, Sparsh
- Subjects
- *
BIOLOGICAL neural networks , *MACHINE learning , *MEMRISTORS , *BOOLEAN functions , *ONLINE education - Abstract
• Physical behaviors of memristive systems can be related to the bio-physical dynamics of biological neural elements. • Spiking behaviors subject to input stimuli of LIF neurons made of memristive elements can be extrapolated to develop on-chip learning algorithms. • Rate coding is a viable alternative to temporal or population coding for in-hardware SNNs. • It is possible to perform non-linear functions like XOR using a single neuron in SNNs. • A hybrid approach relying on ANN-like gradient calculation can be used to learn in SNNs. Neuromorphic computing is a novel computing paradigm that aims to mimic the behavior of biological neural networks for efficiently solving complex problems. While CMOS-based neurons and synapses have been developed, they are limited in their ability to demonstrate bio-realistic dynamics. This, coupled with the fact that a huge number of these individual devices are required to build neurons and synapses, limits the scaling and power efficiency of such systems. A viable answer to this problem is neuromemristive systems that are based on memristor devices. These devices exhibit physical behaviors that can be related to the bio-physical dynamics of synapses and neurons. In this paper, a rate-coded, all-memristive spiking neural network (SNN) is presented. The proposed SNN is built with an active memristor neuron based on vanadium dioxide (VO2) coupled with a non-volatile memristor synapse. The results are validated by first simulating spiking versions of two Boolean functions, viz. AND and XOR gates, in SPICE. With features extracted from the small neural nets, a large-scale 3-layer spiking neural network is then simulated in Python, which yields a validation accuracy of 87% on the MNIST dataset of handwritten digits. One of the prime features of this work is the realization of the XOR function using a single neuron, which is not possible without the use of two layers of neurons in traditional neural networks.
Another significant contribution is the utilization of a gradient-based learning approach for online training of a large-scale SNN. For this, we use the inherent activation function (Sigmoid/ReLU) of the proposed neuron design. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
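The rate coding the paper relies on maps a stronger input current to a higher spike count over a time window. A discrete-time leaky integrate-and-fire (LIF) sketch shows the principle; this is a software toy illustrating rate coding, not a model of the VO2 device dynamics:

```python
def lif_spike_count(input_current, steps, v_th=1.0, leak=0.9):
    """Discrete-time leaky integrate-and-fire neuron: each step, leak the
    membrane potential, add the input current, and emit a spike (resetting
    the potential) when the threshold is crossed. The spike count over the
    window is the neuron's rate code."""
    v, spikes = 0.0, 0
    for _ in range(steps):
        v = leak * v + input_current
        if v >= v_th:
            spikes += 1
            v = 0.0
    return spikes

low = lif_spike_count(0.2, 100)    # weak input -> low firing rate
high = lif_spike_count(0.6, 100)   # strong input -> high firing rate
```

The monotone current-to-rate mapping is what lets an ANN-style gradient method treat firing rates like continuous activations for training.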
45. Graph learning for latent-variable Gaussian graphical models under laplacian constraints.
- Author
-
Li, Ran, Lin, Jiming, Qiu, Hongbing, Zhang, Wenhui, and Wang, Junyi
- Subjects
- *
GRAPH theory , *LAPLACIAN matrices , *SPECTRAL theory , *LATENT variables , *SPARSE graphs , *SIGNAL processing , *MACHINE learning - Abstract
• The problem of graph Laplacian estimation with latent variables is formulated. • A multi-block ADMM algorithm is proposed to solve the problem. • The proposed method can estimate the conditional correlation of observed variables. In recent years, graph learning for smooth signals under Laplacian constraints has attracted increasing attention due to the wide application of the graph Laplacian matrix in spectral graph theory, machine learning, and graph signal processing tasks. Standard graph learning methods usually assume that graphs are sparse, but the correlations between real-world entities are not always sparse because of common latent effects. In this paper, we model these common effects as latent variables and assume that the Gaussian graphical model (GGM) under Laplacian constraints is conditionally sparse given the latent variables but marginally non-sparse. Based on this assumption, the graph learning problem is formulated in a regularized maximum marginal likelihood (MML) framework with a sparse-plus-low-rank decomposition form. A specialized algorithm is developed to solve the proposed graph learning problem by incorporating the Laplacian constraints into a multi-block alternating direction method of multipliers (ADMM) with proximal regularization terms. Experiments conducted on synthetic and real-world data sets demonstrate that the proposed graph learning method outperforms the standard method in inferring the sparsity pattern of the conditional graphical model of observed variables in the presence of latent variables. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
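The Laplacian constraints referred to above restrict the estimated precision matrix to the set of combinatorial graph Laplacians L = D − W: symmetric, zero row sums, and non-positive off-diagonal entries. A small constructor makes the constraint set concrete (the edge weights are illustrative; the paper estimates them from data):

```python
def laplacian(weights, n):
    """Build the combinatorial graph Laplacian L = D - W from a dict of
    non-negative edge weights {(i, j): w}. The result is symmetric, has
    zero row sums, and non-positive off-diagonals, i.e. it satisfies the
    constraints imposed on the GGM precision matrix."""
    L = [[0.0] * n for _ in range(n)]
    for (i, j), w in weights.items():
        L[i][j] -= w
        L[j][i] -= w
        L[i][i] += w
        L[j][j] += w
    return L

# A 3-node chain graph: 0 --2.0-- 1 --1.0-- 2
L = laplacian({(0, 1): 2.0, (1, 2): 1.0}, 3)
```

In the latent-variable model, this conditionally sparse Laplacian is only one term of a sparse-plus-low-rank decomposition, with the low-rank part absorbing the shared latent effects.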
46. An efficient multi-metric learning method by partitioning the metric space.
- Author
-
Yuan, Chao and Yang, Liming
- Subjects
- *
SUPERVISED learning , *MACHINE learning , *LOGARITHMIC functions , *PATTERN recognition systems , *LEARNING - Abstract
Metric learning has attracted significant attention due to its high effectiveness and efficiency for pattern recognition tasks. Traditional supervised metric learning algorithms attempt to seek a global distance metric from labeled samples. When data are multimodal and only limited supervision information is available, these approaches are insufficient to obtain satisfactory results. In this paper, we develop a robust semi-supervised multi-metric learning method (RSMM) to improve classification performance. The proposed RSMM learns multiple local metrics and a background metric instead of a single global metric. Specifically, we divide the metric space into influential regions and a background region, and then restrict the effect of each local metric to its related regions. Simultaneously, a geometrically interpretable, symmetric distance is defined with the local metrics and the background metric. Based on the resulting learning bounds, we obtain a regularization term to improve the classifier's generalization ability. Moreover, a manifold regularization term is introduced to preserve the supervision information as well as the geometric structure. The abundance of unlabeled samples may introduce large uncertainties, so a logarithmic loss function is utilized to enhance robustness. An efficient gradient descent algorithm is exploited to solve this challenging non-convex problem. To further understand the proposed algorithm, we theoretically derive its robustness and generalization error bounds. Finally, numerical experiments on UCI datasets and image datasets demonstrate the feasibility and validity of RSMM. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
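The region-partitioned distance described above can be sketched in miniature: use a local Mahalanobis-style metric when both points fall in the same influential region, and a shared background metric otherwise. The regions and metric matrices below are hand-set for illustration, whereas RSMM learns them; this is not the paper's symmetric distance:

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis-style distance (x - y)^T M (x - y) for 2-D points."""
    d = [x[0] - y[0], x[1] - y[1]]
    return (d[0] * (M[0][0] * d[0] + M[0][1] * d[1])
            + d[1] * (M[1][0] * d[0] + M[1][1] * d[1]))

def multi_metric_dist(x, y, regions, background_M):
    """Use a local metric when both points lie in the same influential
    region; otherwise fall back to the shared background metric."""
    for contains, M in regions:
        if contains(x) and contains(y):
            return mahalanobis_sq(x, y, M)
    return mahalanobis_sq(x, y, background_M)

identity = [[1.0, 0.0], [0.0, 1.0]]
stretch = [[4.0, 0.0], [0.0, 1.0]]   # local metric: x-axis differences count more
regions = [(lambda p: p[0] >= 0 and p[1] >= 0, stretch)]
d_local = multi_metric_dist((1.0, 1.0), (2.0, 1.0), regions, identity)
d_back = multi_metric_dist((-1.0, 0.0), (-2.0, 0.0), regions, identity)
```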
47. Towards an ML-based semantic IoT for pandemic management: A survey of enabling technologies for COVID-19.
- Author
-
Zgheib, Rita, Chahbandarian, Ghazar, Kamalov, Firuz, Messiry, Haythem El, and Al-Gindy, Ahmed
- Subjects
- *
COVID-19 pandemic , *MACHINE learning , *PANDEMICS , *COVID-19 , *ARTIFICIAL intelligence , *DIGITAL technology - Abstract
The connection between humans and digital technologies has been documented extensively in the past decades but needs to be re-evaluated in light of the current global pandemic. Artificial Intelligence (AI), with its two strands, Machine Learning (ML) and Semantic Reasoning, has proven to be a great solution for providing efficient ways to prevent, diagnose and limit the spread of COVID-19. IoT solutions have been widely proposed for COVID-19 disease monitoring, infection geolocation, and social applications. In this paper, we investigate the usage of these three technologies for handling the COVID-19 pandemic. For this purpose, we survey the existing ML applications and algorithms proposed during the pandemic to detect COVID-19 disease using symptom factors and image processing. The survey also covers existing semantic technologies and IoT systems for COVID-19. Based on the survey results, we classify the main challenges and the solutions that could address them. The study proposes a conceptual framework for pandemic management and discusses challenges and trends for future research. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. Local Differential Privacy for data collection and analysis.
- Author
-
Wang, Teng, Zhao, Jun, Hu, Zhi, Yang, Xinyu, Ren, Xuebin, and Lam, Kwok-Yan
- Subjects
- *
ACQUISITION of data , *DATA analysis , *PRIVACY , *SIMPLE machines , *MACHINE learning , *FALSE discovery rate - Abstract
Local Differential Privacy (LDP) can provide each user with strong privacy guarantees under untrusted data curators while ensuring accurate statistics derived from the privatized data. Owing to this strength, LDP has been widely adopted to protect privacy in various tasks (e.g., heavy-hitter discovery, probability estimation) and systems (e.g., Google Chrome, Apple iOS). In particular, (∊, δ)-LDP has been studied in related statistical tasks like private learning and hypothesis testing, but is mainly achieved using the Gaussian mechanism, leading to limited data utility. In this paper, we investigate several novel mechanisms that achieve (∊, δ)-LDP with higher data utility when collecting and analyzing users' data. Specifically, we first design two (∊, δ)-LDP algorithms for mean estimation on multi-dimensional numeric data, which ensure higher accuracy than the optimal Gaussian mechanism. Then, we investigate different local protocols for frequency estimation on categorical attributes under (∊, δ)-LDP. Based on the proposed mechanisms, we further study (∊, δ)-LDP-compliant stochastic gradient descent algorithms for machine learning models. In addition, theoretical analyses of the error bounds and the variances of the proposed algorithms are presented. We have conducted extensive experiments on both real-world and synthetic datasets and demonstrated the high data utility of our proposed algorithms on both simple data statistics tasks and complex machine learning tasks. The experimental results show that our proposed algorithms can effectively improve data utility in different tasks while alleviating the privacy concerns of each individual. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
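The Gaussian-mechanism baseline that the proposed algorithms aim to beat adds N(0, σ²) noise with σ = sqrt(2 ln(1.25/δ)) · Δ / ∊, where Δ is the query's L2 sensitivity. A sketch of local-model mean estimation with this baseline (the parameters and data are illustrative, not the paper's improved mechanisms):

```python
import math
import random

def gaussian_mechanism(value, eps, delta, sensitivity=1.0, rng=random):
    """Classic Gaussian mechanism: add N(0, sigma^2) noise calibrated so a
    query with the given L2 sensitivity satisfies (eps, delta)-DP."""
    sigma = math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / eps
    return value + rng.gauss(0.0, sigma)

def private_mean(values, eps, delta, rng=random):
    """Local-model mean estimate: each user perturbs their own value in
    [0, 1] before reporting; the untrusted curator only averages the
    noisy reports."""
    return sum(gaussian_mechanism(v, eps, delta, 1.0, rng)
               for v in values) / len(values)

rng = random.Random(0)  # fixed seed for reproducibility
est = private_mean([0.5] * 10000, eps=1.0, delta=1e-5, rng=rng)
```

Averaging over many users cancels most of the per-report noise, which is why local mechanisms remain useful despite the heavy per-user perturbation.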
49. Bidirectional heuristic search to find the optimal Bayesian network structure.
- Author
-
Tan, Xiangyuan, Gao, Xiaoguang, Wang, Zidong, and He, Chuchao
- Subjects
- *
HEURISTIC , *SEARCH algorithms , *TABU search algorithm , *MACHINE learning , *HEURISTIC algorithms - Abstract
• A bidirectional search algorithm for learning Bayesian network structure is proposed. • The proposed algorithm learns the optimal structure of a Bayesian network. • Heuristic functions are admissible and consistent for forward and backward search. • These functions ensure that the algorithm converges to the optimal solution. Bayesian networks have many applications, and learning the optimal structure of a Bayesian network has always been important in this respect. In this paper, a bidirectional heuristic search algorithm is proposed for the order graph search space commonly used in Bayesian network structure learning. At the same time, heuristic functions that are admissible and consistent for both forward and backward search are proposed to ensure convergence of the algorithm to the optimal solution. The experimental results show that, compared with traditional unidirectional heuristic search, in most cases the bidirectional heuristic search proposed in this paper expands fewer states, converges more efficiently, and requires less running time. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
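The abstract's point that admissible, consistent heuristics preserve optimality while pruning work is the standard A* guarantee. A generic grid illustration (not the paper's order-graph or bidirectional algorithm) compares A* under a Manhattan heuristic against the same search with h ≡ 0, i.e. Dijkstra:

```python
import heapq

def astar(neighbors, start, goal, h):
    """A* search; returns (cost, states_expanded). With h = 0 this is
    Dijkstra. A consistent h keeps the result optimal but expands fewer
    states. Ties on f are broken toward smaller h (closer to the goal)."""
    dist = {start: 0}
    pq = [(h(start), h(start), start)]
    closed = set()
    expanded = 0
    while pq:
        _, _, u = heapq.heappop(pq)
        if u in closed:
            continue
        closed.add(u)
        expanded += 1
        if u == goal:
            return dist[u], expanded
        for v, w in neighbors(u):
            nd = dist[u] + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd + h(v), h(v), v))
    return None, expanded

N = 9  # 9x9 grid of states, unit step costs
def grid_neighbors(u):
    x, y = u
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < N and 0 <= y + dy < N:
            yield (x + dx, y + dy), 1

goal = (N - 1, N - 1)
manhattan = lambda u: abs(u[0] - goal[0]) + abs(u[1] - goal[1])
cost_h, exp_h = astar(grid_neighbors, (0, 0), goal, manhattan)
cost_0, exp_0 = astar(grid_neighbors, (0, 0), goal, lambda u: 0)
```

Both runs find the same optimal cost; the informed run expands far fewer states, which is the effect the paper amplifies by searching from both ends at once.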
50. A novel method for time series prediction based on error decomposition and nonlinear combination of forecasters.
- Author
-
Chen, Wei, Xu, Huilin, Chen, Zhensong, and Jiang, Manrui
- Subjects
- *
TIME series analysis , *FUTUROLOGISTS , *BOX-Jenkins forecasting , *GENERATING functions - Abstract
For time series prediction, hybrid systems that combine linear and nonlinear models can provide more accurate performance than a single model. However, the irregularity of the error series and the unknown nature of combinations of different forecasters may strongly impact the performance of hybrid systems. Therefore, in this paper, we propose a novel method for time series prediction, in which error decomposition and a nonlinear combination of forecasters are introduced. The proposed method performs the following: (i) linear modeling to obtain the error series, (ii) error decomposition by using variational mode decomposition (VMD), (iii) nonlinear modeling and a phase fix procedure for the error subseries, and (iv) a combination of forecasters through an appropriate combination function generated by a nonlinear model. By using the proposed method, this paper constructs two hybrid systems, in which the autoregressive integrated moving average (ARIMA) is used for linear modeling, and two artificial intelligence (AI) models, namely, the multilayer perceptron (MLP) and support vector regression (SVR), are used for nonlinear modeling and combination, respectively. Finally, four time series data sets, six evaluation metrics, two single models and thirteen hybrid systems are used to assess the effectiveness of the proposed method. The empirical results show that hybrid systems based on error decomposition and a nonlinear combination of forecasters can achieve better performance than some existing systems and models. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
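The four-step recipe above can be miniaturized: fit a linear model, extract the error series, model the errors, and combine. In the toy sketch below the ARIMA stage becomes an AR(1) least-squares fit, and the VMD + MLP/SVR error stage becomes naive last-residual persistence, purely to show the plumbing of a hybrid system (series and stand-ins are illustrative):

```python
def fit_ar1(series):
    """Least-squares fit of y_t = a * y_{t-1} + b (toy stand-in for ARIMA)."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def hybrid_forecast(series):
    """(i) linear model, (ii) error series, (iii) error model (here the
    last residual persists, a placeholder for VMD + MLP/SVR), (iv) combine."""
    a, b = fit_ar1(series)
    residuals = [y - (a * x + b) for x, y in zip(series[:-1], series[1:])]
    return a * series[-1] + b + residuals[-1]

series = [1.0, 2.1, 2.9, 4.2, 5.0, 6.1, 6.9, 8.0]  # roughly +1 per step
pred = hybrid_forecast(series)
```

In the paper's full systems, step (iii) additionally decomposes the error series with VMD and applies a phase fix before the nonlinear combination.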