4,827 results for "Stochastic gradient descent"
Search Results
2. Detecting Implicit Aspects of Customer Experience in the Hotel Industry Using a Machine Learning Algorithm
- Author
-
Jayanthi, S., Arumugam, S. S., Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Geetha, R., editor, Dao, Nhu-Ngoc, editor, and Khalid, Saeed, editor
- Published
- 2025
3. Label synchronization for Hybrid Federated Learning in manufacturing and predictive maintenance.
- Author
-
Llasag Rosero, Raúl, Silva, Catarina, Ribeiro, Bernardete, and Santos, Bruno F.
- Subjects
ARTIFICIAL neural networks, FEDERATED learning, REMAINING useful life, MACHINE learning, ARTIFICIAL intelligence
- Abstract
Artificial Intelligence (AI) is transforming the future of industries by introducing new paradigms. To address data privacy and other challenges of decentralization, research has focused on Federated Learning (FL), which combines distributed Machine Learning (ML) models from multiple parties without exchanging confidential information. However, conventional FL methods struggle to handle situations where data samples have diverse features and sizes. We propose a Hybrid Federated Learning solution with label synchronization to overcome this challenge. Our FedLabSync algorithm trains a feed-forward Artificial Neural Network and, by conducting a penalized collaborative optimization, can also aggregate knowledge from other ML architectures compatible with the Stochastic Gradient Descent algorithm. We conducted two industrial case studies: product inspection in Bosch factories and aircraft component Remaining Useful Life predictions. Our experiments on decentralized data scenarios demonstrate that FedLabSync can produce a global AI model that achieves results on par with those of centralized learning methods. [ABSTRACT FROM AUTHOR]
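To make the aggregation idea concrete, here is a minimal FedAvg-style sketch in Python; the function name, synthetic weights, and plain sample-weighted averaging are illustrative assumptions, and FedLabSync itself adds label synchronization and a penalized objective on top of such a step:

```python
import numpy as np

def fedavg(local_weights, n_samples):
    # Sample-weighted average of locally SGD-trained parameter vectors:
    # the generic aggregation step that federated methods build on.
    total = sum(n_samples)
    return sum((n / total) * w for w, n in zip(local_weights, n_samples))

# Three sites with different data sizes contribute their local models:
print(fedavg([np.ones(4), 2 * np.ones(4), 3 * np.ones(4)], [100, 50, 50]))
```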
- Published
- 2024
4. A Large-Scale Graph Layout Algorithm Based on Multi-Level Stochastic Gradient Descent.
- Author
-
周颖鑫, 李学俊, 吴亚东, 张红英, 王娇, 张秋梅, and 王桂娟
- Abstract
Large-scale graph layout remains a prominent focus in graph visualization research. While the stress model excels at representing global structure, its speed lags behind the spring-electric model, and its local structure quality is suboptimal. This paper proposes a graph layout algorithm aimed at enhancing layout efficiency while preserving global structure and improving local quality. The model first uses graph compression based on neighbor structure to generate a hierarchical graph structure, and then uses a node optimal placement algorithm to initialize the node coordinates. Next, it improves the local quality of the layout using an SGD layout algorithm based on positive and negative samples, and further increases layout speed through a multi-level algorithm. Finally, comparative experiments with existing layout models on 30 datasets of different scales demonstrate the effectiveness of the proposed model in terms of efficiency, layout quality, and visualization. [ABSTRACT FROM AUTHOR]
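For orientation, the pairwise update at the heart of SGD-based layout methods can be sketched as below; this is the generic stress update (with an assumed w_ij = d_ij^-2 weighting), not the authors' multi-level pipeline or their positive/negative sampling scheme:

```python
import numpy as np

def sgd_layout_sweep(X, pairs, d, eta):
    # One SGD sweep: for each node pair (i, j), move both endpoints so their
    # Euclidean distance approaches the ideal graph distance d[(i, j)].
    for k in np.random.permutation(len(pairs)):
        i, j = pairs[k]
        diff = X[i] - X[j]
        dist = np.linalg.norm(diff) + 1e-9
        mu = min(1.0, eta / d[(i, j)] ** 2)           # capped stress weight
        r = (dist - d[(i, j)]) / (2.0 * dist) * diff  # shared correction vector
        X[i] -= mu * r
        X[j] += mu * r
    return X
```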
- Published
- 2024
5. Optimization by linear kinetic equations and mean-field Langevin dynamics.
- Author
-
Pareschi, Lorenzo
- Subjects
SIMULATED annealing, STATISTICAL physics, BOLTZMANN'S equation, GLOBAL optimization, MARKOV processes
- Abstract
One of the most striking examples of the close connections between global optimization processes and statistical physics is the simulated annealing method, inspired by the famous Monte Carlo algorithm devised by Metropolis et al. in the middle of last century. In this paper, we show how the tools of linear kinetic theory allow the description of this gradient-free algorithm from the perspective of statistical physics and how convergence to the global minimum can be related to classical entropy inequalities. This analysis highlights the strong link between linear Boltzmann equations and stochastic optimization methods governed by Markov processes. Thanks to this formalism, we can establish the connections between the simulated annealing process and the corresponding mean-field Langevin dynamics characterized by a stochastic gradient descent approach. Generalizations to other selection strategies in simulated annealing that avoid the acceptance–rejection dynamic are also provided. [ABSTRACT FROM AUTHOR]
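The link between annealing and gradient dynamics described above can be illustrated with a small sketch of an overdamped Langevin iteration under a decreasing temperature; the schedule and step size here are assumptions for illustration, not the paper's kinetic-theory analysis:

```python
import numpy as np

def annealed_langevin(grad_f, x0, steps=5000, eta=1e-3):
    # Gradient step plus Gaussian noise whose variance (temperature) decays:
    # a stochastic-gradient analogue of simulated annealing.
    rng = np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    for k in range(1, steps + 1):
        temp = 1.0 / np.log(k + 1)                  # slow annealing schedule
        x = x - eta * grad_f(x) + np.sqrt(2 * eta * temp) * rng.normal(size=x.shape)
    return x
```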
- Published
- 2024
6. Low-dimensional intrinsic dimension reveals a phase transition in gradient-based learning of deep neural networks.
- Author
-
Tan, Chengli, Zhang, Jiangshe, Liu, Junmin, and Zhao, Zixiang
- Abstract
Deep neural networks complete a feature extraction task by propagating the inputs through multiple modules. However, how the representations evolve with the gradient-based optimization remains unknown. Here we leverage the intrinsic dimension of the representations to study the learning dynamics and find that the training process undergoes a phase transition from expansion to compression under disparate training regimes. Surprisingly, this phenomenon is ubiquitous across a wide variety of model architectures, optimizers, and data sets. We demonstrate that the variation in the intrinsic dimension is consistent with the complexity of the learned hypothesis, which can be quantitatively assessed by the critical sample ratio that is rooted in adversarial robustness. Meanwhile, we mathematically show that this phenomenon can be analyzed in terms of the mutable correlation between neurons. Although the evoked activities obey a power-law decaying rule in biological circuits, we identify that the power-law exponent of the representations in deep neural networks predicted adversarial robustness well only at the end of the training but not during the training process. These results together suggest that deep neural networks are prone to producing robust representations by adaptively eliminating or retaining redundancies. The code is publicly available at https://github.com/cltan023/learning2022. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Estimation of simultaneous equation models by backpropagation method using stochastic gradient descent.
- Author
-
Pérez-Sánchez, Belén, Perea, Carmen, Duran Ballester, Guillem, and López-Espín, Jose J.
- Subjects
ARTIFICIAL neural networks, NONLINEAR regression, STOCHASTIC models, REGRESSION analysis, LEAST squares, SIMULTANEOUS equations
- Abstract
Simultaneous equation model (SEM) is an econometric technique traditionally used in economics but with many applications in other sciences. This model allows a bidirectional relationship between variables and a simultaneous relationship between the equations in the set. There are many estimators used for solving an SEM. Two-stage least squares (2SLS), three-stage least squares (3SLS), indirect least squares (ILS), etc. are some of the most used of them. These estimators let us obtain values of the coefficients of an SEM showing the relationship between the variables. There are different works that study and compare the estimators of an SEM, comparing the error in the prediction of the data, the computational cost, etc. Some of these works study the estimators from different paradigms such as classical statistics, Bayesian statistics, non-linear regression models, etc. This work proposes to treat an SEM as a particular case of an artificial neural network (ANN), considering the neurons of the ANN as the variables of the SEM and the weights of the connections between neurons as the coefficients of the SEM. Thus, the backpropagation method using stochastic gradient descent (SGD) is proposed and studied as a new method to obtain the coefficients of an SEM. [ABSTRACT FROM AUTHOR]
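The proposal can be pictured with a minimal sketch: one SEM equation treated as a linear neuron whose coefficients are learned by backpropagation with SGD. The data, coefficient values, and single-equation setup are hypothetical simplifications:

```python
import numpy as np

# One SEM equation, y1 = a*y2 + b*x1 + c*x2 + noise, viewed as a linear neuron.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 3))              # regressors [y2, x1, x2]
true_w = np.array([0.5, -1.2, 0.8])        # SEM coefficients (a, b, c)
y = Z @ true_w + 0.1 * rng.normal(size=500)

w, lr = np.zeros(3), 0.01
for epoch in range(50):
    for i in rng.permutation(len(y)):      # stochastic: one observation per step
        err = Z[i] @ w - y[i]
        w -= lr * err * Z[i]               # gradient of 0.5 * err**2
print(w)                                   # approaches (a, b, c)
```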
- Published
- 2024
8. Robust Adaptive Sliding Mode Control Using Stochastic Gradient Descent for Robot Arm Manipulator Trajectory Tracking.
- Author
-
Silaa, Mohammed Yousri, Barambones, Oscar, and Bencherif, Aissa
- Subjects
GREY Wolf Optimizer algorithm, SLIDING mode control, STANDARD deviations, ROBUST control, ROBOT control systems, MANIPULATORS (Machinery)
- Abstract
This paper presents an innovative control strategy for robot arm manipulators, utilizing an adaptive sliding mode control with stochastic gradient descent (ASMCSGD). The ASMCSGD controller delivers significant improvements in robustness, chattering elimination, and fast, precise trajectory tracking. Its performance is systematically compared with super twisting algorithm (STA) and conventional sliding mode control (SMC) controllers, all optimized using the grey wolf optimizer (GWO). Simulation results show that the ASMCSGD controller achieves root mean squared errors (RMSE) of 0.12758 for θ1 and 0.13387 for θ2. In comparison, the STA controller yields RMSE values of 0.1953 for θ1 and 0.1953 for θ2, while the SMC controller results in RMSE values of 0.24505 for θ1 and 0.29112 for θ2. Additionally, the ASMCSGD simplifies implementation, eliminates unwanted oscillations, and achieves superior tracking performance. These findings underscore the ASMCSGD's effectiveness in enhancing trajectory tracking and reducing chattering, making it a promising approach for robust control in practical applications of robot arm manipulators. [ABSTRACT FROM AUTHOR]
- Published
- 2024
9. Performance Evaluation of Gradient Descent Optimizers in Estuarine Turbidity Estimation with Multilayer Perceptron and Sentinel-2 Imagery.
- Author
-
Ndou, Naledzani and Nontongana, Nolonwabo
- Subjects
TURBIDITY, EMPIRICAL research, RADIANCE, INTERPOLATION, ESTUARIES
- Abstract
Accurate monitoring of estuarine turbidity patterns is important for maintaining aquatic ecological balance and devising informed estuarine management strategies. This study aimed to enhance the prediction of estuarine turbidity patterns by enhancing the performance of the multilayer perceptron (MLP) network through the introduction of stochastic gradient descent (SGD) and momentum gradient descent (MGD). To achieve this, Sentinel-2 multispectral imagery was used as the base on which spectral radiance properties of estuarine waters were analyzed against field-measured turbidity data. In this case, blue, green, red, red edge, near-infrared and shortwave spectral bands were selected for empirical relationship establishment and model development. Inverse distance weighting (IDW) spatial interpolation was employed to produce raster-based turbidity data of the study area based on field-measured data. The IDW image was subsequently binarized using the bi-level thresholding technique to produce a Boolean image. Prior to empirical model development, the selected spectral bands were calibrated to turbidity using a multilayer perceptron neural network trained with the sigmoid activation function, first with the stochastic gradient descent (SGD) optimizer and then with the momentum gradient descent optimizer. The Boolean image produced from IDW interpolation was used as the base on which the sigmoid activation function calibrated image pixels to turbidity. Empirical models were developed using selected uncalibrated and calibrated spectral bands. The results from all the selected models generally revealed a stronger relationship of the red spectral channel with measured turbidity than with other selected spectral bands. Among these models, the MLP trained with MGD produced a coefficient of determination (r²) value of 0.92 on the red spectral band, followed by the MLP with MGD on the green spectral band and SGD on the red spectral band, with r² values of 0.75 and 0.72, respectively. The relative error of mean (REM) and r² results revealed more accurate turbidity prediction by the sigmoid with MGD compared to other models. Overall, this study demonstrated the prospect of deploying ensemble techniques on Sentinel-2 multispectral bands in spatially constructing missing estuarine turbidity data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
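For reference, the two update rules compared in the preceding study, plain SGD and momentum gradient descent, in generic textbook form (a sketch, not the authors' training code; hyperparameter values are placeholders):

```python
def sgd_update(w, grad, lr=0.01):
    # Plain stochastic gradient descent step.
    return w - lr * grad

def momentum_update(w, grad, velocity, lr=0.01, beta=0.9):
    # Momentum gradient descent (MGD): accumulate a velocity over past
    # gradients, then step along it.
    velocity = beta * velocity + grad
    return w - lr * velocity, velocity
```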
10. An evaluation of multiple classifiers for traffic congestion prediction in Jordan.
- Author
-
Hassan, Mohammad and Arabiat, Areen
- Subjects
TRAFFIC congestion, TRAFFIC estimation, TRAFFIC flow, RANDOM forest algorithms, DECISION trees
- Abstract
This study contributes to the growing body of literature on traffic congestion prediction using machine learning (ML) techniques. By evaluating multiple classifiers and selecting the most appropriate one for predicting traffic congestion, this research provides valuable insights for urban planners and policymakers seeking to optimize traffic flow and reduce jamming. Traffic jamming is a global issue that wastes time, pollutes the environment, and increases fuel usage. The purpose of this project is to forecast traffic congestion at one of the most congested areas in Amman city using multiple ML classifiers. The Naïve Bayes (NB), stochastic gradient descent (SGD), fuzzy unordered rule induction algorithm (FURIA), logistic regression (LR), decision tree (DT), random forest (RF), and multi-layer perceptron (MLP) classifiers have been chosen to predict traffic congestion on each street linked with our study area. These are assessed by accuracy, F-measure, sensitivity, and precision evaluation metrics. The results obtained from all experiments show that FURIA is the classifier that presents the highest predictions of traffic congestion, achieving 100% accuracy, precision, sensitivity, and F-measure. In the future, further studies could use more datasets and variables, such as weather conditions and driver behavior, that could be integrated to predict traffic congestion accurately. [ABSTRACT FROM AUTHOR]
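A comparison of this kind is straightforward to reproduce with scikit-learn; the sketch below uses synthetic data as a stand-in for the Amman traffic dataset and omits FURIA, which is a Weka algorithm rather than a scikit-learn one:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

models = {"NB": GaussianNB(), "SGD": SGDClassifier(),
          "LR": LogisticRegression(max_iter=1000), "DT": DecisionTreeClassifier(),
          "RF": RandomForestClassifier(), "MLP": MLPClassifier(max_iter=500)}
for name, clf in models.items():
    pred = clf.fit(Xtr, ytr).predict(Xte)
    print(name, accuracy_score(yte, pred), precision_score(yte, pred),
          recall_score(yte, pred), f1_score(yte, pred))
```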
- Published
- 2024
11. Bolstering stochastic gradient descent with model building.
- Author
-
Birbil, Ş. İlker, Martin, Özgür, Onay, Gönenç, and Öztoprak, Figen
- Abstract
Stochastic gradient descent and its variants constitute the core optimization algorithms that achieve good convergence rates for solving machine learning problems. These rates are obtained especially when these algorithms are fine-tuned for the application at hand. Although this tuning process can require large computational costs, recent work has shown that these costs can be reduced by line search methods that iteratively adjust the step length. We propose an alternative approach to stochastic line search using a new algorithm based on forward step model building (SMB). This model building step incorporates second-order information that allows adjusting not only the step length but also the search direction. Noting that deep learning model parameters come in groups (layers of tensors), our method builds its model and calculates a new step for each parameter group. This novel diagonalization approach makes the selected step lengths adaptive. We provide convergence rate analysis, and experimentally show that the proposed algorithm achieves faster convergence and better generalization in well-known test problems. More precisely, SMB requires less tuning and shows comparable performance to other adaptive methods. [ABSTRACT FROM AUTHOR]
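As background, the simplest stochastic line search that SMB generalizes looks like the sketch below: backtracking on a minibatch loss until a sufficient-decrease condition holds. SMB goes further by building a model that also adapts the search direction per parameter group; this sketch is only the baseline idea:

```python
import numpy as np

def stochastic_armijo_step(w, loss_fn, grad, eta0=1.0, c=1e-4, shrink=0.5):
    # Backtracking line search on a minibatch loss: shrink the step length
    # until the Armijo sufficient-decrease condition is met.
    loss0, eta = loss_fn(w), eta0
    g2 = float(np.dot(grad, grad))
    while loss_fn(w - eta * grad) > loss0 - c * eta * g2 and eta > 1e-8:
        eta *= shrink
    return w - eta * grad
```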
- Published
- 2024
12. Stochastic optimal tuning for flight control system of morphing arm octorotor
- Author
-
Kose, Oguz
- Published
- 2024
13. New logarithmic step size for stochastic gradient descent.
- Author
-
Shamaee, Mahsa Soheil, Hafshejani, Sajad Fathi, and Saeidian, Zeinab
- Abstract
In this paper, we propose a novel warm restart technique using a new logarithmic step size for the stochastic gradient descent (SGD) approach. For smooth and non-convex functions, we establish an O(1/√T) convergence rate for SGD. We conduct a comprehensive implementation to demonstrate the efficiency of the newly proposed step size on the FashionMNIST, CIFAR10, and CIFAR100 datasets. Moreover, we compare our results with nine other existing approaches and demonstrate that the new logarithmic step size improves test accuracy by 0.9% for the CIFAR100 dataset when we utilize a convolutional neural network (CNN) model. [ABSTRACT FROM AUTHOR]
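For intuition, a warm-restart schedule with a logarithmically decaying step size might look like the sketch below; the exact formula is an assumption for illustration and differs from the schedule analyzed in the paper:

```python
import math

def warm_restart_log_lr(t, period=1000, eta0=0.1):
    # Within each restart period the step size decays logarithmically toward
    # zero (smaller values in the final iterations), then resets to eta0.
    s = t % period
    return eta0 * (1.0 - math.log(s + 1) / math.log(period + 1))
```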
- Published
- 2025
14. Byzantine fault tolerance in distributed machine learning: a survey.
- Author
-
Bouhata, Djamila, Moumen, Hamouma, Mazari, Jocelyn Ahmed, and Bounceur, Ahcène
- Subjects
FAULT tolerance (Engineering), MACHINE learning, CLASSIFICATION, TOPOLOGY, CORPORA
- Abstract
Byzantine Fault Tolerance (BFT) is crucial for ensuring the resilience of Distributed Machine Learning (DML) systems during training under adversarial conditions. Among the rising corpus of research on BFT in DML, there is no comprehensive classification of techniques or broad analysis of different approaches. This paper provides an in-depth survey of recent advancements in BFT for DML, with a focus on first-order optimisation methods, particularly the popular Stochastic Gradient Descent (SGD), during the training phase. We offer a novel classification of BFT approaches based on characteristics such as the communication process, optimisation method, and topology setting. This classification aims to enhance the understanding of various BFT methods and guide future research in addressing open challenges in the field. This work provides the foundations for developing robust BFT systems, using a variety of optimisation methods to strengthen resilience. [ABSTRACT FROM AUTHOR]
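One classical defense from this literature is robust gradient aggregation; the sketch below shows the coordinate-wise median rule (one of many approaches such surveys classify, not a method proposed by the survey itself):

```python
import numpy as np

def coordinate_wise_median(worker_grads):
    # Per-coordinate median of worker gradients: a single Byzantine worker
    # cannot drag the aggregate arbitrarily far.
    return np.median(np.stack(worker_grads), axis=0)

honest = [np.array([1.0, -2.0])] * 5
byzantine = [np.array([1e6, 1e6])]                 # adversarial gradient
print(coordinate_wise_median(honest + byzantine))  # stays near [1, -2]
```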
- Published
- 2024
15. Data Mining Technique for Diagnosing Autism Spectrum Disorder.
- Author
-
Salman, Rasha Hani, Mortatha, Manar Bashar, and Nuiaa, Riydh Rahef
- Subjects
AUTISM spectrum disorders, DATA mining, EARLY diagnosis, MEDICAL care costs, LOGISTIC regression analysis
- Abstract
Early detection of autistic symptoms can help lower overall medical expenses, which is beneficial given that autism is a developmental disease that is associated with high medical costs. To assess whether or not a child may have autism spectrum disorder (ASD), screening for ASD involves asking the child's parents, caregivers, and other members of the child's immediate family a series of questions. The current methods for screening for autism, such as the autistic quotient (AQ), might require a significant number of questions in addition to careful question design, which can make an autism examination more time-consuming. The effectiveness and reliability of the test could be improved, for example, by employing data mining strategies. It could be possible to create a system that can foretell ASD at an early stage and give patients, caregivers, and medical professionals dependable and precise findings on the probable need for expert diagnostic services. This research aims to develop a reliable model for estimating the likelihood of an individual being diagnosed with autism spectrum disorder between the ages of 4 and 17. To identify varying degrees of autism, one such model was constructed by utilizing the stochastic gradient descent (SGD) algorithm. Data mining is typically understood to be a decision-making process that enables more effective utilization of available resources in terms of overall performance. The results showed that the suggested prediction model, which used the stochastic gradient descent (SGD) algorithm, could detect ASD with an average error of 0.03% and an accuracy of up to 94.5%. [ABSTRACT FROM AUTHOR]
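An SGD-based screening classifier of the kind described can be sketched with scikit-learn's SGDClassifier; the synthetic features below are a hypothetical stand-in for the questionnaire data used in the study:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=600, n_features=20, random_state=1)

model = make_pipeline(StandardScaler(),
                      SGDClassifier(loss="log_loss", max_iter=1000))
print(cross_val_score(model, X, y, cv=5, scoring="accuracy").mean())
```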
- Published
- 2024
16. BEHAVIOR ANALYSIS OF A PRESENTED SYSTEM WITH FAILURE AND MAINTENANCE RATES USING DEEP LEARNING ALGORITHMS.
- Author
-
Singla, Shakuntla, Rani, Shilpa, Mangla, Diksha, and Modibbo, Umar Muhammad
- Subjects
DEEP learning, DISTRIBUTION (Probability theory)
- Abstract
The paper discusses the behavioral analysis and dependability of a three-unit system utilizing RPGT for system parameters. Since all three units P, Q and R include parallel subcomponents, in the event that one of them fails, the system continues to operate, although at a reduced capacity; however, it is not profitable to run the system when two units are in a reduced state, which is hence considered a failed state. The rates of failure are exponentially distributed, while the rates of repair are generalized, independent, and differ based on the operational unit. Fuzzy concepts are used to determine whether the system is in an operative, reduced, or failed state. Graphs and tables are drawn to compare the effects of failure and repair rates on the parameter values. The system parameters are modelled using the Regenerative Point Graphical Technique (RPGT) and optimized using deep learning methods such as Adam, SGD, and RMSProp. The results of the optimization may be used to validate and challenge existing models and assumptions about the systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. Multi-Innovation Nesterov accelerated gradient parameter identification method for autoregressive exogenous models.
- Author
-
Liang, Shuning, Xiao, Bo, Wang, Chunyang, Wang, Zishuo, and Wang, Lin
- Subjects
AUTOREGRESSIVE models, COMPUTER simulation, ALGORITHMS, SPEED
- Abstract
This paper proposes a multi-innovation Nesterov accelerated gradient (MNAG) parameter identification method for the autoregressive exogenous (ARX) model. First, a momentum acceleration term is added to the stochastic gradient descent (SGD) algorithm to increase its convergence rate. Second, the parameter updating process is expanded from a single batch of current information to multiple batches of both previous and current information, extending the algorithm from the single-innovation Nesterov accelerated gradient (NAG) method to a multi-innovation NAG parameter identification method. This enhances the algorithm's anti-noise and anti-abnormal-data abilities and its data utilization rate. Then, the convergence of the MNAG parameter identification method is proven. The effectiveness of the MNAG parameter identification method is verified by numerical simulation and on the rotational speed system of a ring-pendulum double-sided polisher. [ABSTRACT FROM AUTHOR]
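The single-innovation NAG step that the MNAG method extends can be sketched as follows for an ARX-style regression y_k = phi_k^T theta + noise; the learning-rate and momentum values are placeholders, and the multi-innovation extension (reusing several past innovations per update) is omitted:

```python
import numpy as np

def nag_arx_step(theta, v, phi, y, lr=0.01, beta=0.9):
    # Nesterov accelerated gradient: evaluate the innovation at a look-ahead
    # point, then update the velocity and the parameter estimate.
    lookahead = theta - beta * v
    err = y - phi @ lookahead          # innovation at the look-ahead point
    grad = -err * phi                  # gradient of 0.5 * err**2
    v = beta * v + lr * grad
    return theta - v, v
```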
- Published
- 2024
18. Stochastic gradient descent without full data shuffle: with applications to in-database machine learning and deep learning systems.
- Author
-
Xu, Lijie, Qiu, Shuang, Yuan, Binhang, Jiang, Jiawei, Renggli, Cedric, Gan, Shaoduo, Kara, Kaan, Li, Guoliang, Liu, Ji, Wu, Wentao, Ye, Jieping, and Zhang, Ce
- Abstract
Modern machine learning (ML) systems commonly use stochastic gradient descent (SGD) to train ML models. However, SGD relies on random data order to converge, which usually requires a full data shuffle. For in-DB ML systems and deep learning systems with large datasets stored on block-addressable secondary storage such as HDD and SSD, this full data shuffle leads to low I/O performance—the data shuffling time can be even longer than the training itself, due to massive random data accesses. To balance the convergence rate of SGD (which favors data randomness) and its I/O performance (which favors sequential access), previous work has proposed several data shuffling strategies. In this paper, we first perform an empirical study on existing data shuffling strategies, showing that these strategies suffer from either low performance or low convergence rate. To solve this problem, we propose a simple but novel two-level data shuffling strategy named CorgiPile, which can avoid a full data shuffle while maintaining comparable convergence rate of SGD as if a full shuffle were performed. We further theoretically analyze the convergence behavior of CorgiPile and empirically evaluate its efficacy in both in-DB ML and deep learning systems. For in-DB ML systems, we integrate CorgiPile into PostgreSQL by introducing three new physical operators with optimizations. For deep learning systems, we extend single-process CorgiPile to multi-process CorgiPile for the parallel/distributed environment and integrate it into PyTorch. Our evaluation shows that CorgiPile can achieve comparable convergence rate with the full-shuffle-based SGD for both linear models and deep learning models. For in-DB ML with linear models, CorgiPile is 1.6×–12.8× faster than two state-of-the-art systems, Apache MADlib and Bismarck, on both HDD and SSD. For deep learning models on ImageNet, CorgiPile is 1.5× faster than PyTorch with full data shuffle. [ABSTRACT FROM AUTHOR]
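The two-level idea can be sketched in a few lines; this is an illustrative reading of the strategy (block order shuffled for sequential I/O, tuples shuffled inside a small buffer), not the system's actual PostgreSQL/PyTorch operators:

```python
import random

def two_level_shuffle(blocks, buffer_blocks=4):
    # Level 1: shuffle the order of blocks (reads stay sequential per block).
    order = list(range(len(blocks)))
    random.shuffle(order)
    # Level 2: shuffle tuples inside a small in-memory buffer of blocks.
    for start in range(0, len(order), buffer_blocks):
        buf = [t for b in order[start:start + buffer_blocks] for t in blocks[b]]
        random.shuffle(buf)
        yield from buf
```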
- Published
- 2024
19. Modified Step Size for Enhanced Stochastic Gradient Descent: Convergence and Experiments.
- Author
-
Shamaee, Mahsa Soheil and Hafshejani, Sajad Fathi
- Subjects
MATHEMATICS education, NUMERICAL analysis, ALGORITHMS, IMAGE analysis, ACCURACY
- Abstract
This paper introduces a novel approach to enhance the performance of the stochastic gradient descent (SGD) algorithm by incorporating a modified decay step size based on .... The proposed step size integrates a logarithmic term, leading to the selection of smaller values in the final iterations. Our analysis establishes a convergence rate of O(...) for smooth non-convex functions without the Polyak-Łojasiewicz condition. To evaluate the effectiveness of our approach, we conducted numerical experiments on image classification tasks using the Fashion-MNIST and CIFAR10 datasets, and the results demonstrate significant improvements in accuracy, with enhancements of 0.5% and 1.4% observed, respectively, compared to the traditional 1/√t step size. [ABSTRACT FROM AUTHOR]
- Published
- 2024
20. Stochastic Steffensen method.
- Author
-
Zhao, Minda, Lai, Zehua, and Lim, Lek-Heng
- Subjects
QUASI-Newton methods, NEWTON-Raphson method, GENERALIZATION, ALGORITHMS, SPEED
- Abstract
Is it possible for a first-order method, i.e., one in which only first derivatives are allowed, to be quadratically convergent? For univariate loss functions, the answer is yes—the Steffensen method avoids second derivatives and is still quadratically convergent like Newton's method. By incorporating a specific step size we can even push its convergence order beyond quadratic to 1 + √2 ≈ 2.414. While such high convergence orders are a pointless overkill for a deterministic algorithm, they become rewarding when the algorithm is randomized for problems of massive sizes, as randomization invariably compromises convergence speed. We introduce two adaptive learning rates inspired by the Steffensen method, intended for use in a stochastic optimization setting and requiring no hyperparameter tuning aside from batch size. Extensive experiments show that they compare favorably with several existing first-order methods. When restricted to a quadratic objective, our stochastic Steffensen methods reduce to the randomized Kaczmarz method—note that this is not true for SGD or SLBFGS—and thus we may also view our methods as a generalization of randomized Kaczmarz to arbitrary objectives. [ABSTRACT FROM AUTHOR]
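The deterministic starting point is easy to state in code; the sketch below is the classical univariate Steffensen iteration (the paper's contribution, adapting it into stochastic learning rates, is not reproduced here):

```python
def steffensen(g, x, iters=8, tol=1e-12):
    # Root-finding with only evaluations of g: quadratically convergent like
    # Newton's method, using g(x + g(x)) in place of a derivative.
    for _ in range(iters):
        gx = g(x)
        if abs(gx) < tol:              # converged; avoid a zero denominator
            break
        x = x - gx * gx / (g(x + gx) - gx)
    return x

# For minimization, apply it to the derivative of the loss (univariate case).
print(steffensen(lambda x: x * x - 2.0, 1.0))   # ~= 1.41421 (sqrt(2))
```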
- Published
- 2024
21. Natural Gradient Variational Bayes Without Fisher Matrix Analytic Calculation and Its Inversion.
- Author
-
Godichon-Baggioni, A., Nguyen, D., and Tran, M.-N.
- Subjects
MATRIX inversion, FISHER information, INFERENTIAL statistics, BAYESIAN analysis, MATRICES (Mathematics)
- Abstract
This article introduces a method for efficiently approximating the inverse of the Fisher information matrix, a crucial step in achieving effective variational Bayes inference. A notable aspect of our approach is the avoidance of analytically computing the Fisher information matrix and its explicit inversion. Instead, we introduce an iterative procedure for generating a sequence of matrices that converge to the inverse of Fisher information. The natural gradient variational Bayes algorithm without analytic expression of the Fisher matrix and its inversion is provably convergent and achieves a convergence rate of order O(log s / s), with s the number of iterations. We also obtain a central limit theorem for the iterates. Implementation of our method does not require storage of large matrices, and achieves a linear complexity in the number of variational parameters. Our algorithm exhibits versatility, making it applicable across a diverse array of variational Bayes domains, including Gaussian approximation and normalizing flow Variational Bayes. We offer a range of numerical examples to demonstrate the efficiency and reliability of the proposed variational Bayes method. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work. [ABSTRACT FROM AUTHOR]
- Published
- 2024
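To picture inversion-free approximation of a Fisher matrix as in the preceding record, here is the classical Newton-Schulz iteration; note this deterministic scheme is only in the same spirit as the paper's stochastic recursion, not the proposed algorithm itself:

```python
import numpy as np

def newton_schulz_inverse(F, iters=30):
    # Iteratively refine X toward F^{-1} without any explicit inversion.
    X = F.T / (np.linalg.norm(F, 1) * np.linalg.norm(F, np.inf))  # safe init
    I = np.eye(F.shape[0])
    for _ in range(iters):
        X = X @ (2 * I - F @ X)
    return X
```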
22. The stochastic multi-gradient algorithm for multi-objective optimization and its application to supervised machine learning.
- Author
-
Liu, S. and Vicente, L. N.
- Subjects
SUPERVISED learning, APPROXIMATION algorithms, DECISION making, ALGORITHMS, A priori
- Abstract
Optimization of conflicting functions is of paramount importance in decision making, and real world applications frequently involve data that is uncertain or unknown, resulting in multi-objective optimization (MOO) problems of stochastic type. We study the stochastic multi-gradient (SMG) method, seen as an extension of the classical stochastic gradient method for single-objective optimization. At each iteration of the SMG method, a stochastic multi-gradient direction is calculated by solving a quadratic subproblem, and it is shown that this direction is biased even when all individual gradient estimators are unbiased. We establish rates to compute a point in the Pareto front, of order similar to what is known for stochastic gradient in both convex and strongly convex cases. The analysis handles the bias in the multi-gradient and the unknown a priori weights of the limiting Pareto point. The SMG method is framed into a Pareto-front type algorithm for calculating an approximation of the entire Pareto front. The Pareto-front SMG algorithm is capable of robustly determining Pareto fronts for a number of synthetic test problems. One can apply it to any stochastic MOO problem arising from supervised machine learning, and we report results for logistic binary classification where multiple objectives correspond to distinct-sources data groups. [ABSTRACT FROM AUTHOR]
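In the two-objective case the quadratic subproblem mentioned above has a closed-form solution, sketched below; handling of more objectives and of the stochastic bias analyzed in the paper is omitted:

```python
import numpy as np

def two_objective_direction(g1, g2):
    # Minimum-norm point in the convex hull of {g1, g2}: the common descent
    # direction used by multi-gradient methods for two objectives.
    diff = g1 - g2
    denom = float(diff @ diff)
    lam = 0.5 if denom == 0.0 else float(np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0))
    return lam * g1 + (1.0 - lam) * g2

print(two_objective_direction(np.array([1.0, 0.0]), np.array([0.0, 1.0])))  # [0.5 0.5]
```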
- Published
- 2024
23. Optimized convergence of stochastic gradient descent by weighted averaging.
- Author
-
Hagedorn, Melinda and Jarre, Florian
- Subjects
ARITHMETIC mean, STOCHASTIC convergence, ARITHMETIC, NOISE, GENERALIZATION
- Abstract
Under mild assumptions stochastic gradient methods asymptotically achieve an optimal rate of convergence if the arithmetic mean of all iterates is returned as an approximate optimal solution. However, in the absence of stochastic noise, the arithmetic mean of all iterates converges considerably slower to the optimal solution than the iterates themselves. And also in the presence of noise, when a termination of the stochastic gradient method after a finite number of steps is considered, the arithmetic mean is not necessarily the best possible approximation to the unknown optimal solution. This paper aims at identifying optimal strategies in a particularly simple case, the minimization of a strongly convex function with i.i.d. noise terms and termination after a finite number of steps. Explicit formulas for the stochastic error and the optimality error are derived in dependence of certain parameters of the SGD method. The aim was to choose parameters such that both stochastic error and optimality error are reduced compared to arithmetic averaging. This aim could not be achieved; however, by allowing a slight increase of the stochastic error it was possible to select the parameters such that a significant reduction of the optimality error could be achieved. This reduction of the optimality error has a strong effect on the approximate solution generated by the stochastic gradient method in case that only a moderate number of iterations is used or when the initial error is large. The numerical examples confirm the theoretical results and suggest that a generalization to non-quadratic objective functions may be possible. [ABSTRACT FROM AUTHOR]
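Averaging schemes of this kind are simple to express; the polynomial weighting below is an illustrative example of giving later iterates more weight than arithmetic averaging does, not the parameter choice derived in the paper:

```python
import numpy as np

def weighted_average(iterates, kappa=2.0):
    # Weights proportional to t**kappa: recent iterates count more than early
    # ones (kappa = 0 recovers the plain arithmetic mean).
    T = len(iterates)
    w = np.arange(1, T + 1, dtype=float) ** kappa
    w /= w.sum()
    return sum(wi * xi for wi, xi in zip(w, iterates))
```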
- Published
- 2024
24. On the Inversion‐Free Newton's Method and Its Applications.
- Author
-
Chau, Huy N., Kirkby, J. Lars, Nguyen, Dang H., Nguyen, Duy, Nguyen, Nhu N., and Nguyen, Thai
- Abstract
Summary: In this paper, we survey the recent development of inversion‐free Newton's method, which directly avoids computing the inversion of Hessian, and demonstrate its applications in estimating parameters of models such as linear and logistic regression. A detailed review of existing methodology is provided, along with comparisons of various competing algorithms. We provide numerical examples that highlight some deficiencies of existing approaches, and demonstrate how the inversion‐free methods can improve performance. Motivated by recent works in literature, we provide a unified subsampling framework that can be combined with the inversion‐free Newton's method to estimate model parameters including those of linear and logistic regression. Numerical examples are provided for illustration. [ABSTRACT FROM AUTHOR]
- Published
- 2024
25. Comparative Analysis of Manifold Learning-Based Dimension Reduction Methods: A Mathematical Perspective.
- Author
-
Yi, Wenting, Bu, Siqi, Lee, Hiu-Hung, and Chan, Chun-Hung
- Subjects
DATA structures, FUZZY topology, COMPARATIVE studies, BIOINFORMATICS, ACQUISITION of data
- Abstract
Manifold learning-based approaches have emerged as prominent techniques for dimensionality reduction. Among these methods, t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP) stand out as two of the most widely used and effective approaches. While both methods share similar underlying procedures, empirical observations indicate two distinctive properties: global data structure preservation and computational efficiency. However, the underlying mathematical principles behind these distinctions remain elusive. To address this gap, this study presents a comparative analysis of the subprocesses involved in these methods, aiming to elucidate the mathematical mechanisms underlying the observed distinctions. By meticulously examining the equation formulations, the mathematical mechanisms contributing to global data structure preservation and computational efficiency are elucidated. To validate the theoretical analysis, data are collected through a laboratory experiment, and an open-source dataset is utilized for validation across different datasets. The consistent alignment of results obtained from both balanced and unbalanced datasets robustly confirms the study's findings. The insights gained from this study provide a deeper understanding of the mathematical underpinnings of t-SNE and UMAP, enabling more informed and effective use of these dimensionality reduction techniques in various applications, such as anomaly detection, natural language processing, and bioinformatics. [ABSTRACT FROM AUTHOR]
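Both methods are available off the shelf, which makes the comparison easy to reproduce in outline; the sketch below uses random placeholder data and the third-party umap-learn package:

```python
import numpy as np
from sklearn.manifold import TSNE
import umap  # from the umap-learn package

X = np.random.rand(500, 50)        # placeholder high-dimensional data

emb_tsne = TSNE(n_components=2, perplexity=30).fit_transform(X)
emb_umap = umap.UMAP(n_components=2, n_neighbors=15).fit_transform(X)
# The comparative analysis above concerns why UMAP tends to be faster and to
# preserve more global structure than t-SNE.
```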
- Published
- 2024
26. Reranking Hypotheses in Translation Models Using Human Markup.
- Author
-
Vorontsov, K. V. and Skachkov, N. A.
- Abstract
Modern machine translation systems are trained on large volumes of parallel data obtained using heuristic methods of crawling the Internet. The poor quality of the data leads to systematic translation errors, which can be quite noticeable to humans. To fix such errors, human-based models for reranking hypotheses are introduced in this study. In this paper, the use of human markup is shown not only to increase the overall quality of the translation but also to significantly reduce the number of systematic translation errors. In addition, the relative simplicity of human markup and its integration into the model training process opens up new opportunities in the field of domain adaptation of translation models for new domains like online retail. [ABSTRACT FROM AUTHOR]
- Published
- 2024
27. Stochastic Subgradient Descent Escapes Active Strict Saddles on Weakly Convex Functions.
- Author
-
Bianchi, Pascal, Hachem, Walid, and Schechtman, Sholom
- Subjects
CONVEX functions, NONSMOOTH optimization, SADDLERY, CURVATURE
- Abstract
In nonsmooth stochastic optimization, we establish the nonconvergence of the stochastic subgradient descent (SGD) to the critical points recently called active strict saddles by Davis and Drusvyatskiy. Such points lie on a manifold M, where the function f has a direction of second-order negative curvature. Off this manifold, the norm of the Clarke subdifferential of f is lower-bounded. We require two conditions on f. The first assumption is a Verdier stratification condition, which is a refinement of the popular Whitney stratification. It allows us to establish a strengthened version of the projection formula of Bolte et al. for Whitney stratifiable functions and which is of independent interest. The second assumption, termed the angle condition, allows us to control the distance of the iterates to M. When f is weakly convex, our assumptions are generic. Consequently, generically, in the class of definable weakly convex functions, SGD converges to a local minimizer. Funding: The work of Sholom Schechtman was supported by "Région Ile-de-France". [ABSTRACT FROM AUTHOR]
- Published
- 2024
28. A stochastic gradient relational event additive model for modelling US patent citations from 1976 to 2022.
- Author
-
Filippi-Mazzola, Edoardo and Wit, Ernst C
- Subjects
CITATION networks ,STOCHASTIC models ,PATENTS ,ADDITIVES - Abstract
Until 2022, the US patent citation network contained almost 10 million patents and over 100 million citations, presenting a challenge in analysing such expansive, intricate networks. To overcome limitations in analysing this complex citation network, we propose a stochastic gradient relational event additive model (STREAM) that models the citation relationships between patents as time events. While the structure of this model relies on the relational event model, STREAM offers a more comprehensive interpretation by modelling the effect of each predictor non-linearly. Overall, our model identifies key factors driving patent citations and reveals insights in the citation process. [ABSTRACT FROM AUTHOR]
- Published
- 2024
29. Fast maximum likelihood estimation for general hierarchical models.
- Author
-
Hong, Johnny, Stoudt, Sara, and de Valpine, Perry
- Subjects
MAXIMUM likelihood statistics, MONTE Carlo method, LATENT variables, APPLIED sciences, STATISTICAL models, MARKOV chain Monte Carlo
- Abstract
Hierarchical statistical models are important in applied sciences because they capture complex relationships in data, especially when variables are related by space, time, sampling unit, or other shared features. Existing methods for maximum likelihood estimation that rely on Monte Carlo integration over latent variables, such as Monte Carlo Expectation Maximization (MCEM), suffer from drawbacks in efficiency and/or generality. We harness a connection between sampling-stepping iterations for such methods and stochastic gradient descent methods for non-hierarchical models: many noisier steps can do better than few cleaner steps. We call the resulting methods Hierarchical Model Stochastic Gradient Descent (HMSGD) and show that combining efficient, adaptive step-size algorithms with HMSGD yields efficiency gains. We introduce a one-dimensional sampling-based greedy line search for step-size determination. We implement these methods and conduct numerical experiments for a Gamma-Poisson mixture model, generalized linear mixed models (GLMMs) with single and crossed random effects, and a multi-species ecological occupancy model with over 3000 latent variables. Our experiments show that the accelerated HMSGD methods provide faster convergence than commonly used methods and are robust to reasonable choices of MCMC sample size. [ABSTRACT FROM AUTHOR]
- Published
- 2024
30. A modification of adaptive moment estimation (Adam) for machine learning.
- Author
-
Yang, Jiaxin and Long, Qiang
- Subjects
ARTIFICIAL neural networks, MACHINE learning, ALGORITHMS, GENERALIZATION
- Abstract
In deep learning, the accuracy and generalization ability of a model largely depend on the optimization of the loss function. To date, dozens of optimization methods have been used in deep learning models. Among them, stochastic gradient descent (SGD) is a popular and widely used method, and most other up-to-date optimization methods are variants or improvements of the original SGD. Among all the variations and improvements, adaptive moment estimation (Adam) is one of the classics. However, Adam has also been shown to suffer from non-convergence or convergence to erroneous solutions. Combining the improvement points of existing algorithms, this paper proposes an improved algorithm based on Adam, called NewAdam. NewAdam modifies Adam in both the search direction and the learning rate. We perform a theoretical analysis of it and conduct numerical experiments on three data sets and two network architectures to illustrate the effectiveness of NewAdam. [ABSTRACT FROM AUTHOR]
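For reference, the standard Adam update that NewAdam modifies (in both search direction and learning rate) is shown below; this is the textbook baseline, not the proposed NewAdam variant:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad**2       # second-moment estimate
    m_hat = m / (1 - b1**t)               # bias corrections
    v_hat = v / (1 - b2**t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```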
- Published
- 2024
31. Risk Budgeting portfolios: Existence and computation.
- Author
-
Cetingoz, Adil Rengim, Fermanian, Jean‐David, and Guéant, Olivier
- Subjects
EXPECTED returns, PORTFOLIO management (Investments)
- Abstract
Modern portfolio theory has provided for decades the main framework for optimizing portfolios. Because of its sensitivity to small changes in input parameters, especially expected returns, the mean–variance framework proposed by Markowitz in 1952 has, however, been challenged by new construction methods that are purely based on risk. Among risk‐based methods, the most popular ones are Minimum Variance, Maximum Diversification, and Risk Budgeting (especially Equal Risk Contribution) portfolios. Despite some drawbacks, Risk Budgeting is particularly attractive because of its versatility: based on Euler's homogeneous function theorem, it can indeed be used with a wide range of risk measures. This paper presents mathematical results regarding the existence and the uniqueness of Risk Budgeting portfolios for a very wide spectrum of risk measures and shows that, for many of them, computing the weights of Risk Budgeting portfolios only requires a standard stochastic algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
32. Estimation of simultaneous equation models by backpropagation method using stochastic gradient descent
- Author
-
Belén Pérez-Sánchez, Carmen Perea, Guillem Duran Ballester, and Jose J. López-Espín
- Subjects
Backpropagation method, Stochastic gradient descent, Simultaneous equation models, Artificial neural networks, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Simultaneous equation model (SEM) is an econometric technique traditionally used in economics but with many applications in other sciences. This model allows a bidirectional relationship between variables and a simultaneous relationship between the equations in the set. There are many estimators used for solving an SEM. Two-stage least squares (2SLS), three-stage least squares (3SLS), indirect least squares (ILS), etc. are some of the most used of them. These estimators let us obtain values of the coefficients of an SEM showing the relationship between the variables. There are different works that study and compare the estimators of an SEM, comparing the error in the prediction of the data, the computational cost, etc. Some of these works study the estimators from different paradigms such as classical statistics, Bayesian statistics, non-linear regression models, etc. This work proposes to treat an SEM as a particular case of an artificial neural network (ANN), considering the neurons of the ANN as the variables of the SEM and the weights of the connections between neurons as the coefficients of the SEM. Thus, the backpropagation method using stochastic gradient descent (SGD) is proposed and studied as a new method to obtain the coefficients of an SEM.
- Published
- 2024
33. Gradient Correction for Asynchronous Stochastic Gradient Descent in Reinforcement Learning
- Author
-
Gao, Jiaxin, Lyu, Yao, Wang, Wenxuan, Yin, Yuming, Ma, Fei, Li, Shengbo Eben, Chaari, Fakher, Series Editor, Gherardini, Francesco, Series Editor, Ivanov, Vitalii, Series Editor, Haddar, Mohamed, Series Editor, Cavas-Martínez, Francisco, Editorial Board Member, di Mare, Francesca, Editorial Board Member, Kwon, Young W., Editorial Board Member, Tolio, Tullio A. M., Editorial Board Member, Trojanowska, Justyna, Editorial Board Member, Schmitt, Robert, Editorial Board Member, Xu, Jinyang, Editorial Board Member, Mastinu, Giampiero, editor, Braghin, Francesco, editor, Cheli, Federico, editor, Corno, Matteo, editor, and Savaresi, Sergio M., editor
- Published
- 2024
34. Cocoa Beans Quality Prediction Using Near-Infrared Spectroscopy and Several Machine Learning Techniques
- Author
-
Khandelwal, Rishabh, Harine, M., Das, Sanchali, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Santosh, K. C., editor, Sood, Sandeep Kumar, editor, Pandey, Hari Mohan, editor, and Virmani, Charu, editor
- Published
- 2024
35. Fully Homomorphic Encrypted Wavelet Neural Network for Privacy-Preserving Bankruptcy Prediction in Banks
- Author
-
Ahamed, Syed Imtiaz, Ravi, Vadlamani, Gopi, Pranay, Kacprzyk, Janusz, Series Editor, Jain, Lakhmi C., Series Editor, Maglaras, Leandros A., editor, Das, Sonali, editor, Tripathy, Naliniprava, editor, and Patnaik, Srikanta, editor
- Published
- 2024
36. Generalizing Self-organizing Maps: Large-Scale Training of GMMs and Applications in Data Science
- Author
-
Gepperth, Alexander, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Villmann, Thomas, editor, Kaden, Marika, editor, Geweniger, Tina, editor, and Schleif, Frank-Michael, editor
- Published
- 2024
37. Dynamic Growing and Shrinking of Neural Networks with Monte Carlo Tree Search
- Author
-
Świderski, Szymon, Jastrzȩbska, Agnieszka, Hartmanis, Juris, Founding Editor, van Leeuwen, Jan, Series Editor, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Kobsa, Alfred, Series Editor, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Nierstrasz, Oscar, Series Editor, Pandu Rangan, C., Editorial Board Member, Sudan, Madhu, Series Editor, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Weikum, Gerhard, Series Editor, Vardi, Moshe Y, Series Editor, Goos, Gerhard, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Franco, Leonardo, editor, de Mulatier, Clélia, editor, Paszynski, Maciej, editor, Krzhizhanovskaya, Valeria V., editor, Dongarra, Jack J., editor, and Sloot, Peter M. A., editor
- Published
- 2024
38. Learning Rate Scheduler for Multi-criterion Movie Recommender System
- Author
-
Airen, Sonu, Agrawal, Jitendra, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Senjyu, Tomonobu, editor, So–In, Chakchai, editor, and Joshi, Amit, editor
- Published
- 2024
39. A Study on Parallel Recommender System with Stream Data Using Stochastic Gradient Descent
- Author
-
Si, Thin Nguyen, Van Hung, Trong, Ngoc, Dat Vo, Le, Quan Ngo, Kacprzyk, Janusz, Series Editor, and Lee, Roger, editor
- Published
- 2024
40. Using Incremental Algorithm in Hybrid Recommender System Combined Sentiment Analysis
- Author
-
Si, Thin Nguyen, Van Hung, Trong, Kacprzyk, Janusz, Series Editor, and Lee, Roger, editor
- Published
- 2024
41. Probabilistic Guarantees of Stochastic Recursive Gradient in Non-convex Finite Sum Problems
- Author
-
Zhong, Yanjie, Li, Jiaqi, Lahiri, Soumendra, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Yang, De-Nian, editor, Xie, Xing, editor, Tseng, Vincent S., editor, Pei, Jian, editor, Huang, Jen-Wei, editor, and Lin, Jerry Chun-Wei, editor
- Published
- 2024
42. Protocol Anomaly Detection in IIoT
- Author
-
Prasanna, S. S., Emil Selvan, G. S. R., Ramkumar, M. P., Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Das, Prodipto, editor, Begum, Shahin Ara, editor, and Buyya, Rajkumar, editor
- Published
- 2024
43. Diabetes Risk Prediction Through Fine-Tuned Gradient Boosting
- Author
-
Rani, Pooja, Lamba, Rohit, Sachdeva, Ravi Kumar, Jain, Anurag, Choudhury, Tanupriya, Kotecha, Ketan, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Garg, Deepak, editor, Rodrigues, Joel J. P. C., editor, Gupta, Suneet Kumar, editor, Cheng, Xiaochun, editor, Sarao, Pushpender, editor, and Patel, Govind Singh, editor
- Published
- 2024
44. In the Shadow of RoBERTA: Is the Classical ML Drawing Its Last Breath in Sentiment Analysis?
- Author
-
Mojžiš, Ján, Kvassay, Marcel, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Silhavy, Radek, editor, and Silhavy, Petr, editor
- Published
- 2024
45. Shear capacity assessment of perforated steel plate shear wall based on the combination of verified finite element analysis, machine learning, and gene expression programming
- Author
-
Bypour, Maryam, Mahmoudian, Alireza, Tajik, Nima, Taleshi, Mostafa Mohammadzadeh, Mirghaderi, Seyed Rasoul, and Yekrangnia, Mohammad
- Published
- 2024
46. Convergence Rates for Stochastic Approximation: Biased Noise with Unbounded Variance, and Applications
- Author
-
Karandikar, Rajeeva Laxman and Vidyasagar, Mathukumalli
- Published
- 2024
47. Decentralised and privacy-preserving machine learning approach for distributed data resources
- Author
-
Alkhozae, Mona, Zeng, Xiaojun, and Chen, Ke
- Subjects
Stochastic Gradient Descent, Nonlinear Model Combination, Linear Model Combination, Stepwise Models Selection, Decentralised Machine Learning, Distributed Data Resources, Gossip Learning
- Abstract
Distributed machine learning has become a significant approach due to the high demand for distributed and large-scale data processing. However, some issues related to distributed machine learning over distributed data resources, including data transfer restrictions, privacy, and communication and computation costs, have not been properly addressed. It is therefore challenging to tackle these issues when developing a distributed learning method without sharing data between the distributed sites, centralising the distributed data resources for central learning, or using complicated learning methods. In this thesis, we addressed these issues by developing decentralised privacy-preserving learning approaches that allow distributed sites to use distributed data resources to construct global and local combined prediction models without sharing or moving distributed data to a centralised database, and without using a central location for iterative communication or computation. Furthermore, the information exchanged between distributed sites is restricted to trained local models and information about model performance, overcoming data restriction issues and privacy concerns and minimising data transfer costs. We focused on several model selection and combination strategies to achieve the optimal combined global and local models that maximise the combined models' predictive performance. We selected and combined the best models using linear and nonlinear combination methods, a stepwise model selection and combination method, and an approach based on all possible site-sequence combinations. The experimental evaluation conducted on different classification and regression datasets demonstrated that our approach performed comparably to or better than the centralised learning approach and other existing distributed learning methods on most datasets. Furthermore, we overcame data privacy concerns and server issues by avoiding data sharing or centralisation and by not using a server for iterative learning or for sharing intermediate model updates. This thesis contributes toward developing a simpler and effective machine learning approach and direction for decentralised privacy-preserving machine learning. It keeps data local to each site and combines diverse and accurate models, instead of using complicated methods that increase communication and computational overheads, without sacrificing predictive performance. Furthermore, it can be applied to large and distributed data resources that cannot be analysed in a single location, reduces coordination overhead for large-scale analyses, and reduces cost by avoiding the requirement for a powerful central server.
- Published
- 2023
48. Forecasting Energy Consumption in Smart Grids: A Comparative Analysis of Recurrent Neural Networks
- Author
-
Yasir Al-Haddad, Abdullahi Abdu İbrahim, and Rajaa Naeem
- Subjects
adams, energy forecasting, lstm, recurrent neural network, stochastic gradient descent, Technology
- Abstract
In the present era of smart grids, accurate prediction of energy use is becoming increasingly essential to guarantee optimal energy efficiency. This study contributes to the field by utilizing advanced machine learning techniques to perform predictions of energy consumption using data from Internet of Things (IoT) devices. Specifically, the approach utilizes recurrent neural network (RNN) structures, such as long short-term memory (LSTM) and gated recurrent units (GRUs). The data from IoT sensors are more extensive and detailed than those of conventional smart meters, allowing for the development of more complex models of energy use patterns. This study utilizes Adam-optimized LSTM, RNN, and GRU models, along with stochastic gradient descent, to evaluate their performance in addressing the complexity of time-series data in energy forecasting on different network configurations. Results of the analysis indicate that LSTM models run with the Adam optimizer are more accurate in terms of predictions compared with the other models. This conclusion is supported by the test results of these models, which have the lowest root mean square error and mean absolute error scores. All the models under analysis exhibit signs of overfitting based on the performance indicators for the training and testing data. This suggests that regularization should be utilized to ensure improved generalizability of the models. These findings show that deep learning can have a lasting influence in improving energy consumption management systems to meet sustainability and energy efficiency requirements. These observations are beneficial for the gradual improvement of smart grids.
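A minimal PyTorch version of such an LSTM forecaster is sketched below; the architecture, window length, and dummy data are assumptions for illustration and not the study's exact configuration:

```python
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    # Univariate sequence-to-one regressor for energy-consumption windows.
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                # x: (batch, seq_len, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])     # predict the next value

model = LSTMForecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # vs. a torch.optim.SGD baseline
x, y = torch.randn(32, 24, 1), torch.randn(32, 1)    # dummy batch: 24-step windows
loss = nn.MSELoss()(model(x), y)
opt.zero_grad(); loss.backward(); opt.step()
```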
- Published
- 2024
49. Multiple Object Detection-Based Machine Learning Techniques.
- Author
-
Hasan, Athraa S., Jianjun Yi, AlSabbagh, Haider M., and Liwei Chen
- Subjects
OBJECT recognition (Computer vision), MACHINE learning, K-nearest neighbor classification, COMPUTER vision, RANDOM forest algorithms, DECISION trees
- Abstract
Object detection has become faster and more precise due to improved computer vision systems. Many object detection systems have improved dramatically owing to the introduction of machine learning methods. This study incorporated cutting-edge methods for object detection to obtain high-quality results in a competitive timeframe comparable to human perception. Object-detecting systems often face poor performance issues. Therefore, this study proposed a comprehensive method to resolve the problems faced by object detection methods using six distinct machine learning approaches: stochastic gradient descent, logistic regression, random forest, decision trees, k-nearest neighbor, and naive Bayes. The system was trained using Common Objects in Context (COCO), the most challenging publicly available dataset. Notably, a yearly object detection challenge is held using COCO. The resulting technology is quick and precise, making it ideal for applications requiring an object detection accuracy of 97%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
50. Kinematic modeling and simultaneous calibration for acupuncture robot.
- Author
-
Zhang, Chi, Han, Yu, Liu, Wanquan, and Peng, Jianqing
- Subjects
ACUPUNCTURE, KINEMATIC chains, CALIBRATION, ROBOTS, MOBILE robots
- Abstract
• The calibration problem of the general acupuncture robot is analyzed.
• A simultaneous offline calibration method for problems defined with multiple closed kinematic chains is proposed.
• A multi-objective optimization model covering all coupling components and constraints is constructed and solved.
• A real-world platform with a customized calibration board is built.
• The proposed method has higher calibration accuracy and convergence speed.
Acupuncture robot is a new-era product combining traditional acupuncture and cutting-edge technology. The calibration of the vision system and the acupuncture mechanism is a crucial prerequisite for humanoid acupuncture control, which has not yet been explored. In this paper, a simultaneous offline calibration method is proposed for acupuncture robots. Analysis reveals that its calibration problem is defined with three closed kinematic chains, while the typical problems cover only a single chain. Decoupling them, a Kronecker product-based method is deduced to access the closed-form rotation component. As opposed to quaternion-based methods, it only experiences linear complexity in sign ambiguity, which can produce a more precise solution in finite time. Further, a simultaneous optimization model that encompasses all components is established, which can be solved with stochastic gradient descent-based methods. It is free of truncation errors and thus has higher calibration accuracy and convergence speed. Besides, the impairment in error propagation between different closed kinematic chains is mitigated compared to step-by-step methods. Finally, simulations and experiments are carried out. Notably, the proposed method can be easily extended to other robot calibration problems with multiple closed kinematic chains. [ABSTRACT FROM AUTHOR]
- Published
- 2024