Probably Approximately Correct Learning in Adversarial Environments With Temporal Logic Specifications
- Author
- Wen, Min and Topcu, Ufuk
- Subjects
- MACHINE learning, REINFORCEMENT learning, REWARD (Psychology), CLASSROOM environment, LOGIC, PICTURE archiving & communication systems
- Abstract
Reinforcement learning (RL) algorithms have been used to learn how to perform tasks in uncertain and partially unknown environments. In practice, environments are usually uncontrolled and may affect task performance in an adversarial way. In this article, we model the interaction between an RL agent and its potentially adversarial environment as a turn-based zero-sum stochastic game. The task requirements are represented both qualitatively, as a subset of linear temporal logic (LTL) specifications, and quantitatively, as a reward function. For each case in which the LTL specification is realizable and can be equivalently transformed into a deterministic Büchi automaton, we show that there always exists a memoryless almost-sure winning strategy that is $\varepsilon$-optimal for the discounted-sum objective for any positive $\varepsilon$. We propose a probably approximately correct (PAC) learning algorithm that learns such a strategy efficiently in an online manner with a priori unknown reward functions and unknown transition distributions. To the best of our knowledge, this is the first result on PAC learning in stochastic games with independent quantitative and qualitative objectives. [ABSTRACT FROM AUTHOR]
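The abstract's core model, a turn-based zero-sum stochastic game with a discounted-sum objective, can be illustrated with a minimal sketch. This is not the paper's PAC learning algorithm (which operates with unknown rewards and transitions); it is a toy value-iteration example on a hypothetical three-state game, where the agent maximizes and the adversarial environment minimizes the discounted return. All state names, rewards, and the discount factor below are made-up assumptions for illustration.

```python
# Toy turn-based zero-sum stochastic game (hypothetical example, not from the paper).
# Each state belongs to one player; transitions map an action to a list of
# (probability, next_state, reward) outcomes.
GAMMA = 0.9  # assumed discount factor

GAME = {
    "s0": ("agent", {                       # agent-controlled state
        "a": [(1.0, "s1", 1.0)],
        "b": [(0.5, "s1", 0.0), (0.5, "s2", 2.0)],
    }),
    "s1": ("env", {                         # environment (adversary) state
        "x": [(1.0, "s0", 0.0)],
        "y": [(1.0, "s2", -1.0)],
    }),
    "s2": ("agent", {
        "a": [(1.0, "s2", 0.0)],            # absorbing sink
    }),
}

def value_iteration(game, gamma=GAMMA, iters=500):
    """Compute the discounted-sum game value: the agent maximizes over its
    actions, the adversary minimizes, and chance is averaged out."""
    v = {s: 0.0 for s in game}
    for _ in range(iters):
        new_v = {}
        for s, (player, actions) in game.items():
            q_values = [
                sum(p * (r + gamma * v[t]) for p, t, r in outcomes)
                for outcomes in actions.values()
            ]
            new_v[s] = max(q_values) if player == "agent" else min(q_values)
        v = new_v
    return v

values = value_iteration(GAME)
```

In this toy game the adversary at `s1` prefers the punishing move `y`, so the agent's best reply at `s0` is the hedged action `b`; value iteration converges to the fixed point of the max/min Bellman operator. The paper's setting layers a deterministic Büchi automaton (tracking the LTL specification) on top of such a game and restricts to almost-sure winning strategies, which the sketch omits.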
- Published
- 2022