Author: "Li, Qianxiao" / Topic: machine learning (cs.lg) - Searchworks@Jio Institute Digital Library Search Results

1. Inverse Approximation Theory for Nonlinear Recurrent Neural Networks

Author: Wang, Shida, Li, Zhong, and Li, Qianxiao
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, FOS: Mathematics, Dynamical Systems (math.DS), Mathematics - Dynamical Systems, Machine Learning (cs.LG)
Abstract: We prove an inverse approximation theorem for the approximation of nonlinear sequence-to-sequence relationships using RNNs. This is a so-called Bernstein-type result in approximation theory, which deduces properties of a target function under the assumption that it can be effectively approximated by a hypothesis space. In particular, we show that nonlinear sequence relationships, viewed as functional sequences, that can be stably approximated by RNNs with hardtanh/tanh activations must have an exponentially decaying memory structure -- a notion that can be made precise. This extends the previously identified curse of memory in linear RNNs into the general nonlinear setting, and quantifies the essential limitations of the RNN architecture for learning sequential relationships with long-term memory. Based on the analysis, we propose a principled reparameterization method to overcome the limitations. Our theoretical results are confirmed by numerical experiments.
Published: 2023

2. Forward and Inverse Approximation Theory for Linear Temporal Convolutional Networks

Author: Jiang, Haotian and Li, Qianxiao
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
Abstract: We present a theoretical analysis of the approximation properties of convolutional architectures when applied to the modeling of temporal sequences. Specifically, we prove an approximation rate estimate (Jackson-type result) and an inverse approximation theorem (Bernstein-type result), which together provide a comprehensive characterization of the types of sequential relationships that can be efficiently captured by a temporal convolutional architecture. The rate estimate improves upon a previous result via the introduction of a refined complexity measure, whereas the inverse approximation theorem is new.
Published: 2023

3. Approximation theory of transformer networks for sequence modeling

Author: Jiang, Haotian and Li, Qianxiao
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
Abstract: The transformer is a widely applied architecture in sequence modeling applications, but the theoretical understanding of its working principles is limited. In this work, we investigate the ability of transformers to approximate sequential relationships. We first prove a universal approximation theorem for the transformer hypothesis space. From its derivation, we identify a novel notion of regularity under which we can prove an explicit approximation rate estimate. This estimate reveals key structural properties of the transformer and suggests the types of sequence relationships that the transformer is adapted to approximating. In particular, it allows us to concretely discuss the structural bias between the transformer and classical sequence modeling methods, such as recurrent neural networks. Our findings are supported by numerical experiments.
Published: 2023
Full Text: View/download PDF

4. A Brief Survey on the Approximation Theory for Sequence Modelling

Author: Jiang, Haotian, Li, Qianxiao, Li, Zhong, and Wang, Shida
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
Abstract: We survey current developments in the approximation theory of sequence modelling in machine learning. Particular emphasis is placed on classifying existing results for various model architectures through the lens of classical approximation paradigms, and the insights one can gain from these results. We also outline some future research directions towards building a theory of sequence modelling.
Published: 2023
Full Text: View/download PDF

5. A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms

Author: Doncevic, Danimir, Mitsos, Alexander, Guo, Yue, Li, Qianxiao, Dietrich, Felix, Dahmen, Manuel, and Kevrekidis, Ioannis G.
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer and information sciences [FOS], FOS: Mathematics, Numerical Analysis (math.NA), Mathematics - Numerical Analysis, Mathematics [FOS], Machine Learning (cs.LG)
Abstract: Meta-learning of numerical algorithms for a given task consists of the data-driven identification and adaptation of an algorithmic structure and the associated hyperparameters. To limit the complexity of the meta-learning problem, neural architectures with a certain inductive bias towards favorable algorithmic structures can, and should, be used. We generalize our previously introduced Runge-Kutta neural network to a recursively recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms. In contrast to off-the-shelf deep learning approaches, it features a distinct division into modules for generation of information and for the subsequent assembly of this information towards a solution. Local information in the form of a subspace is generated by subordinate, inner, iterations of recurrent function evaluations starting at the current outer iterate. The update to the next outer iterate is computed as a linear combination of these evaluations, reducing the residual in this space, and constitutes the output of the network. We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields iterations similar to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta integrators for ordinary differential equations. Due to its modularity, the superstructure can be readily extended with functionalities needed to represent more general classes of iterative algorithms traditionally based on Taylor series expansions.
Comment: manuscript (22 pages, 9 figures), supporting information (11 pages, 9 figures)
Published: 2022
Full Text: View/download PDF

6. Amata: An Annealing Mechanism for Adversarial Training Acceleration

Author: Ye, Nanyang, Li, Qianxiao, Zhou, Xiao-Yun, and Zhu, Zhanxing
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, General Medicine, Machine Learning (cs.LG)
Abstract: Despite their empirical success in various domains, deep neural networks have been shown to be vulnerable to maliciously perturbed input data that substantially degrade their performance. Such perturbations are known as adversarial attacks. To counter adversarial attacks, adversarial training, formulated as a form of robust optimization, has been demonstrated to be effective. However, adversarial training incurs substantial computational overhead compared with standard training. To reduce this overhead, we propose an annealing mechanism, Amata. The proposed Amata is provably convergent, well motivated through the lens of optimal control theory, and can be combined with existing acceleration methods to further enhance performance. It is demonstrated that on standard datasets, Amata can achieve similar or better robustness with around 1/3 to 1/2 the computational time compared with traditional methods. In addition, Amata can be incorporated into other adversarial training acceleration algorithms (e.g. YOPO, Free, Fast, and ATTA), which leads to further reduction in computational time on large-scale problems.
Published: 2021
Full Text: View/download PDF

7. Principled Acceleration of Iterative Numerical Methods Using Machine Learning

Author: Arisaka, Sohei and Li, Qianxiao
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, FOS: Mathematics, Numerical Analysis (math.NA), Mathematics - Numerical Analysis, 65M06, 68T99, 68U20, Machine Learning (cs.LG)
Abstract: Iterative methods are ubiquitous in large-scale scientific computing applications, and a number of approaches based on meta-learning have been recently proposed to accelerate them. However, a systematic study of these approaches and how they differ from meta-learning is lacking. In this paper, we propose a framework to analyze such learning-based acceleration approaches, where one can immediately identify a departure from classical meta-learning. We show that this departure may lead to arbitrary deterioration of model performance. Based on our analysis, we introduce a novel training method for learning-based acceleration of iterative methods. Furthermore, we theoretically prove that the proposed method improves upon the existing methods, and demonstrate its significant advantage and versatility through various numerical applications.
Published: 2022

8. Tackling Data Scarcity with Transfer Learning: A Case Study of Thickness Characterization from Optical Spectra of Perovskite Thin Films

Author: Tian, Siyu Isaac Parker, Ren, Zekun, Venkataraj, Selvaraj, Cheng, Yuanhang, Bash, Daniil, Oviedo, Felipe, Senthilnath, J., Chellappan, Vijila, Lim, Yee-Fun, Aberle, Armin G., MacLeod, Benjamin P, Parlane, Fraser G. L., Berlinguette, Curtis P., Li, Qianxiao, Buonassisi, Tonio, and Liu, Zhe
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Condensed Matter - Materials Science, Image and Video Processing (eess.IV), FOS: Electrical engineering, electronic engineering, information engineering, Materials Science (cond-mat.mtrl-sci), FOS: Physical sciences, Electrical Engineering and Systems Science - Image and Video Processing, Physics - Optics, Machine Learning (cs.LG), Optics (physics.optics)
Abstract: Transfer learning is increasingly becoming an important tool for handling the data scarcity often encountered in machine learning. In the application of high-throughput thickness characterization as a downstream process of the high-throughput optimization of optoelectronic thin films with autonomous workflows, data scarcity occurs especially for new materials. To achieve high-throughput thickness characterization, we propose a machine learning model called thicknessML that predicts thickness from UV-Vis spectrophotometry input, together with an overarching transfer learning workflow. We demonstrate the transfer learning workflow from a generic source domain of generic band-gapped materials to a specific target domain of perovskite materials, where the target domain data come only from a limited number (18) of refractive indices from the literature. The target domain can be easily extended to other material classes with a few literature data. Defining thickness prediction accuracy as within 10% deviation, thicknessML achieves 92.2% (with a deviation of 3.6%) accuracy with transfer learning compared to 81.8% (with a deviation of 11.7%) without (lower mean and larger standard deviation). Experimental validation on six deposited perovskite films also corroborates the efficacy of the proposed workflow by yielding a 10.5% mean absolute percentage error (MAPE).
Published: 2022
Full Text: View/download PDF

9. Self-Healing Robust Neural Networks via Closed-Loop Control

Author: Chen, Zhuotong, Li, Qianxiao, and Zhang, Zheng
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (cs.LG)
Abstract: Despite the wide applications of neural networks, there have been increasing concerns about their vulnerability issue. While numerous attack and defense techniques have been developed, this work investigates the robustness issue from a new angle: can we design a self-healing neural network that can automatically detect and fix the vulnerability issue by itself? A typical self-healing mechanism is the immune system of a human body. This biology-inspired idea has been used in many engineering designs but is rarely investigated in deep learning. This paper considers the post-training self-healing of a neural network, and proposes a closed-loop control formulation to automatically detect and fix the errors caused by various attacks or perturbations. We provide a margin-based analysis to explain how this formulation can improve the robustness of a classifier. To speed up the inference of the proposed self-healing network, we solve the control problem via improving the Pontryagin Maximum Principle-based solver. Lastly, we present an error estimation of the proposed framework for neural networks with nonlinear activation functions. We validate the performance on several network architectures against various perturbations. Since the self-healing method does not need a-priori information about data perturbations/attacks, it can handle a broad class of unforeseen perturbations.
Comment: 48 pages, 5 figures
Published: 2022
Full Text: View/download PDF

10. Deep Neural Network Approximation of Invariant Functions through Dynamical Systems

Author: Li, Qianxiao, Lin, Ting, and Shen, Zuowei
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, Dynamical Systems (math.DS), Mathematics - Dynamical Systems, Mathematics - Optimization and Control, Machine Learning (cs.LG)
Abstract: We study the approximation of functions which are invariant with respect to certain permutations of the input indices using flow maps of dynamical systems. Such invariant functions include the much studied translation-invariant ones involving image tasks, but also encompass many permutation-invariant functions that find emerging applications in science and engineering. We prove sufficient conditions for universal approximation of these functions by a controlled equivariant dynamical system, which can be viewed as a general abstraction of deep residual networks with symmetry constraints. These results not only imply the universal approximation for a variety of commonly employed neural network architectures for symmetric function approximation, but also guide the design of architectures with approximation guarantees for applications involving new symmetry requirements.
Published: 2022
Full Text: View/download PDF

11. Towards Robust Neural Networks via Close-loop Control

Author: Chen, Zhuotong, Li, Qianxiao, and Zhang, Zheng
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
Abstract: Despite their success in massive engineering applications, deep neural networks are vulnerable to various perturbations due to their black-box nature. Recent studies have shown that a deep neural network can misclassify the data even if the input data is perturbed by an imperceptible amount. In this paper, we address the robustness issue of neural networks by a novel close-loop control method from the perspective of dynamic systems. Instead of modifying the parameters in a fixed neural network architecture, a close-loop control process is added to generate control signals adaptively for the perturbed or corrupted data. We connect the robustness of neural networks with optimal control using the geometrical information of underlying data to design the control objective. The detailed analysis shows how the embedding manifolds of state trajectory affect error estimation of the proposed method. Our approach can simultaneously maintain the performance on clean data and improve the robustness against many types of data perturbations. It can also further improve the performance of robustly trained neural networks against different perturbations. To the best of our knowledge, this is the first work that improves the robustness of neural networks with close-loop control.
Comment: Published as a conference paper at ICLR 2021
Published: 2021
Full Text: View/download PDF

12. Approximation Theory of Convolutional Architectures for Time Series Modelling

Author: Jiang, Haotian, Li, Zhong, and Li, Qianxiao
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, I.2.6, Machine Learning (stat.ML), 68W25, 68T07, 37M10, Machine Learning (cs.LG)
Abstract: We study the approximation properties of convolutional architectures applied to time series modelling, which can be formulated mathematically as a functional approximation problem. In the recurrent setting, recent results reveal an intricate connection between approximation efficiency and memory structures in the data generation process. In this paper, we derive parallel results for convolutional architectures, with WaveNet being a prime example. Our results reveal that in this new setting, approximation efficiency is not only characterised by memory, but also additional fine structures in the target relationship. This leads to a novel definition of spectrum-based regularity that measures the complexity of temporal relationships under the convolutional approximation scheme. These analyses provide a foundation to understand the differences between architectural choices for time series modelling and can give theoretically grounded guidance for practical applications.
Comment: Published version
Published: 2021
Full Text: View/download PDF

13. Collaborative Inference for Efficient Remote Monitoring

Author: Zhang, Chi, Soh, Yong Sheng, Feng, Ling, Zhou, Tianyi, and Li, Qianxiao
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: While current machine learning models have impressive performance over a wide range of applications, their large size and complexity render them unsuitable for tasks such as remote monitoring on edge devices with limited storage and computational power. A naive approach to resolve this on the model level is to use simpler architectures, but this sacrifices prediction accuracy and is unsuitable for monitoring applications requiring accurate detection of the onset of adverse events. In this paper, we propose an alternative solution to this problem by decomposing the predictive model as the sum of a simple function which serves as a local monitoring tool, and a complex correction term to be evaluated on the server. A sign requirement is imposed on the latter to ensure that the local monitoring function is safe, in the sense that it can effectively serve as an early warning system. Our analysis quantifies the trade-offs between model complexity and performance, and serves as a guidance for architecture design. We validate our proposed framework on a series of monitoring experiments, where we succeed at learning monitoring models with significantly reduced complexity that minimally violate the safety requirement. More broadly, our framework is useful for learning classifiers in applications where false negatives are significantly more costly compared to false positives.
Published: 2020

14. A Data Driven Method for Computing Quasipotentials

Author: Lin, Bo, Li, Qianxiao, and Ren, Weiqing
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Nuclear Theory, FOS: Mathematics, Dynamical Systems (math.DS), Mathematics - Dynamical Systems, Machine Learning (cs.LG)
Abstract: The quasipotential is a natural generalization of the concept of energy functions to non-equilibrium systems. In the analysis of rare events in stochastic dynamics, it plays a central role in characterizing the statistics of transition events and the likely transition paths. However, computing the quasipotential is challenging, especially in high dimensional dynamical systems where a global landscape is sought. Traditional methods based on the dynamic programming principle or path space minimization tend to suffer from the curse of dimensionality. In this paper, we propose a simple and efficient machine learning method to resolve this problem. The key idea is to learn an orthogonal decomposition of the vector field that drives the dynamics, from which one can identify the quasipotential. We demonstrate on various example systems that our method can effectively compute quasipotential landscapes without requiring spatial discretization or solving path-space optimization problems. Moreover, the method is purely data driven in the sense that only observed trajectories of the dynamics are required for the computation of the quasipotential. These properties make it a promising method to enable the general application of quasipotential analysis to dynamical systems away from equilibrium.
Published: 2020
Full Text: View/download PDF

15. On the Curse of Memory in Recurrent Neural Networks: Approximation and Optimization Analysis

Author: Li, Zhong, Han, Jiequn, E, Weinan, and Li, Qianxiao
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Optimization and Control (math.OC), I.2.6, FOS: Mathematics, Machine Learning (stat.ML), 68W25, 68T07, 37M10, Mathematics - Optimization and Control, Machine Learning (cs.LG)
Abstract: We study the approximation properties and optimization dynamics of recurrent neural networks (RNNs) when applied to learn input-output relationships in temporal data. We consider the simple but representative setting of using continuous-time linear RNNs to learn from data generated by linear relationships. Mathematically, the latter can be understood as a sequence of linear functionals. We prove a universal approximation theorem of such linear functionals, and characterize the approximation rate and its relation with memory. Moreover, we perform a fine-grained dynamical analysis of training linear RNNs, which further reveals the intricate interactions between memory and learning. A unifying theme uncovered is the non-trivial effect of memory, a notion that can be made precise in our framework, on approximation and optimization: when there is long-term memory in the target, it takes a large number of neurons to approximate it. Moreover, the training process will suffer from slowdowns. In particular, both of these effects become exponentially more pronounced with memory - a phenomenon we call the "curse of memory". These analyses represent a basic step towards a concrete mathematical understanding of new phenomena that may arise in learning temporal relationships using recurrent architectures.
Comment: Published version
Published: 2020
Full Text: View/download PDF
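
Illustrative note (not part of the catalog record): the link between linear RNNs and exponentially decaying memory mentioned in the abstract above can be sketched numerically. The sketch assumes the standard continuous-time linear RNN form h'(t) = W h(t) + U x(t), y(t) = c^T h(t), whose output is the convolution of the input with the kernel rho(s) = c^T exp(W s) U, and it further assumes a diagonal, stable W. The function names and the numerical values are hypothetical and are not taken from the paper.

```python
import math

def memory_kernel(decay_rates, weights, s):
    """Kernel rho(s) = sum_i weights[i] * exp(-decay_rates[i] * s).

    Assumption: W is diagonal with eigenvalues -decay_rates[i] < 0,
    and weights[i] stands for the product c_i * u_i.
    """
    return sum(w * math.exp(-lam * s) for lam, w in zip(decay_rates, weights))

def exponential_bound(decay_rates, weights, s):
    """Upper bound (sum_i |weights[i]|) * exp(-min(decay_rates) * s), valid for s >= 0."""
    return sum(abs(w) for w in weights) * math.exp(-min(decay_rates) * s)

# Hypothetical example values, chosen only for illustration.
decay_rates = [0.5, 1.0, 3.0]
weights = [1.0, -2.0, 0.7]

for s in [0.0, 1.0, 5.0, 10.0]:
    rho = memory_kernel(decay_rates, weights, s)
    bound = exponential_bound(decay_rates, weights, s)
    print(f"s={s:5.1f}  rho={rho: .6f}  bound={bound:.6f}")
```

Under these assumptions the kernel is always dominated by a decaying exponential, so a target whose memory decays more slowly can only be matched by pushing some decay rate towards zero or by adding more hidden units.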

16. Optimization in Machine Learning: A Distribution Space Approach

Author: Cai, Yongqiang, Li, Qianxiao, and Shen, Zuowei
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: We present the viewpoint that optimization problems encountered in machine learning can often be interpreted as minimizing a convex functional over a function space, but with a non-convex constraint set introduced by model parameterization. This observation allows us to repose such problems via a suitable relaxation as convex optimization problems in the space of distributions over the training parameters. We derive some simple relationships between the distribution-space problem and the original problem, e.g. a distribution-space solution is at least as good as a solution in the original space. Moreover, we develop a numerical algorithm based on mixture distributions to perform approximate optimization directly in distribution space. Consistency of this approximation is established and the numerical efficacy of the proposed algorithm is illustrated on simple examples. In both theory and practice, this formulation provides an alternative approach to large-scale optimization in machine learning.
Comment: 26 pages, 12 figures
Published: 2020
Full Text: View/download PDF

17. Deep Learning via Dynamical Systems: An Approximation Perspective

Author: Li, Qianxiao, Lin, Ting, and Shen, Zuowei
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Optimization and Control (math.OC), Statistics - Machine Learning, Applied Mathematics, General Mathematics, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
Abstract: We build on the dynamical systems approach to deep learning, where deep residual networks are idealized as continuous-time dynamical systems, from the approximation perspective. In particular, we establish general sufficient conditions for universal approximation using continuous-time deep residual networks, which can also be understood as approximation theories in $L^p$ using flow maps of dynamical systems. In specific cases, rates of approximation in terms of the time horizon are also established. Overall, these results reveal that composition function approximation through flow maps presents a new paradigm in approximation theory and contributes to building a useful mathematical framework to investigate deep learning.
Comment: Revision 1
Published: 2019

18. An Optimal Control Approach to Deep Learning and Applications to Discrete-Weight Neural Networks

Author: Li, Qianxiao and Hao, Shuji
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Machine Learning (cs.LG)
Abstract: Deep learning is formulated as a discrete-time optimal control problem. This allows one to characterize necessary conditions for optimality and develop training algorithms that do not rely on gradients with respect to the trainable parameters. In particular, we introduce the discrete-time method of successive approximations (MSA), which is based on Pontryagin's maximum principle, for training neural networks. A rigorous error estimate for the discrete MSA is obtained, which sheds light on its dynamics and the means to stabilize the algorithm. The developed methods are applied to train, in a rather principled way, neural networks with weights that are constrained to take values in a discrete set. We obtain competitive performance and, interestingly, very sparse weights in the case of ternary networks, which may be useful in model deployment in low-memory devices.
Published: 2018
Full Text: View/download PDF

19. Maximum Principle Based Algorithms for Deep Learning

Author: Li, Qianxiao, Chen, Long, Tai, Cheng, and E, Weinan
Subjects: FOS: Computer and information sciences, Computer Science - Learning, 68T05, 49D, 49M, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: The continuous dynamical system approach to deep learning is explored in order to devise alternative frameworks for training algorithms. Training is recast as a control problem and this allows us to formulate necessary optimality conditions in continuous time using Pontryagin's maximum principle (PMP). A modification of the method of successive approximations is then used to solve the PMP, giving rise to an alternative training algorithm for deep learning. This approach has the advantage that rigorous error estimates and convergence results can be established. We also show that it may avoid some pitfalls of gradient-based methods, such as slow convergence on flat landscapes near saddle points. Furthermore, we demonstrate that it obtains a favorable initial convergence rate per iteration, provided Hamiltonian maximization can be efficiently carried out - a step which is still in need of improvement. Overall, the approach opens up new avenues to attack problems associated with deep learning, such as trapping in slow manifolds and inapplicability of gradient-based methods for discrete trainable variables.
Comment: Published version
Published: 2017
Full Text: View/download PDF

20. Stochastic modified equations and adaptive stochastic gradient algorithms

Author: Li, Qianxiao, Tai, Cheng, and E, Weinan
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Statistics - Machine Learning, MathematicsofComputing_NUMERICALANALYSIS, Machine Learning (stat.ML), 68W20, Machine Learning (cs.LG)
Abstract: We develop the method of stochastic modified equations (SME), in which stochastic gradient algorithms are approximated in the weak sense by continuous-time stochastic differential equations. We exploit the continuous formulation together with optimal control theory to derive novel adaptive hyper-parameter adjustment policies. Our algorithms have competitive performance with the added benefit of being robust to varying models and datasets. This provides a general methodology for the analysis and design of stochastic gradient algorithms.
Comment: Major changes including a proof of the weak approximation, asymptotic expansions and application-oriented adaptive algorithms
Published: 2015
Full Text: View/download PDF
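
Illustrative note (not part of the catalog record): the idea of approximating a stochastic gradient algorithm by a stochastic differential equation can be checked on a toy problem where both sides have closed forms. The sketch assumes the objective f(x) = x^2 / 2 with additive gradient noise of standard deviation sigma, so that SGD reads x_{k+1} = (1 - eta) x_k - eta sigma xi_k, and it assumes the first-order modified equation takes the form dX = -X dt + sqrt(eta) sigma dW. The toy objective, the function names and the parameter values are assumptions made for illustration and are not taken from the paper.

```python
import math

def sgd_variance(eta, sigma, steps, v0=0.0):
    """Exact variance of x_k under x_{k+1} = (1 - eta) * x_k - eta * sigma * xi_k,
    where the xi_k are independent with zero mean and unit variance."""
    v = v0
    for _ in range(steps):
        v = (1.0 - eta) ** 2 * v + (eta * sigma) ** 2
    return v

def sme_variance(eta, sigma, t, v0=0.0):
    """Variance at time t of dX = -X dt + sqrt(eta) * sigma dW,
    an Ornstein-Uhlenbeck process with stationary variance eta * sigma**2 / 2."""
    stationary = eta * sigma ** 2 / 2.0
    return stationary + (v0 - stationary) * math.exp(-2.0 * t)

# Hypothetical settings: k SGD steps correspond to continuous time t = k * eta.
eta, sigma, steps = 0.01, 1.0, 1000
v_sgd = sgd_variance(eta, sigma, steps)
v_sme = sme_variance(eta, sigma, t=steps * eta)
print(f"SGD variance after {steps} steps: {v_sgd:.6f}")
print(f"SME variance at t = {steps * eta:.1f}: {v_sme:.6f}")
print(f"relative gap: {abs(v_sgd - v_sme) / v_sme:.4f}")
```

In this toy setting the two variances agree up to a relative gap that shrinks as the learning rate eta decreases, which is the kind of agreement a weak approximation is meant to capture.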

20 results on '"Li, Qianxiao"'
