652 results on '"Numerical linear algebra"'
Search Results
2. MAGMA: Enabling exascale performance with accelerated BLAS and LAPACK for diverse GPU architectures.
- Author
-
Abdelfattah, Ahmad, Beams, Natalie, Carson, Robert, Ghysels, Pieter, Kolev, Tzanio, Stitt, Thomas, Vargas, Arturo, Tomov, Stanimire, and Dongarra, Jack
- Subjects
- *
NUMERICAL solutions for linear algebra , *MATRICES (Mathematics) , *LINEAR algebra , *LANDSCAPE architecture , *MAGMAS - Abstract
MAGMA (Matrix Algebra for GPU and Multicore Architectures) is a pivotal open-source library in the landscape of GPU-enabled dense and sparse linear algebra computations. With a repertoire of approximately 750 numerical routines across four precisions, MAGMA is deeply ingrained in the DOE software stack, playing a crucial role in high-performance computing. Notable projects such as ExaConstit, HiOP, MARBL, and STRUMPACK, among others, directly harness the capabilities of MAGMA. In addition, the MAGMA development team has been acknowledged multiple times for contributing to the vendors' numerical software stacks. Looking back over the time of the Exascale Computing Project (ECP), we highlight how MAGMA has adapted to recent changes in modern HPC systems, especially the growing gap between CPU and GPU compute capabilities, as well as the introduction of low precision arithmetic in modern GPUs. We also describe MAGMA's direct impact on several ECP projects. Maintaining portable performance across NVIDIA and AMD GPUs, and with current efforts toward supporting Intel GPUs, MAGMA ensures its adaptability and relevance in the ever-evolving landscape of GPU architectures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
3. Randomized low‐rank approximation of parameter‐dependent matrices.
- Author
-
Kressner, Daniel and Lam, Hei Yin
- Abstract
This work considers the low‐rank approximation of a matrix A(t)$$ A(t) $$ depending on a parameter t$$ t $$ in a compact set D⊂ℝd$$ D\subset {\mathbb{R}}^d $$. Application areas that give rise to such problems include computational statistics and dynamical systems. Randomized algorithms are an increasingly popular approach for performing low‐rank approximation and they usually proceed by multiplying the matrix with random dimension reduction matrices (DRMs). Applying such algorithms directly to A(t)$$ A(t) $$ would involve different, independent DRMs for every t$$ t $$, which is not only expensive but also leads to inherently non‐smooth approximations. In this work, we propose to use constant DRMs, that is, A(t)$$ A(t) $$ is multiplied with the same DRM for every t$$ t $$. The resulting parameter‐dependent extensions of two popular randomized algorithms, the randomized singular value decomposition and the generalized Nyström method, are computationally attractive, especially when A(t)$$ A(t) $$ admits an affine linear decomposition with respect to t$$ t $$. We perform a probabilistic analysis for both algorithms, deriving bounds on the expected value as well as failure probabilities for the approximation error when using Gaussian random DRMs. Both, the theoretical results and numerical experiments, show that the use of constant DRMs does not impair their effectiveness; our methods reliably return quasi‐best low‐rank approximations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. GROWTH FACTORS OF ORTHOGONAL MATRICES AND LOCAL BEHAVIOR OF GAUSSIAN ELIMINATION WITH PARTIAL AND COMPLETE PIVOTING.
- Author
-
PECA-MEDLIN, JOHN
- Subjects
- *
NUMERICAL solutions for linear algebra , *GAUSSIAN elimination , *GROWTH factors , *LINEAR systems , *FACTOR analysis - Abstract
Gaussian elimination (GE) is the most used dense linear solver. Error analysis of GE with selected pivoting strategies on well-conditioned systems can focus on studying the behavior of growth factors. Although exponential growth is possible with GE with partial pivoting (GEPP), growth tends to stay much smaller in practice. Support for this behavior was provided recently by Huang and Tikhomirov's average-case analysis of GEPP, which showed GEPP growth factors for Gaussian matrices stay at most polynomial with very high probability. GE with complete pivoting (GECP) has also seen a lot of recent interest, with improvements to both lower and upper bounds on worst-case GECP growth provided by Bisain, Edelman, and Urschel in 2023. We are interested in studying how GEPP and GECP behave on the same linear systems as well as studying large growth on particular subclasses of matrices, including orthogonal matrices. Moreover, as a means to better address the question of why large growth is rarely encountered, we further study matrices with a large difference in growth between using GEPP and GECP, and we explore how the smaller growth strategy dominates behavior in a small neighborhood of the initial matrix. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. On the Sample Complexity of Stabilizing Linear Dynamical Systems from Data.
- Author
-
Werner, Steffen W. R. and Peherstorfer, Benjamin
- Subjects
- *
LINEAR dynamical systems , *DYNAMICAL systems , *SCIENCE education , *NUMERICAL solutions for linear algebra - Abstract
Learning controllers from data for stabilizing dynamical systems typically follows a two-step process of first identifying a model and then constructing a controller based on the identified model. However, learning models means identifying generic descriptions of the dynamics of systems, which can require large amounts of data and extracting information that are unnecessary for the specific task of stabilization. The contribution of this work is to show that if a linear dynamical system has dimension (McMillan degree) n , then there always exist n states from which a stabilizing feedback controller can be constructed, independent of the dimension of the representation of the observed states and the number of inputs. By building on previous work, this finding implies that any linear dynamical system can be stabilized from fewer observed states than the minimal number of states required for learning a model of the dynamics. The theoretical findings are demonstrated with numerical experiments that show the stabilization of the flow behind a cylinder from less data than necessary for learning a model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. FAST AND ACCURATE RANDOMIZED ALGORITHMS FOR LINEAR SYSTEMS AND EIGENVALUE PROBLEMS.
- Author
-
YUJI NAKATSUKASA and TROPP, JOEL A.
- Subjects
- *
LINEAR systems , *NUMERICAL solutions for linear algebra , *EIGENVALUES - Abstract
This paper develops a class of algorithms for general linear systems and eigenvalue problems. These algorithms apply fast randomized dimension reduction ("sketching") to accelerate standard subspace projection methods, such as GMRES and Rayleigh-Ritz. This modification makes it possible to incorporate nontraditional bases for the approximation subspace that are easier to construct. When the basis is numerically full rank, the new algorithms have accuracy similar to classic methods but run faster and may use less storage. For model problems, numerical experiments show large advantages over the optimized MATLAB routines, including a 70x speedup over gmres and a 10x speedup over eigs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. ADAPTIVE PRECISION SPARSE MATRIX-VECTOR PRODUCT AND ITS APPLICATION TO KRYLOV SOLVERS.
- Author
-
GRAILLAT, STEF, JÉZÉQUEL, FABIENNE, MARY, THEO, and MOLINA, ROMÉO
- Subjects
- *
SPARSE matrices , *NUMERICAL solutions for linear algebra , *COMPUTER algorithms , *FLOATING-point arithmetic , *LINEAR systems , *ADAPTIVE control systems - Abstract
We introduce a mixed precision algorithm for computing sparse matrix-vector products and use it to accelerate the solution of sparse linear systems by iterative methods. Our approach is based on the idea of adapting the precision of each matrix element to their magnitude: we split the elements into buckets and use progressively lower precisions for the buckets of progressively smaller elements. We carry out a rounding error analysis of this algorithm that provides us with an explicit rule to decide which element goes into which bucket and allows us to rigorously control the accuracy of the algorithm. We implement the algorithm on a multicore computer and obtain significant speedups (up to a factor 7\times) with respect to uniform precision algorithms, without loss of accuracy, on a range of sparse matrices from real-life applications. We showcase the effectiveness of our algorithm by plugging it into various Krylov solvers for sparse linear systems and observe that the convergence of the solution is essentially unaffected by the use of adaptive precision. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. COMMUNICATION AVOIDING BLOCK LOW-RANK PARALLEL MULTIFRONTAL TRIANGULAR SOLVE WITH MANY RIGHT-HAND SIDES.
- Author
-
AMESTOY, PATRICK, BOITEAU, OLIVIER, BUTTARI, ALFREDO, GEREST, MATTHIEU, JÉZÉQUEL, FABIENNE, LÉXCELLENT, JEAN-YVES, and MARY, THEO
- Subjects
- *
NUMERICAL solutions for linear algebra , *PATTERNS (Mathematics) , *SPARSE matrices , *LOW-rank matrices - Abstract
Block low-rank (BLR) compression can significantly reduce the memory and time costs of parallel sparse direct solvers. In this paper, we investigate the performance of the BLR triangular solve phase, which we observe to be underwhelming when dealing with many right-hand sides (RHS). We explain that this is because the bottleneck of the triangular solve is not in accessing the BLR LU factors, but rather in accessing the RHS, which are uncompressed. Motivated by this finding, we propose several new hybrid variants, which combine the right-looking and left-looking communication patterns to minimize the number of accesses to the RHS. We confirm via a theoretical analysis that these new variants can significantly reduce the total communication volume. We assess the impact of this reduction on the time performance on a range of real-life applications using the MUMPS solver, obtaining up to 20% time reduction. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Numerical Linear Algebra for the Two-Dimensional Bertozzi–Esedoglu–Gillette–Cahn–Hilliard Equation in Image Inpainting.
- Author
-
Awad, Yahia, Fakih, Hussein, and Alkhezi, Yousuf
- Subjects
- *
NUMERICAL solutions for linear algebra , *INPAINTING , *LINEAR algebra , *EQUATIONS , *FINITE difference method - Abstract
In this paper, we present a numerical linear algebra analytical study of some schemes for the Bertozzi–Esedoglu–Gillette–Cahn–Hilliard equation. Both 1D and 2D finite difference discretizations in space are proposed with semi-implicit and implicit discretizations on time. We prove that our proposed numerical solutions converge to continuous solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
10. Prior-Preconditioned Conjugate Gradient Method for Accelerated Gibbs Sampling in "Large n, Large p" Bayesian Sparse Regression.
- Author
-
Nishimura, Akihiko and Suchard, Marc A.
- Subjects
- *
CONJUGATE gradient methods , *GIBBS sampling , *MARKOV chain Monte Carlo , *GAUSSIAN distribution , *NUMERICAL solutions for linear algebra , *LINEAR systems , *ACCELERATED life testing - Abstract
In a modern observational study based on healthcare databases, the number of observations and of predictors typically range in the order of 105–106 and of 104–105. Despite the large sample size, data rarely provide sufficient information to reliably estimate such a large number of parameters. Sparse regression techniques provide potential solutions, one notable approach being the Bayesian method based on shrinkage priors. In the "large n and large p" setting, however, the required posterior computation encounters a bottleneck at repeated sampling from a high-dimensional Gaussian distribution, whose precision matrix Φ is expensive to compute and factorize. In this article, we present a novel algorithm to speed up this bottleneck based on the following observation: We can cheaply generate a random vector b such that the solution to the linear system Φ β = b has the desired Gaussian distribution. We can then solve the linear system by the conjugate gradient (CG) algorithm through matrix-vector multiplications by Φ ; this involves no explicit factorization or calculation of Φ itself. Rapid convergence of CG in this context is guaranteed by the theory of prior-preconditioning we develop. We apply our algorithm to a clinically relevant large-scale observational study with n = 72 , 489 patients and p = 22 , 175 clinical covariates, designed to assess the relative risk of adverse events from two alternative blood anti-coagulants. Our algorithm demonstrates an order of magnitude speed-up in posterior inference, in our case cutting the computation time from two weeks to less than a day. for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
11. Pseudospectral Divide-and-Conquer for the Generalized Eigenvalue Problem
- Author
-
Schneider, Ryan
- Subjects
Applied mathematics ,Eigenvalue problems ,Numerical linear algebra ,Random matrices - Abstract
\indent Over the last two decades, randomization has emerged as a leading tool for pursuing efficiency in numerical linear algebra. Its benefits can be explained in part by smoothed analysis, where an algorithm that fails spectacularly in certain cases may be unlikely to do so on random -- or randomly perturbed -- inputs. This observation implies a simple framework for developing accurate and efficient randomized algorithms:\ apply a random perturbation and run an existing method, whose worst-case error (or run-time) can be avoided with high probability. \\\indent Recent work of Banks, Garza-Vargas, Kulkarni, and Srivastava applied this framework to the standard eigenvalue problem, developing a randomized algorithm that (with high probability) diagonalizes a matrix in nearly matrix multiplication time [Foundations of Computational Math 2022]. Central to their work is the phenomenon of \textit{pseudospectral shattering}, in which a small Gaussian perturbation regularizes the pseudospectrum of a matrix, with high probability breaking it into disjoint components and allowing classical, divide-and-conquer eigensolvers to run successfully. Prior to their work, no way of accessing the benefits of divide-and-conquer’s natural parallelization was known in general. \\\indent In this thesis, we extend the work of Banks et al.\ to the generalized eigenvalue problem -- e.g., $Av = \lambda Bv$ for matrices $A,B \in {\mathbb C}^{n \times n}$. Our main contributions can be summarized as follows.\begin{enumerate}[itemsep=0em] \item First, we show that pseudospectral shattering generalizes directly:\ randomly perturbing $A$ and $B$ has a similar regularizing effect on the pseudospectra of the corresponding matrix pencil $(A,B)$ with high probability. \item Building on pseudospectral shattering, we construct and analyze a fast, randomized, divide-and-conquer algorithm for diagonalizing $(A,B)$, which begins by randomly perturbing the inputs. \item Finally, we demonstrate that both pseudospectral shattering and the corresponding diagonalization algorithm can be adapted to definite pencils, further pursuing efficiency by preserving and exploiting symmetry. \end{enumerate}\indent The resulting algorithm, which we call pseudospectral divide-and-conquer, is the first general, sub-$O(n^3)$ solver for the generalized eigenvalue problem. It is not only highly parallel and capable of accommodating structure, but also promotes stability by avoiding matrix inversion. In essence, this thesis is a handbook for understanding, adapting, and implementing the method.
- Published
- 2024
12. Nonlinear eigenvalue problems : theory and algorithms
- Author
-
Negri Porzio, Gian Maria, Higham, Nicholas, and Tisseur, Francoise
- Subjects
Eigenvalue Problems ,Tropical Algebra ,Tropical Roots ,NLEVP ,Numerical Analysis ,Eigenvalue ,Rational Approximations ,Nonlinear eigenvalue problems ,Numerical Linear Algebra ,Contour Integrals ,Eigenpair - Abstract
In this thesis we focus on the theoretical and computational aspects of nonlinear eigenvalue problems (NEPs), which arise in several fields of computational science and engineering, such as fluid dynamics, optics, and structural engineering. In the last twenty years several researchers devoted their time in studying efficient and precise ways to solve NEPs, which cemented their importance in numerical linear algebra. The most successful algorithms developed towards this goal are either based on contour integrals, or on rational approximations and linearizations. The first part of the thesis is dedicated to contour integral algorithms. In this framework, one computes specific integrals of a holomorphic function G(z) over the contour of a target region and exploits results of complex analysis to retrieve the eigenvalues of G(z) inside it. Our main contribution consists in having expanded the theory to include meromorphic functions, i.e., functions with poles inside the target region. We showed that under some circumstances, these algorithms can mistake a pole for an eigenvalue, but these situations are easily recognised and the main results from the holomorphic case can be extended. Furthermore, we proposed several heuristics to automatically choose the parameters that are needed to precisely retrieve the eigenpairs. In the second part of the thesis, we focus on rational approximations. Our goal was developing algorithms that construct robust, i.e., reliable for a given tolerance and scaling independent, rational approximations for functions given in split form or in black-box form. In the first case, we proposed a variant of the set-valued AAA, named weighted AAA, which guarantees to return an approximation with the chosen accuracy. In the second one, we built a two-phase approach, where an initial step performed by the surrogate AAA is followed by a cyclic Leja-Bagby refinement. We concluded the section with numerous numerical experiments based on the NLEVP library, comparing contour integral and rational approximation algorithms. The third and final part of the thesis is about tropical linear algebra. Our research on this topic started has a way to set the parameters of the aforementioned contour integral algorithms: in order to do that, we extended the theory of tropical roots from tropical polynomials to tropical Laurent series. Unlike in the polynomial case, a tropical Laurent series may have infinite tropical roots, but they are still in bijection with the slopes of the associated Newton polygon and they still provide annuli of exclusion for the eigenvalues of the Laurent series.
- Published
- 2021
13. GROWTH FACTORS OF RANDOM BUTTERFLY MATRICES AND THE STABILITY OF AVOIDING PIVOTING.
- Author
-
PECA-MEDLIN, JOHN and TROGDON, THOMAS
- Subjects
- *
RANDOM matrices , *NUMERICAL solutions for linear algebra , *GROWTH factors , *GAUSSIAN elimination , *LINEAR systems - Abstract
Random butterfly matrices were introduced by Parker in 1995 to remove the need for pivoting when using Gaussian elimination. The growing applications of butterfly matrices have often eclipsed the mathematical understanding of how or why butterfly matrices are able to accomplish these given tasks. To help begin to close this gap using theoretical and numerical approaches, we explore the impact on the growth factor of preconditioning a linear system by butterfly matrices. These results are compared to other common methods found in randomized numerical linear algebra. In these experiments, we show that preconditioning using butterfly matrices has a more significant dampening impact on large growth factors than other common preconditioned and a smaller increase to minimal growth factor systems. Moreover, we are able to determine the full distribution of the growth factors for a subclass of random butterfly matrices. Previous results by Trefethen and Schreiber relating to the distribution of random growth factors were limited to empirical estimates of the first moment for Ginibre matrices. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
14. Mixed precision low-rank approximations and their application to block low-rank LU factorization.
- Author
-
Amestoy, Patrick, Boiteau, Olivier, Buttari, Alfredo, Gerest, Matthieu, Jézéquel, Fabienne, L'Excellent, Jean-Yves, and Mary, Theo
- Subjects
SINGULAR value decomposition ,NUMERICAL solutions for linear algebra ,FACTORIZATION ,FRACTIONS ,ARITHMETIC ,LOW-rank matrices ,SPARSE matrices ,LUTETIUM compounds - Abstract
We introduce a novel approach to exploit mixed precision arithmetic for low-rank approximations. Our approach is based on the observation that singular vectors associated with small singular values can be stored in lower precisions while preserving high accuracy overall. We provide an explicit criterion to determine which level of precision is needed for each singular vector. We apply this approach to block low-rank (BLR) matrices, most of whose off-diagonal blocks have low rank. We propose a new BLR LU factorization algorithm that exploits the mixed precision representation of the blocks. We carry out the rounding error analysis of this algorithm and prove that the use of mixed precision arithmetic does not compromise the numerical stability of the BLR LU factorization. Moreover, our analysis determines which level of precision is needed for each floating-point operation (flop), and therefore guides us toward an implementation that is both robust and efficient. We evaluate the potential of this new algorithm on a range of matrices coming from real-life problems in industrial and academic applications. We show that a large fraction of the entries in the LU factors and flops to perform the BLR LU factorization can be safely switched to lower precisions, leading to significant reductions of the storage and expected time costs, of up to a factor three using fp64, fp32, and bfloat16 arithmetics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
15. Mixed precision LU factorization on GPU tensor cores: reducing data movement and memory footprint.
- Author
-
Lopez, Florent and Mary, Theo
- Subjects
- *
FACTORIZATION , *NUMERICAL solutions for linear algebra , *LINEAR algebra , *DATA warehousing - Abstract
Modern GPUs equipped with mixed precision tensor core units present great potential to accelerate dense linear algebra operations such as LU factorization. However, state-of-the-art mixed half/single precision LU factorization algorithms all require the matrix to be stored in single precision, leading to expensive data movement and storage costs. This is explained by the fact that simply switching the storage precision from single to half leads to significant loss of accuracy, forfeiting all accuracy benefits from using tensor core technology. In this article, we propose a new factorization algorithm that is able to store the matrix in half precision without incurring any significant loss of accuracy. Our approach is based on a left-looking scheme employing single precision buffers of controlled size and a mixed precision doubly partitioned algorithm exploiting tensor cores in the panel factorizations. Our numerical results show that compared with the state of the art, the proposed approach is of similar accuracy but with only half the data movement and memory footprint, and hence potentially much faster: it achieves up to 2× and 3.5× speedups on V100 and A100 GPUs, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
16. Numerical Linear Algebra for the Two-Dimensional Bertozzi–Esedoglu–Gillette–Cahn–Hilliard Equation in Image Inpainting
- Author
-
Yahia Awad, Hussein Fakih, and Yousuf Alkhezi
- Subjects
Cahn–Hilliard equation ,image inpainting ,finite difference method ,numerical linear algebra ,stability ,steady state ,Mathematics ,QA1-939 - Abstract
In this paper, we present a numerical linear algebra analytical study of some schemes for the Bertozzi–Esedoglu–Gillette–Cahn–Hilliard equation. Both 1D and 2D finite difference discretizations in space are proposed with semi-implicit and implicit discretizations on time. We prove that our proposed numerical solutions converge to continuous solutions.
- Published
- 2023
- Full Text
- View/download PDF
17. Maximum entropy snapshot sampling for reduced basis modelling
- Author
-
Bannenberg, Marcus W.F.M., Kasolis, Fotios, Günther, Michael, and Clemens, Markus
- Published
- 2022
- Full Text
- View/download PDF
18. A globally convergent numerical method with fourth order for computing the matrix sign.
- Author
-
Salehi, Sommayeh and Lotfi, Taher
- Abstract
The target of this article is to propose a novel numerical scheme that can compete its corresponding solvers of the same type for computing the sign matrix. To do this, a three‐step root solver is designed to gain as much as possible of global convergence when finding the matrix sign. It is proved that the novel numerical scheme is of quartical convergence order. The stability of the scheme is discussed in detail as well. Finally, numerical tests are provided to uphold the theoretical findings. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
19. Hierarchical Model Reduction Driven by Machine Learning for Parametric Advection-Diffusion-Reaction Problems in the Presence of Noisy Data.
- Author
-
Lupo Pasini, Massimiliano and Perotto, Simona
- Abstract
We propose a new approach to generate a reliable reduced model for a parametric elliptic problem, in the presence of noisy data. The reference model reduction procedure is the directional HiPOD method, which combines Hierarchical Model reduction with a standard Proper Orthogonal Decomposition, according to an offline/online paradigm. In this paper we show that directional HiPOD looses in terms of accuracy when problem data are affected by noise. This is due to the interpolation driving the online phase, since it replicates, by definition, the noise trend. To overcome this limit, we replace interpolation with Machine Learning fitting models which better discriminate relevant physical features in the data from irrelevant unstructured noise. The numerical assessment, although preliminary, confirms the potentialities of the new approach. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
20. MATRIX MULTIPLICATION IN MULTIWORD ARITHMETIC: ERROR ANALYSIS AND APPLICATION TO GPU TENSOR CORES.
- Author
-
FASI, MASSIMILIANO, HIGHAM, NICHOLAS J., LOPEZ, FLORENT, MARY, THEO, and MIKAITIS, MANTAS
- Subjects
- *
MATRIX multiplications , *NUMERICAL solutions for linear algebra , *ARITHMETIC - Abstract
In multiword arithmetic, a matrix is represented as the unevaluated sum of two or more lower precision matrices, and a matrix product is formed by multiplying the constituents in low precision. We investigate the use of multiword arithmetic for improving the performance-accuracy tradeoff of matrix multiplication with mixed precision block fused multiply--add (FMA) hardware, focusing especially on the tensor cores available on NVIDIA GPUs. Building on a general block FMA framework, we develop a comprehensive error analysis of multiword matrix multiplication. After confirming the theoretical error bounds experimentally by simulating low precision in software, we use the cuBLAS and CUTLASS libraries to implement a number of matrix multiplication algorithms using double-fp16 (double-binary16) arithmetic. When running the algorithms on NVIDIA V100 and A100 GPUs, we find that double-fp16 is not as accurate as fp32 (binary32) arithmetic despite satisfying the same worst-case error bound. Using probabilistic error analysis, we explain why this issue is likely to be caused by the rounding mode used by the NVIDIA tensor cores, and we propose a parameterized blocked summation algorithm that alleviates the problem and significantly improves the performance-accuracy tradeoff. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
21. The INTERNODES method for applications in contact mechanics and dedicated preconditioning techniques.
- Author
-
Voet, Yannis, Anciaux, Guillaume, Deparis, Simone, and Gervasio, Paola
- Subjects
- *
NUMERICAL solutions to partial differential equations , *CONTACT mechanics , *QUADRATURE domains , *FINITE element method , *COMPUTATIONAL mechanics , *LINEAR systems - Abstract
• The INTERNODES method is applied to problems in contact mechanics. • We propose a highly efficient preconditioner to solve the sequence of linear systems. • An algorithm is suggested to cheaply solve linear systems with the preconditioning matrix. • The quality of the preconditioner is assessed on small to medium size 2D problems. The mortar finite element method is a well-established method for the numerical solution of partial differential equations on domains displaying non-conforming interfaces. The method is known for its application in computational contact mechanics. However, its implementation remains challenging as it relies on geometrical projections and unconventional quadrature rules. The INTERNODES (INTERpolation for NOn-conforming DEcompositionS) method, instead, could overcome the implementation difficulties thanks to flexible interpolation techniques. Moreover, it was shown to be at least as accurate as the mortar method making it a very promising alternative for solving problems in contact mechanics. Unfortunately, in such situations the method requires solving a sequence of ill-conditioned linear systems. In this paper, preconditioning techniques are designed and implemented for the efficient solution of those linear systems. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
22. Evolving graphs and similarity-based graphs with applications
- Author
-
Zhang, Weijian, Higham, Nicholas, and Guettel, Stefan
- Subjects
510 ,Numerical linear algebra ,Numerical algorithms ,Natural language processing ,Knowledge graph ,Time-dependent networks ,Network sciences ,Evolving graphs ,Graph centrality - Abstract
A graph is a mathematical structure for modelling the pairwise relations between objects. This thesis studies two types of graphs, namely, similarity-based graphs and evolving graphs. We look at ways to traverse an evolving graph. In particular, we examine the influence of temporal information on node centrality. In the process, we develop EvolvingGraphs.jl, a software package for analyzing time-dependent networks. We develop Etymo, a search system for discovering interesting research papers. Etymo utilizes both similarity-based graphs and evolving graphs to build a knowledge graph of research articles in order to help users to track the development of ideas. We construct content similarity-based graphs using the full text of research papers. And we extract key concepts from research papers and exploit the temporal information in research papers to construct a concepts evolving graph.
- Published
- 2018
23. Point Cloud Video Super-Resolution via Partial Point Coupling and Graph Smoothness.
- Author
-
Dinesh, Chinthaka, Cheung, Gene, and Bajic, Ivan V.
- Subjects
- *
POINT cloud , *DIGITAL video , *BIPARTITE graphs , *QUADRATIC programming , *QUADRATIC forms , *NUMERICAL solutions for linear algebra - Abstract
Point cloud (PC) is a collection of discrete geometric samples of a physical object in 3D space. A PC video consists of temporal frames evenly spaced in time, each containing a static PC at one time instant. PCs in adjacent frames typically do not have point-to-point (P2P) correspondence, and thus exploiting temporal redundancy for PC restoration across frames is difficult. In this paper, we focus on the super-resolution (SR) problem for PC video: increase point density of PCs in video frames while preserving salient geometric features consistently across time. We accomplish this with two ideas. First, we establish partial P2P coupling between PCs of adjacent frames by interpolating interior points in a low-resolution PC patch in frame $t$ and translating them to a corresponding patch in frame $t+1$ , via a motion model computed by iterative closest point (ICP). Second, we promote piecewise smoothness in 3D geometry in each patch using feature graph Laplacian regularizer (FGLR) in an easily computable quadratic form. The two ideas translate to an unconstrained quadratic programming (QP) problem with a system of linear equations as solution—one where we ensure the numerical stability by upper-bounding the condition number of the coefficient matrix. Finally, to improve the accuracy of the ICP motion model, we re-sample points in a super-resolved patch at time $t$ to better match a low-resolution patch at time $t+1$ via bipartite graph matching after each SR iteration. Experimental results show temporally consistent super-resolved PC videos generated by our scheme, outperforming SR competitors that optimized on a per-frame basis, in two established PC metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
24. Algorithm 1022: Efficient Algorithms for Computing a Rank-Revealing UTV Factorization on Parallel Computing Architectures.
- Author
-
HEAVNER, N., IGUAL, F. D., QUINTANA-ORTÍ, G., and MARTINSSON, P. G.
- Subjects
- *
SINGULAR value decomposition , *FACTORIZATION , *PARALLEL processing , *PARALLEL programming , *NUMERICAL solutions for linear algebra , *MATRIX decomposition - Abstract
Randomized singular value decomposition (RSVD) is by now a well-established technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin RSVD, the recently proposed algorithm "randUTV" computes a full factorization of a given matrix that provides low-rank approximations with near-optimal error. Because the bulk of randUTV is cast in terms of communication-efficient operations such as matrix-matrix multiplication and unpivoted QR factorizations, it is faster than competing rank-revealing factorization methods such as column-pivoted QR in most high-performance computational settings. In this article, optimized randUTV implementations are presented for both shared-memory and distributed-memory computing environments. For shared memory, randUTV is redesigned in terms of an algorithm-by-blocks that, together with a runtime task scheduler, eliminates bottlenecks from data synchronization points to achieve acceleration over the standard blocked algorithm based on a purely fork-join approach. The distributed-memory implementation is based on the ScaLAPACK library. The performance of our new codes compares favorably with competing factorizations available on both shared-memory and distributed-memory architectures. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
25. A novel direct resolution method for coupled systems in finite element analysis
- Author
-
Boutora, Youcef and Takorabet, Noureddine
- Published
- 2020
- Full Text
- View/download PDF
26. Generation of test matrices with specified eigenvalues using floating-point arithmetic.
- Author
-
Ozaki, Katsuhisa and Ogita, Takeshi
- Subjects
- *
FLOATING-point arithmetic , *NUMERICAL solutions for linear algebra , *EIGENVALUES , *MATRICES (Mathematics) , *MATRIX multiplications , *NORMAL forms (Mathematics) - Abstract
This paper concerns test matrices for numerical linear algebra using an error-free transformation of floating-point arithmetic. For specified eigenvalues given by a user, we propose methods of generating a matrix whose eigenvalues are exactly known based on, for example, Schur or Jordan normal form and a block diagonal form. It is also possible to produce a real matrix with specified complex eigenvalues. Such test matrices with exactly known eigenvalues are useful for numerical algorithms in checking the accuracy of computed results. In particular, exact errors of eigenvalues can be monitored. To generate test matrices, we first propose an error-free transformation for the product of three matrices YSX. We approximate S by S ′ to compute Y S ′ X without a rounding error. Next, the error-free transformation is applied to the generation of test matrices with exactly known eigenvalues. Note that the exactly known eigenvalues of the constructed matrix may differ from the anticipated given eigenvalues. Finally, numerical examples are introduced in checking the accuracy of numerical computations for symmetric and unsymmetric eigenvalue problems. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
27. Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication
- Author
-
Williams, Samuel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)]
- Published
- 2016
- Full Text
- View/download PDF
28. GAUSSIAN BELIEF PROPAGATION SOLVERS FOR NONSYMMETRIC SYSTEMS OF LINEAR EQUATIONS.
- Author
-
FANASKOV, VLADIMIR
- Subjects
- *
NUMERICAL solutions for linear algebra , *NONSYMMETRIC matrices , *LINEAR systems , *MATRIX inversion , *LINEAR algebra , *SYMMETRIC matrices - Abstract
In this paper, we argue for the utility of deterministic inference in the classical problem of numerical linear algebra, that of solving a linear system. We show how the Gaussian belief propagation solver, known to work for symmetric matrices, can be modified to handle nonsymmetric matrices. Furthermore, we introduce a new algorithm for matrix inversion that corresponds to the generalized belief propagation derived from the cluster variation method. We relate these algorithms to LU and block LU decompositions and provide certain guarantees based on theorems from the theory of belief propagation. The proposed solvers can be accelerated within a framework of geometric multigrid. We benchmark the resulting multigrid method with a set of classical multigrid solvers based on incomplete LU, Chebyshev, and pattern Gauss-Seidel smoothers. The results indicate that Gaussian belief propagation can be used to construct competitive solvers, smoothers, and preconditioners for selected elliptic test problems. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
29. Theoretical construction and algorithms specifying determinants of ±1 matrices of the spectrum.
- Author
-
Sfyrakis, Chrysovalantis A.
- Subjects
- *
LINEAR algebra , *GAUSSIAN elimination , *NUMERICAL analysis , *MATHEMATICAL formulas , *SPECTRUM analysis - Abstract
In this paper, algorithms are presented constructing, for the Hadamard maximum determinant problem of dimension n with elements ± 1. This leads to the construction of efficient algorithms specifying the determinants of n × n matrices with elements ± 1. The notion of lexicographically ordered sequences of integers and matrices is presented and we describe algorithms creating all possible lexicographically ordered vectors of dimension. With the ordered vectors, we create all the alternative determinants and search for their values, so the ones we meet first are the lexically smaller ones. The implementation of the algorithms is introduced, both Exhaustive Search, Brute force Algorithm and Partial Search and we compare these algorithms according to their speed and efficiency. Furthermore, we present some theoretical constructions attaining the evaluation of the spectrum of matrices with elements ± 1. Summarizing in this article, we develop theoretical techniques on which we can rely and build faster algorithms to find the determinants of ± 1 ± 1. matrices of the spectrum. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
30. Planetary Normal Mode Computation: Parallel Algorithms, Performance, and Reproducibility.
- Author
-
Shi, Jia, Li, Ruipeng, Xi, Yuanzhe, Saad, Yousef, and de Hoop, Maarten V.
- Subjects
- *
PARALLEL algorithms , *NUMERICAL solutions for linear algebra , *FINITE element method , *PLANETARY interiors , *SCHOOL contests , *MATHEMATICS conferences - Abstract
This article is an extension of work entitled “Computing planetary interior normal modes with a highly parallel polynomial filtering eigensolver.” by Shi et al., originally presented at the SC18 conference. A highly parallel polynomial filtered eigensolver was developed and exploited to calculate the planetary normal modes. The proposed method is ideally suited for computing interior eigenpairs for large-scale eigenvalue problems as it greatly enhances memory and computational efficiency. In this article, the second-order finite element method is used to further improve the accuracy as only the first-order finite element method was deployed in the previous work. The parallel algorithm, its parallel performance up to 20k processors, and the great computational accuracy are illustrated. The reproducibility of the previous work was successfully performed on the Student Cluster Competition at the SC19 conference by several participant teams using a completely different Mars-model dataset on different clusters. Both weak and strong scaling performances of the reproducibility by the participant teams were impressive and encouraging. The analysis and reflection of their results are demonstrated and future direction is discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
31. Batched matrix computations on hardware accelerators based on GPUs
- Author
-
Dongarra, Jack [Univ. of Tennessee, Knoxville, TN (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Univ. of Manchester (United Kingdom)]
- Published
- 2015
- Full Text
- View/download PDF
32. Numerical linear approximation involving radial basis functions
- Author
-
Zhu, Shengxin and Wathen, Andrew J.
- Subjects
518 ,Numerical analysis ,radial basis functions ,numerical linear algebra - Abstract
This thesis aims to acquire, deepen and promote understanding of computing techniques for high dimensional scattered data approximation with radial basis functions. The main contributions of this thesis include sufficient conditions for the sovability of compactly supported radial basis functions with different shapes, near points preconditioning techniques for high dimensional interpolation systems with compactly supported radial basis functions, a heterogeneous hierarchical radial basis function interpolation scheme, which allows compactly supported radial basis functions of different shapes at the same level, an O(N) algorithm for constructing hierarchical scattered data set andan O(N) algorithm for sparse kernel summation on Cartesian grids. Besides the main contributions, we also investigate the eigenvalue distribution of interpolation matrices related to radial basis functions, and propose a concept of smoothness matching. We look at the problem from different perspectives, giving a systematic and concise description of other relevant theoretical results and numerical techniques. These results are interesting in themselves and become more interesting when placed in the context of the bigger picture. Finally, we solve several real-world problems. Presented applications include 3D implicit surface reconstruction, terrain modelling, high dimensional meteorological data approximation on the earth and scattered spatial environmental data approximation.
- Published
- 2014
33. Sensitivity model based PID controller for various high-order processes
- Author
-
Ghosh, Sreya and Pan, Somnath
- Published
- 2019
- Full Text
- View/download PDF
34. Generic pulse width modulation model, based on generalized inverses and applied to voltage source inverters
- Author
-
Paul-Etienne, Vidal, Cailhol, Simon, Rotella, Frédéric, and Fadel, Maurice
- Published
- 2019
- Full Text
- View/download PDF
35. Discrete differential operators for periodic and two-periodic time functions
- Author
-
Sobczyk, Tadeusz, Radzik, Michał, and Radwan-Pragłowska, Natalia
- Published
- 2019
- Full Text
- View/download PDF
36. Joint direct and transposed sparse matrix‐vector multiplication for multithreaded CPUs.
- Author
-
Kozický, Claudio and Šimeček, Ivan
- Subjects
MULTIPLICATION ,SPARSE matrices ,NUMERICAL solutions for linear algebra ,LANCZOS method ,SYMMETRIC matrices ,PARALLEL algorithms - Abstract
Repeatedly performing sparse matrix‐vector multiplication (SpMV) followed by transposed sparse matrix‐vector multiplication (SpMTV) with the same matrix is a part of several algorithms, for example, the Lanczos biorthogonalization algorithm and the biconjugate gradient method. Such algorithms can benefit from combining parallel SpMV and SpMTV into a single operation we call joint direct and transposed sparse matrix‐vector multiplication (SpMMTV). In this article, we present a parallel SpMMTV algorithm for shared‐memory CPUs. The algorithm uses a sparse matrix format that divides the stored matrix into sparse matrix blocks and compresses the row and column indices of the matrix. This sparse matrix format can be also used for SpMV, SpMTV, and similar sparse matrix‐vector operations. We expand upon existing research by suggesting new variants of the parallel SpMMTV algorithm and by extending the algorithm to efficiently support symmetric matrices. We compare the performance of the presented parallel SpMMTV algorithm with alternative approaches, which use state‐of‐the‐art sparse matrix formats and libraries, using sparse matrices from real‐world applications. The performance results indicate that the median performance of our proposed parallel SpMMTV algorithm is up to 45% higher than of the alternative approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
37. Fast iterative solvers for PDE-constrained optimization problems
- Author
-
Pearson, John W. and Wathen, Andrew
- Subjects
518 ,Mathematics ,Numerical analysis ,Calculus of variations and optimal control ,Partial differential equations ,numerical linear algebra ,preconditioning ,PDE-constrained optimization ,saddle point systems - Abstract
In this thesis, we develop preconditioned iterative methods for the solution of matrix systems arising from PDE-constrained optimization problems. In order to do this, we exploit saddle point theory, as this is the form of the matrix systems we wish to solve. We utilize well-known results on saddle point systems to motivate preconditioners based on effective approximations of the (1,1)-block and Schur complement of the matrices involved. These preconditioners are used in conjunction with suitable iterative solvers, which include MINRES, non-standard Conjugate Gradients, GMRES and BiCG. The solvers we use are selected based on the particular problem and preconditioning strategy employed. We consider the numerical solution of a range of PDE-constrained optimization problems, namely the distributed control, Neumann boundary control and subdomain control of Poisson's equation, convection-diffusion control, Stokes and Navier-Stokes control, the optimal control of the heat equation, and the optimal control of reaction-diffusion problems arising in chemical processes. Each of these problems has a special structure which we make use of when developing our preconditioners, and specific techniques and approximations are required for each problem. In each case, we motivate and derive our preconditioners, obtain eigenvalue bounds for the preconditioners where relevant, and demonstrate the effectiveness of our strategies through numerical experiments. The goal throughout this work is for our iterative solvers to be feasible and reliable, but also robust with respect to the parameters involved in the problems we consider.
- Published
- 2013
38. BLOCK LOW-RANK MATRICES WITH SHARED BASES: POTENTIAL AND LIMITATIONS OF THE BLR ² FORMAT.
- Author
-
ASHCRAFT, CLEVE, BUTTARI, ALFREDO, and MARYS, THEO
- Subjects
- *
LOW-rank matrices , *SPARSE matrices , *NUMERICAL solutions for linear algebra , *NUMERICAL analysis , *COST analysis - Abstract
We investigate a special class of data sparse rank-structured matrices that combine a flat block low-rank (BLR) partitioning with the use of shared (called nested in the hierarchical case) bases. This format is to H 2 matrices what BLR is to H matrices: we therefore call it the BLR² matrix format. We present algorithms for the construction and LU factorization of BLR² matrices,and perform their cost analysis---both asymptotically and for a fixed problem size. With weak admissibility, BLR² matrices reduce to block separable matrices (the flat version of HBS/HSS). Our analysis and numerical experiments reveal some limitations of BLR² matrices with weak admissibility,which we propose to overcome with two approaches: strong admissibility, and the use of multiple shared bases per row and column. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
39. BUTTERFLY FACTORIZATION VIA RANDOMIZED MATRIX-VECTOR MULTIPLICATIONS.
- Author
-
YANG LIU, XIN XING, HAN GUO, MICHIELSSEN, ERIC, GHYSELS, PIETER, and XIAOYE SHERRY LI
- Subjects
- *
FACTORIZATION , *MULTIPLICATION , *PARALLEL algorithms , *BUTTERFLIES , *NUMERICAL solutions for linear algebra - Abstract
This paper presents an adaptive randomized algorithm for computing the butterfly factorization of an m x n matrix with m ≈ n provided that both the matrix and its transpose can be rapidly applied to arbitrary vectors. The resulting factorization is composed of O(log n) sparse factors, each containing O(n) nonzero entries. The factorization can be attained using O(n3/2 log n) computation and O(n log n) memory resources. The proposed algorithm can be implemented in parallel and can apply to matrices with strong or weak admissibilit y conditions arising from surface integral equation solvers as well as multi-frontal-based finite-difference, finite-element, or finite-volume solvers. A distributed-memory parallel implementation of the algorithm demonstrates excellent scaling behavior. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
40. Sparser Johnson-Lindenstrauss Transforms.
- Author
-
KANE, DANIEL M. and NELSON, JELANI
- Subjects
FRACTIONS ,MATRICES (Mathematics) ,PROBABILITY theory ,MATHEMATICS ,BAYESIAN analysis - Abstract
We give two different and simple constructions for dimensionality reduction in l
2 via linear mappings that are sparse: only an (ε)-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion 1+ε with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of parameters, improving upon previous works of Achlioptas [2003] and Dasgupta et al. [2010]. Such distributions can be used to speed up applications where l2 dimensionality reduction is used. [ABSTRACT FROM AUTHOR]- Published
- 2014
- Full Text
- View/download PDF
41. STOCHASTIC ROUNDING AND ITS PROBABILISTIC BACKWARD ERROR ANALYSIS.
- Author
-
CONNOLLY, MICHAEL P., HIGHAM, NICHOLAS J., and MARY, THEO
- Subjects
- *
DEEP learning , *RANDOM variables , *LINEAR algebra , *NUMERICAL solutions for linear algebra , *PROBABILISTIC number theory , *INDEPENDENT variables - Abstract
Stochastic rounding rounds a real number to the next larger or smaller floating-point number with probabilities 1 minus the relative distances to those numbers. It is gaining attention in deep learning because it can increase the success of low precision computations. We compare basic properties of stochastic rounding with those for round to nearest, finding properties in common as well as significant differences. We prove that for stochastic rounding the rounding errors are mean independent random variables with zero mean. We derive a new version of our probabilistic error analysis theorem from [N. J. Higham and T. Mary, SIAM J. Sci. Comput., 41 (2019), pp. A2815-- A2835], weakening the assumption of independence of the random variables to mean independence. These results imply that for a wide range of linear algebra computations the backward error for stochastic rounding is unconditionally bounded by a multiple of √nu to first order, with a certain probability, where n is the problem size and u is the unit roundoff. This is the first scenario where the rule of thumb that one can replace nu by √nu in a rounding error bound has been shown to hold without any additional assumptions on the rounding errors. We also explain how stochastic rounding avoids the phenomenon of stagnation in sums, whereby small addends are obliterated by round to nearest when they are too small relative to the sum. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
42. Numerical Algorithm in Machine Learning and Data Analysis
- Author
-
Li, Bo
- Subjects
Applied mathematics ,Numerical linear algebra ,Scientific computing - Abstract
In this thesis, we present novel numerical algorithms in machine learning and data analysis. This thesis consists of two chapters. Each chapter is self-contained.In chapter 1, we present novel methods for performing k-fold cross validation for ridge regression. Ridge regression is a very broadly used method for reduce overfitting in linear regression. And it is used in many aspects of data analysis. One very important question for ridge regression is to find the best regularization hyperparameter $\lambda$. However, this can be a time consuming procedure. Here we present an efficient yet numerical stable algorithm for computing the relative error of ridge regression across different $\lambda$'s. Besides that, we provide an novel algorithm for finding the best hyper parameter $\lambda$ without computing all the relative errors.In chapter 2, we present algorithms for efficiently finding low rank approximation of a large sparse matrix with improved Lanczos Algorithm. Low rank approximation is one of the most important techniques in data analysis. In particular, people do principal component analysis based on that. We tackle this problem by finding the truncated SVD of the matrix using our improved version of Lanczos matrix. Our method guarantees that we only perform restart of the algorithm when necessary. Also, we prove a novel scheme for doing reorthogonalization during the Lanczos iteration. Besides that, we provide new stopping criteria for Lanczos algorithm directly based on the quality of the low rank approximation.
- Published
- 2021
43. Iterative and doubling algorithms for Riccati‐type matrix equations: A comparative introduction.
- Author
-
Poloni, Federico
- Subjects
- *
ALGEBRAIC equations , *NONLINEAR equations , *EQUATIONS , *RICCATI equation , *QUADRATIC equations , *ALGORITHMS - Abstract
We review a family of algorithms for Lyapunov‐ and Riccati‐type equations which are all related to each other by the idea of doubling: they construct the iterate Qk=X2k of another naturally‐arising fixed‐point iteration (Xh) via a sort of repeated squaring. The equations we consider are Stein equations X − A∗X A = Q, Lyapunov equations A∗X + X A + Q = 0, discrete‐time algebraic Riccati equations X = Q + A∗X(I + G X)−1A, continuous‐time algebraic Riccati equations Q + A∗X + X A − X G X = 0, palindromic quadratic matrix equations A + Q Y + A∗Y2 = 0, and nonlinear matrix equations X + A∗X−1A = Q. We draw comparisons among these algorithms, highlight the connections between them and to other algorithms such as subspace iteration, and discuss open issues in their theory. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
44. Enhancement of the Kaczmarz algorithm with projection adjustment.
- Author
-
Lin, Chuan, Herman, Gabor T., and Zibetti, Marcelo V. W.
- Subjects
- *
ALGORITHMS , *BEHAVIORAL assessment , *LINEAR equations , *NUMERICAL solutions for linear algebra , *LINEAR systems - Abstract
Projection adjustment is a technique that improves the rate of convergence, as measured by the number of iterations needed to achieve a given level of performance, of the Kaczmarz algorithm (KA) for iteratively solving a system of consistent linear equations, however at the cost of requiring additional time per iteration and increased storage. This hinders the applicability of the previously published Kaczmarz algorithm with projection adjustment (KAPA) to large-scale problems. An enhancement EKAPA of KAPA that uses projection adjustment only for a small subset of the equations is proposed for significantly reducing the time and storage requirements. An analysis of the behavior of EKAPA is provided. An illustration is given to show that EKAPA using a small subset of the equations for projection adjustment can achieve a speed-up over KA similar to that of KAPA in terms of the number of iterations, but requires much less computer time and storage; hence, it is more suitable for large-scale problems. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
45. A literature survey of matrix methods for data science.
- Author
-
Stoll, Martin
- Subjects
- *
NUMERICAL solutions for linear algebra , *DATA science , *LABOR discipline , *DEEP learning , *MATRIX functions - Abstract
Efficient numerical linear algebra is a core ingredient in many applications across almost all scientific and industrial disciplines. With this survey we want to illustrate that numerical linear algebra has played and is playing a crucial role in enabling and improving data science computations with many new developments being fueled by the availability of data and computing resources. We highlight the role of various different factorizations and the power of changing the representation of the data as well as discussing topics such as randomized algorithms, functions of matrices, and high‐dimensional problems. We briefly touch upon the role of techniques from numerical linear algebra used within deep learning. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
46. HARDNESS RESULTS FOR STRUCTURED LINEAR SYSTEMS.
- Author
-
KYNG, RASMUS and PENG ZHANG
- Subjects
- *
LINEAR systems , *NUMERICAL solutions for linear algebra , *LINEAR equations , *HARDNESS - Abstract
We show that if the nearly linear time solvers for Laplacian matrices and their generalizations can be extended to solve just slightly larger families of linear systems, then they can be used to quickly solve all systems of linear equations over the reals. This result can be viewed either positively or negatively: either nearly linear time algorithms can be developed for solving all systems of linear equations over the reals, or progress on the families that can be solved in nearly linear time will soon halt. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
47. First-Order Perturbation Theory for Eigenvalues and Eigenvectors.
- Author
-
Greenbaum, Anne, Ren-Cang Li, and Overton, Michael L.
- Subjects
- *
PERTURBATION theory , *NUMERICAL solutions for linear algebra , *EIGENVALUES , *IMPLICIT functions , *EIGENVECTORS , *INVARIANT subspaces - Abstract
We present first-order perturbation analysis of a simple eigenvalue and the corresponding right and left eigenvectors of a general square matrix, not assumed to be Hermitian or normal. The eigenvalue result is well known to a broad scientific community. The treatment of eigenvectors is more complicated, with a perturbation theory that is not so well known outside a community of specialists. We give two different proofs of the main eigenvector perturbation theorem. The first, a block-diagonalization technique inspired by the numerical linear algebra research community and based on the implicit function theorem, has apparently not appeared in the literature in this form. The second, based on complex function theory and on eigenprojectors, as is standard in analytic perturbation theory, is a simplified version of well-known results in the literature. The second derivation uses a convenient normalization of the right and left eigenvectors defined in terms of the associated eigenprojector, but although this dates back to the 1950s, it is rarely discussed in the literature. We then show how the eigenvector perturbation theory is easily extended to handle other normalizations that are often used in practice. We also explain how to verify the perturbation results computationally. We conclude with some remarks about difficulties introduced by multiple eigenvalues and give references to work on perturbation of invariant subspaces corresponding to multiple or clustered eigenvalues. Throughout the paper we give extensive bibliographic commentary and references for further reading. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
48. A CLASS OF FAST AND ACCURATE SUMMATION ALGORITHMS.
- Author
-
BLANCHARD, PIERRE, HIGHAM, NICHOLAS J., and MARY, THEO
- Subjects
- *
ADDITION (Mathematics) , *FLOATING-point arithmetic , *ERROR analysis in mathematics , *UBIQUITOUS computing , *ALGORITHMS , *NUMERICAL solutions for linear algebra - Abstract
The need to sum floating-point numbers is ubiquitous in scientific computing. Standard recursive summation of n summands, often implemented in a blocked form, has a backward error bound proportional to nu, where u is the unit roundoff. With the growing interest in low precision floating-point arithmetic and ever increasing n in applications, computed sums are more likely to have insufficient accuracy. We propose a class of summation algorithms called FABsum (for "fast and accurate block summation") that applies a fast summation algorithm (such as recursive summation) blockwise and then sums the partial sums using an accurate summation algorithm (such as compensated summation, or recursive summation in higher precision). We give a rounding error analysis to show that FABsum with a fixed block size b has a backward error bound (b+1)u+O(u²), which is independent of n to first order. Our computational experiments show that with a suitable choice of b (independent of n) FABsum can deliver substantially more accurate results than blocked recursive summation, with only a modest drop in performance. FABsum is especially attractive for low precisions, where it can provide correct digits for much larger n than recursive summation. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
49. Scalable Gaussian Process Computations Using Hierarchical Matrices.
- Author
-
Geoga, Christopher J., Anitescu, Mihai, and Stein, Michael L.
- Subjects
- *
GAUSSIAN processes , *COVARIANCE matrices , *FISHER information , *MAXIMUM likelihood statistics , *MATRICES (Mathematics) , *HIERARCHICAL Bayes model - Abstract
We present a kernel-independent method that applies hierarchical matrices to the problem of maximum likelihood estimation for Gaussian processes. The proposed approximation provides natural and scalable stochastic estimators for its gradient and Hessian, as well as the expected Fisher information matrix, that are computable in quasilinear O (n log 2 n) complexity for a large range of models. To accomplish this, we (i) choose a specific hierarchical approximation for covariance matrices that enables the computation of their exact derivatives and (ii) use a stabilized form of the Hutchinson stochastic trace estimator. Since both the observed and expected information matrices can be computed in quasilinear complexity, covariance matrices for maximum likelihood estimators (MLEs) can also be estimated efficiently. In this study, we demonstrate the scalability of the method, show how details of its implementation effect numerical accuracy and computational effort, and validate that the resulting MLEs and confidence intervals based on the inverse Fisher information matrix faithfully approach those obtained by the exact likelihood. for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
50. SMOOTHING SPLINES AND RANK STRUCTURED MATRICES: REVISITING THE SPLINE KERNEL.
- Author
-
ANDERSEN, MARTIN S. and TIANSHI CHEN
- Subjects
- *
SPLINES , *KRIGING , *MATRICES (Mathematics) , *NUMERICAL solutions for linear algebra - Abstract
We show that the spline kernel of order p is a so-called semiseparable function with semiseparability rank p. A consequence of this is that kernel matrices generated by the spline kernel are rank structured matrices that can be stored and factorized efficiently. We use this insight to derive new recursive algorithms with linear complexity in the number of knots for various kernel matrix computations. We also discuss applications of these algorithms, including smoothing spline regression, Gaussian process regression, and some related hyperparameter estimation problems. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
Catalog
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.