Open Encyclopedia of Parallel Algorithmic Features

Welcome! Join us!

AlgoWiki is an open encyclopedia of algorithms’ properties and features of their implementations on different hardware and software platforms from mobile to extreme scale, which allows for collaboration with the worldwide computing community on algorithm descriptions.

AlgoWiki provides an exhaustive description of an algorithm. In addition to classical algorithm properties such as serial complexity, AlgoWiki also presents additional information, which together provides a complete description of the algorithm: its parallel complexity, parallel structure, determinacy, data locality, performance and scalability estimates, communication profiles for specific implementations, and many others.

1 Properties and structure of the algorithm

1.1 General description

The Cholesky decomposition algorithm was first proposed by Andre-Louis Cholesky (October 15, 1875 - August 31, 1918) at the end of the First World War shortly before he was killed in battle. He was a French military officer and mathematician. The idea of this algorithm was published in 1924 by his fellow officer and, later, was used by Banachiewicz in 1938 [7]. In the Russian mathematical literature, the Cholesky decomposition is also known as the square-root method [1-3] due to the square root operations used in this decomposition and not used in Gaussian elimination.

Originally, the Cholesky decomposition was used only for dense real symmetric positive definite matrices. At present, the application of this decomposition is much wider. For example, it can also be employed for the case of Hermitian matrices. In order to increase the computing performance, its block versions are often applied.

In the case of sparse matrices, the Cholesky decomposition is also widely used as the main stage of a direct method for solving linear systems. In order to reduce the memory requirements and the profile of the matrix, special reordering strategies are applied to minimize the number of arithmetic operations. A number of reordering strategies are used to identify the independent matrix blocks for parallel computing systems.

1.2 Mathematical description

Input data: a symmetric positive definite matrix [math]A[/math] whose elements are denoted by [math]a_{ij}[/math]).

Output data: the lower triangular matrix [math]L[/math] whose elements are denoted by [math]l_{ij}[/math]).

The Cholesky algorithm can be represented in the form

[math] \begin{align} l_{11} & = \sqrt{a_{11}}, \\ l_{j1} & = \frac{a_{j1}}{l_{11}}, \quad j \in [2, n], \\ l_{ii} & = \sqrt{a_{ii} - \sum_{p = 1}^{i - 1} l_{ip}^2}, \quad i \in [2, n], \\ l_{ji} & = \left (a_{ji} - \sum_{p = 1}^{i - 1} l_{ip} l_{jp} \right ) / l_{ii}, \quad i \in [2, n - 1], j \in [i + 1, n]. \end{align} [/math]

There exist block versions of this algorithm; however, here we consider only its “dot” version.

In a number of implementations, the division by the diagonal element [math]l_{ii}[/math] is made in the following two steps: the computation of [math]\frac{1}{l_{ii}}[/math] and, then, the multiplication of the result by the modified values of [math]a_{ji}[/math] . Here we do not consider this computational scheme, since this scheme has worse parallel characteristics than that given above.

Read more…

Useful links

Lomonosov Moscow State University
MSU Research computing center

Parallel.ru — Informational Analytical Center on supercomputers and parallel computing
International conference Russian Supercomputing Days

Today’s featured picture