About project

From Algowiki
Jump to navigation Jump to search

AlgoWiki is an open encyclopedia of algorithms’ properties and features of their implementations on different hardware and software platforms from mobile to extreme scale, which allows for collaboration with the worldwide computing community on algorithm descriptions.


Computers evolve quickly, and there have been at least six generations of computing architecture over the last forty years that caused the need for radical changes in software. Vector computers, vector-parallel, massive parallel computers, shared-memory nodes, clusters of shared-memory computers, computers with accelerators\dots The computing community has survived this, but in each evolution, the software basically had to be rewritten from scratch as each new generation of machines required singling out new properties in the algorithms, and that was reflected in the software.

Alas, there is no reason to hope that the situation will change for the better in the future. Vendors are already considering various prospective architectures, featuring light and/or heavy computing cores, accelerators of various types, SIMD and data-flow processing concepts. In this situation, codes will yet again have to be rewritten in order to utilize the full potential of future computers. It's an endless process that - understandably - doesn't make software developers any happier.

However, the situation isn't quite hopeless. Indeed, new computing systems require a full review of the legacy code. But the algorithms themselves don't change; only the requirements that new computers present to the structure of algorithms and programs change. To support vector computing, data parallelism needs to be built into the innermost loops within a program. The key concern in ensuring the efficient usage of massive parallel computers is to find a representation of an algorithm whereby a large number of computing nodes can work independently from each other, minimizing data exchange. This is true for every generation of parallel computing systems: the new architecture requires taking a new look at the properties of existing algorithms, in order to find the most efficient way of implementing them.

There are two facts that matter in this situation: the algorithms themselves don't change much, and their properties do not depend on the computing system. This means that once algorithm's properties have been described in detail, this information can be used repeatedly for any computing platform - existing today or in the future.

The idea of a deep a priori analysis of properties of algorithms and their implementation formed the basis for the AlgoWiki project. The main purpose of the project is to present a description of the fundamental properties of various algorithms giving a complete understanding of both their theoretical potential and the particular aspects of their implementation for various classes of parallel computing systems.

The description of all algorithms in AlgoWiki consists of two parts. The first part describes algorithms and their properties. The second part is dedicated to describing particular aspects of their implementation on various computing platforms. This division is made intentionally, to highlight the machine-independent properties of algorithms which determine their potential and the quality of their implementations on parallel computing systems, and to describe them separately from a number of issues related to the subsequent stages of programming and executing the resulting programs.

AlgoWiki provides an exhaustive description of an algorithm. In addition to classical algorithm properties such as serial complexity, AlgoWiki also presents additional information, which together provides a complete description of the algorithm: its parallel complexity, parallel structure, determinacy, data locality, performance and scalability estimates, communication profiles for specific implementations, and many others.

In AlgoWiki, details matter. Classical basic algorithms must be supplemented with their important practical modifications. For example, AlgoWiki contains a description of a basic point-structured real version of Cholesky factorization for a dense symmetrical positive-definite matrix. In practice, modifications of the algorithm are just as important: a block-structured version, a version for a dense complex-Hermitian matrix (both point-structured and block-structured), versions of the algorithm for sparse matrices, etc. It is also important to consider the use of Cholesky factorization in iterative methods, etc.

Equally important are the details related to particular aspects of an algorithm's implementation on specific parallel computing platforms: multi-core processors, SMP, clusters, accelerators, using vector processing and so on. In many cases it is necessary to go one step lower, describing the implementation of an algorithm, for example, not just for a specific accelerator, but to single out relatively important individual cases, such as GPU and Xeon Phi. At the same time, when we provide data about execution time, performance and scalability, we are not only laying out some estimates of the possible implementation quality of a given algorithm on a specific computer, but also setting the foundation for comparative analysis of various computing platforms with regards to the algorithms provided in AlgoWiki.

The outcome of the AlgoWiki project is an open encyclopedia of algorithm properties and the particular aspects of their computer implementation. It was started with a focus on using Wiki technologies, enabling the entire computing community to collaborate on algorithm descriptions. Currently, the encyclopedia is actively being expanded by outside experts; a multi-lingual version is also in the works which will eventually become the main version.

This project is carried out with the financial support of the Russian Science Foundation, Agreement No 14-11-00190.