Parallel Engineering and Scientific Subroutine Library for AIX Version 2 Release 3: Guide and Reference

Overview of Parallel ESSL

Parallel ESSL is a scalable mathematical subroutine library that supports parallel processing applications on IBM RS/6000 SP Systems and clusters of IBM pSeries and IBM RS/6000 workstations. Parallel ESSL supports the Single Program Multiple Data (SPMD) programming model using either the Message Passing Interface (MPI) signal handling library or the MPI threads library. Parallel ESSL provides subroutines in six major areas of mathematical computations.

Parallel ESSL provides subroutines in the following computational areas:

Level 2 Parallel Basic Linear Algebra Subprograms (PBLAS)
Level 3 PBLAS
Linear Algebraic Equations
Eigensystem Analysis and Singular Value Analysis
Fourier Transforms
Random Number Generation

The subroutines run under the AIX operating system and can be called from application programs written in Fortran, C, and C++. On the SP, Parallel System Support Programs (PSSP) is also required.

Parallel ESSL provides these run-time libraries:

The Parallel ESSL SMP Libraries are provided for use with the MPI threads library. You may run single-threaded or multithreaded applications on all types of nodes. However, you cannot simultaneously call Parallel ESSL from multiple threads. Use these Parallel ESSL libraries if you are using both PE MPI and LAPI. The SMP library is for use on the POWER and PowerPC (for example, POWER3-II SMP Thin, Wide, or High Nodes) SMP processors. The Parallel ESSL SMP Libraries support both 32-bit-environment and 64-bit-environment applications.
The Parallel ESSL Serial Libraries are provided for use with the MPI signal handling library on all types of nodes. These libraries are tuned for the POWER, POWER3, POWER3-II, POWER4, and PowerPC processors.

For communication, Parallel ESSL includes the Basic Linear Algebra Communications Subprograms (BLACS), which use the Parallel Environment (PE) Message Passing Interface (MPI). Communications using the User Space (US) require either the SP Switch or SP Switch2. Communications using the Internet Protocol (IP) may use Ethernet, Token Ring, Fiber Distributed Data Interface (FDDI), SP Switch or SP Switch2. For computations, Parallel ESSL uses the ESSL for AIX subroutines.

To order the IBM Parallel ESSL for AIX, specify program number 5765-C41.

How Parallel ESSL Works under the Parallel Environment (PE)

Parallel ESSL uses PE for communication during parallel processing, supporting the SPMD programming model, running on the SP or workstation clusters. In other words, your application program must be using PE if you want to call Parallel ESSL subroutines.

The IBM pSeries and RS/6000 processors are called processor nodes. A parallel program, such as yours with calls to the Parallel ESSL subroutines, executes as a number of individual, but related, parallel tasks on a number of your system's processor nodes. The group of parallel tasks is called a partition. The parallel tasks of your partition can communicate to exchange data or synchronize execution.

Your SP may have an optional high-performance switch for communication. The switch increases the speed of communication between nodes. It supports a high volume of message passing with increased bandwidth and low latency. This helps your application program, as well as the Parallel ESSL subroutines, achieve maximum performance.

Parallel ESSL assumes that the application program is using the SPMD programming model, where the programs running the parallel tasks of your partition are identical. The tasks, however, work on different sets of data.
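As a minimal sketch of the SPMD idea, the following C fragment shows how identical code running in every task can select a different slice of the data from nothing more than the task's rank. The names task_range and spmd_range are hypothetical and are not part of Parallel ESSL. In a real program the rank and task count would come from the message passing layer (for example, MPI_Comm_rank and MPI_Comm_size); here they are plain arguments so that the fragment stands alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: the half-open range [lo, hi) of a global
   array that one parallel task works on. */
typedef struct {
    size_t lo;
    size_t hi;
} task_range;

/* Split n elements as evenly as possible over ntasks tasks.
   The first (n % ntasks) tasks receive one extra element each. */
task_range spmd_range(size_t n, int rank, int ntasks)
{
    assert(ntasks > 0 && rank >= 0 && rank < ntasks);

    size_t base  = n / (size_t)ntasks;   /* elements every task gets      */
    size_t extra = n % (size_t)ntasks;   /* tasks that get one more       */
    size_t r     = (size_t)rank;

    task_range tr;
    tr.lo = r * base + (r < extra ? r : extra);
    tr.hi = tr.lo + base + (r < extra ? 1 : 0);
    return tr;
}
```

Every task runs the same spmd_range call; only the rank argument differs, so the ranges returned to the tasks are disjoint and together cover the whole array.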

Coding Your Program

The application developer begins by creating a parallel program's source code, including calls to the Parallel ESSL subroutines. The developer might create this program from scratch and then place calls to BLACS, MPI, or MPL routines so that it can run as a number of parallel tasks. These calls enable the parallel processes of the partition to communicate data and coordinate their execution. As part of each parallel process, the Parallel ESSL subroutines also perform these types of functions.

Details on what other specific coding additions are required when using Parallel ESSL are given in Chapter 3, Coding and Running Your Program.

Distributing Your Data

Your global data structures (vectors, matrices, or sequences) must be distributed across your processes prior to calling the Parallel ESSL subroutines.

Because data is distributed for both input and output, no implicit bottleneck is created by an initial scatter or ending gather operation. Parallel ESSL works in true SPMD mode, where each process operates only on a portion of the data. Also, the input and output data may be too large to collectively reside on a single node; therefore, problems associated with the storage limitations of a single processor node are eased by performing the computation in actual SPMD fashion.
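To make the idea of distributed data concrete, the sketch below maps a global vector index to an owning process and a local index. It assumes a one-dimensional block-cyclic layout, in which blocks of nb elements are dealt to the processes in round-robin order starting with process 0. That layout is an assumption chosen for illustration; the layouts that Parallel ESSL actually expects are defined in Chapter 2. The function names bc_owner, bc_local_index, and bc_local_count are hypothetical.

```c
#include <assert.h>

/* Process (0-based) that owns global element g (0-based) when blocks of
   nb elements are dealt round-robin to nprocs processes. */
int bc_owner(int g, int nb, int nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);
    return (g / nb) % nprocs;
}

/* Position of global element g inside its owner's local array. */
int bc_local_index(int g, int nb, int nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);
    return (g / (nb * nprocs)) * nb + g % nb;
}

/* Number of elements of an n-element vector stored by process p. */
int bc_local_count(int n, int nb, int p, int nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0 && p >= 0 && p < nprocs);

    int nblocks = n / nb;                  /* number of full blocks        */
    int count   = (nblocks / nprocs) * nb; /* full rounds of dealing       */
    int extra   = nblocks % nprocs;        /* full blocks in the last round */

    if (p < extra)
        count += nb;                       /* one more full block          */
    else if (p == extra)
        count += n % nb;                   /* the trailing partial block   */
    return count;
}
```

With n = 10, nb = 2, and three processes, the blocks land on processes 0, 1, 2, 0, 1, so no single process ever holds the whole vector, which is the storage benefit the paragraph above describes.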

See Chapter 2, Distributing Your Data for details on distributing your data.

Running and Testing

After writing the parallel application program containing calls to the Parallel ESSL subroutines, the developer begins a cycle of modification and testing. The application program is run using the Parallel Operating Environment (POE). The POE includes a number of compiler scripts, environment variables, and command-line flags, which may be used to set up your PE execution environment. (For example, before you execute a program, you need to set the size of your partition, that is, the number of parallel tasks, by setting the appropriate environment variables or their command-line flags.) You can use all of these capabilities of POE with Parallel ESSL.

Tuning for Performance

Once the parallel program is debugged, the next step is to tune it for optimal performance. This step is important, because performance is the key reason for using the Parallel ESSL subroutines. To tune and analyze programs with calls to the Parallel ESSL subroutines, you may wish to use the tools provided by PE. For details, see the PE manuals listed in Parallel Environment.

Where to Find Information on PE

For further details on PE and its various capabilities, see the PE manuals listed in Parallel Environment. For more information about MPI, see references [38] and [46].

Accuracy of the Computations

Parallel ESSL provides accuracy comparable to libraries using equivalent algorithms with identical precision formats. The data types operated on are ANSI/IEEE 64-bit binary floating-point format and 32-bit integer. See the ANSI/IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Standard 754-1985 for more detail.

The Fortran Language Interface to the Parallel ESSL Subroutines

The Parallel ESSL subroutines follow standard Fortran calling conventions. When Parallel ESSL subroutines are called from a program in a language other than Fortran, such as C or C++, the Fortran conventions must be used. This applies to all aspects of the interface, such as the linkage conventions and the data conventions. For example, array ordering must be consistent with Fortran array ordering techniques. Data and linkage conventions for each language are given in the ESSL Version 3 Guide and Reference.
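The array-ordering requirement can be illustrated with a short C sketch. Fortran stores two-dimensional arrays column by column, whereas C stores them row by row, so a C caller generally has to lay out its data in column-major order before passing it to a Fortran-convention routine. The helper names colmajor_offset and to_colmajor are hypothetical and are not part of Parallel ESSL or ESSL; the authoritative conventions are those in the ESSL Version 3 Guide and Reference.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (i, j), both 0-based, in a column-major array whose
   leading dimension (rows allocated per column) is lda. */
size_t colmajor_offset(size_t i, size_t j, size_t lda)
{
    assert(i < lda);
    return i + j * lda;
}

/* Copy a row-major C matrix with m rows and n columns into column-major
   storage with leading dimension lda (lda >= m). */
void to_colmajor(const double *rowmajor, double *colmajor,
                 size_t m, size_t n, size_t lda)
{
    assert(lda >= m);
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < m; ++i)
            colmajor[colmajor_offset(i, j, lda)] = rowmajor[i * n + j];
}
```

For a 2-by-3 matrix whose rows are (1, 2, 3) and (4, 5, 6), the column-major copy holds 1, 4, 2, 5, 3, 6. Fortran conventions also mean that arguments are normally passed by reference, so scalar arguments are typically passed from C as pointers.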
