Engineering and Scientific Subroutine Library for AIX Version 3 Release 3: Guide and Reference

Selecting an ESSL Subroutine

Your choice of which ESSL subroutine to use is based mainly on the functional needs of your program. However, you have a choice of several variations of many of the subroutines. In addition, there are instances where certain subroutines cannot be used. This section describes these variations and limitations. See the answers to each question below that applies to you.

Which ESSL Library Do You Want to Use?

ESSL provides two run-time libraries:

The ESSL SMP Library provides thread-safe versions of the ESSL subroutines for use on all SMP processors. In addition, a subset of these subroutines is multithreaded; that is, these subroutines support the shared memory parallel processing programming model. For a list of the multithreaded subroutines in the ESSL SMP Library, see Table 21.
The ESSL Serial Library provides thread-safe versions of the ESSL subroutines for use on all processors. You may choose to use this library if you decide to develop your own multithreaded programs that call the thread-safe ESSL subroutines.

The number of threads you choose to use depends on the problem size, the specific subroutine being called, and the number of physical processors you are running on. To achieve optimal performance, experimentation is necessary; however, setting the number of threads equal to the number of online processors provides good performance in most cases. In a few cases, performance may increase if you choose the number of threads to be less than the number of online processors. For more information about thread concepts, see AIX General Programming Concepts: Writing and Debugging Programs.
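The experimentation described above can be organized as a small search. The following Python sketch is only an illustration and is not part of ESSL. The helper names `candidate_thread_counts` and `pick_thread_count` are hypothetical, and `measure` stands for a timing routine that you supply yourself (for example, one that runs your program with a given number of threads and returns the elapsed seconds). Note also that `os.cpu_count()` reports the processors visible to Python, which is assumed here to approximate the number of online processors.

```python
import os


def candidate_thread_counts(online_processors):
    """Return thread counts to try: the online processor count, then successive halves."""
    counts = []
    n = max(1, online_processors)
    while n >= 1:
        counts.append(n)
        if n == 1:
            break
        n //= 2
    return counts


def pick_thread_count(measure, candidates):
    """Return the candidate with the smallest measured run time.

    `measure` is a caller-supplied function (an assumption of this sketch)
    that takes a thread count and returns elapsed seconds.
    """
    return min(candidates, key=measure)


if __name__ == "__main__":
    online = os.cpu_count() or 1  # processors visible to Python on this system
    print(candidate_thread_counts(online))
```

The candidates start at the online processor count because that is the recommended starting point; the smaller values cover the few cases where fewer threads perform better.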

The ESSL Serial Library and the ESSL SMP Library support both 32-bit and 64-bit environment applications. For details, see Chapter 4, Coding Your Program and Chapter 5, Processing Your Program.

Table 21. Multithreaded ESSL SMP Subroutines

Subroutine Names

Vector-Scalar Linear Algebra Subprograms:
  SASUM, DASUM, SCASUM, DZASUM
  SAXPY, DAXPY, CAXPY, ZAXPY
  SCOPY, DCOPY, CCOPY, ZCOPY
  SDOT, DDOT, CDOTU, ZDOTU, CDOTC, ZDOTC
  SNDOT, DNDOT
  SNORM2, DNORM2, CNORM2, ZNORM2
  SROT, DROT, CROT, ZROT, CSROT, ZDROT
  SSCAL, DSCAL, CSCAL, ZSCAL, CSSCAL, ZDSCAL
  SSWAP, DSWAP, CSWAP, ZSWAP
  SVEA, DVEA, CVEA, ZVEA
  SVES, DVES, CVES, ZVES
  SVEM, DVEM, CVEM, ZVEM
  SYAX, DYAX, CYAX, ZYAX, CSYAX, ZDYAX
  SZAXPY, DZAXPY, CZAXPY, ZZAXPY

Matrix-Vector Linear Algebra Subprograms:
  SGEMV, DGEMV, CGEMV, ZGEMV
  SGER, DGER, CGERU, ZGERU, CGERC, ZGERC
  SSPMV, DSPMV, CHPMV, ZHPMV
  SSYMV, DSYMV, CHEMV, ZHEMV
  SSPR, DSPR, CHPR, ZHPR
  SSYR, DSYR, CHER, ZHER
  SSPR2, DSPR2, CHPR2, ZHPR2
  SSYR2, DSYR2, CHER2, ZHER2
  SGBMV^¢, DGBMV^¢, CGBMV^¢, ZGBMV^¢
  SSBMV^¢, DSBMV^¢, CHBMV^¢, ZHBMV^¢
  STRMV, DTRMV, CTRMV, ZTRMV
  STPMV, DTPMV, CTPMV, ZTPMV
  STBMV^¢, DTBMV^¢, CTBMV^¢, ZTBMV^¢

Matrix Operations:
  SGEADD, DGEADD, CGEADD, ZGEADD
  SGESUB, DGESUB, CGESUB, ZGESUB
  SGEMUL, DGEMUL, CGEMUL, ZGEMUL
  SGEMM, DGEMM, CGEMM, ZGEMM
  SSYMM, DSYMM, CSYMM, ZSYMM, CHEMM, ZHEMM
  STRMM, DTRMM, CTRMM, ZTRMM
  SSYRK, DSYRK, CSYRK, ZSYRK, CHERK, ZHERK
  SSYR2K, DSYR2K, CSYR2K, ZSYR2K, CHER2K, ZHER2K
  SGETMI, DGETMI, CGETMI, ZGETMI
  SGETMO, DGETMO, CGETMO, ZGETMO

Dense Linear Algebraic Equations:
  SGEF, DGEF, CGEF, ZGEF
  SGETRF, DGETRF, CGETRF, ZGETRF
  SPPF, DPPF, DPOF, DPOTRF
  SPPFCD, DPPFCD, DPOFCD*
  SPPICD, DPPICD, DPOICD, DPOTRI
  STRSV, DTRSV, CTRSV, ZTRSV
  STPSV, DTPSV, CTPSV, ZTPSV
  STRSM, DTRSM, CTRSM, ZTRSM
  STRI, DTRI, STRTRI, DTRTRI

Sparse Linear Algebraic Equations:
  DSRIS^&

Linear Least Squares:
  DGEQRF

Fourier Transforms:
  SCFT, DCFT
  SRCFT, DRCFT
  SCRFT, DCRFT
  SCFT2, DCFT2
  SRCFT2, DRCFT2
  SCRFT2, DCRFT2
  SCFT3, DCFT3
  SRCFT3, DRCFT3
  SCRFT3, DCRFT3

Convolution and Correlation:
  SCOND, SCORD
  SDCON, SDCOR, DDCON, DDCOR
Many of the dense linear algebraic equations and eigensystem analysis subroutines make one or more calls to the multithreaded versions of the matrix-vector linear algebra and matrix operation subroutines shown in this table. SCOSF, DCOSF, SSINF, and DSINF make one or more calls to the multithreaded versions of the Fourier Transform subroutines shown in this table. These subroutines benefit from the increased performance of the multithreaded versions of the ESSL SMP subroutines.

Your performance may be improved by setting the following environment variables:

  export MALLOCMULTIHEAP=true
  export XLSMPOPTS="spins=0:yields=0"

For additional information, see the AIX Performance Management Guide and the XLF manuals.

^& DSRIS only uses multiple threads when `IPARM(4)` = 1 or 2.
^¢ The Level 2 Banded BLAS use multiple threads only when the bandwidth is sufficiently large.
* Multiple threads are used for the factor or inverse computation.

What Type of Data Are You Processing in Your Program?

The version of the ESSL subroutine you select should agree with the data you are using. ESSL provides short- and long-precision versions of most of its subroutines, which process short- and long-precision data, respectively. In a few cases, it also provides an integer version, which processes integer data or returns only integer data. The subroutine names are distinguished by a one- or two-letter prefix based on the following letters:

S for short-precision real

D for long-precision real

C for short-precision complex

Z for long-precision complex

I for integer
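As an illustration of the naming convention, the following Python sketch builds a subroutine name from a base name and a data type. It is a hypothetical helper, not an ESSL facility: `PREFIX_MEANING` and `variant_name` are names invented for this example, and the sketch assumes a family that uses a single-letter prefix, such as SGEMM, DGEMM, CGEMM, and ZGEMM. It does not check that the resulting name actually exists in ESSL, and it does not handle two-letter prefixes.

```python
# Prefix letters and their meanings, as listed above.
PREFIX_MEANING = {
    "S": "short-precision real",
    "D": "long-precision real",
    "C": "short-precision complex",
    "Z": "long-precision complex",
    "I": "integer",
}


def variant_name(base_name, data_type):
    """Build a subroutine name from a base name and a data type description.

    Only meaningful for families that use a single-letter prefix
    (an assumption of this sketch).
    """
    for letter, meaning in PREFIX_MEANING.items():
        if meaning == data_type:
            return letter + base_name
    raise ValueError("unknown data type: " + data_type)


print(variant_name("GEMM", "long-precision real"))      # DGEMM
print(variant_name("GEMM", "short-precision complex"))  # CGEMM
```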

The precision of your data affects the accuracy of your results. This is discussed in Getting the Best Accuracy. For a description of these data types, see How Do You Set Up Your Scalar Data?

How Is Your Data Structured? And What Storage Technique Are You Using?

Some subroutines process specific data structures, such as sparse vectors and matrices or dense and banded matrices. In addition, these data structures can be stored using various storage techniques. You should select the proper subroutine on the basis of the type of data structure you have and the storage technique you want to use. If possible, you should use a storage technique that conserves storage and potentially improves performance. For more about storage techniques, see Setting Up Your Data.
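To show how a storage technique can conserve storage, the following Python sketch packs the upper triangle of a symmetric matrix into a one-dimensional array, which needs n(n+1)/2 elements instead of n*n. It assumes a column-by-column packing of the upper triangle; the helper names `pack_upper` and `packed_index` are hypothetical, and the storage modes that a given ESSL subroutine actually accepts are described in Setting Up Your Data.

```python
def pack_upper(a):
    """Store the upper triangle of a symmetric matrix column by column."""
    n = len(a)
    packed = []
    for j in range(n):
        for i in range(j + 1):
            packed.append(a[i][j])
    return packed


def packed_index(i, j):
    """Return the 0-based position of element (i, j) in the packed array."""
    if i > j:
        i, j = j, i  # symmetry: a[i][j] == a[j][i]
    return j * (j + 1) // 2 + i


a = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 5.0],
    [3.0, 5.0, 6.0],
]
ap = pack_upper(a)
print(ap)                      # [1.0, 2.0, 4.0, 3.0, 5.0, 6.0]
print(ap[packed_index(2, 1)])  # 5.0
```

For this 3 by 3 matrix, the packed array holds 6 elements rather than 9; the saving approaches one half as the order of the matrix grows.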

What about Performance and Accuracy?

ESSL provides variations among some of its subroutines. You should consider performance and accuracy when deciding which subroutine is the best to use. Study the "Function" section in each subroutine description. It helps you understand exactly what each subroutine does, and helps you determine which subroutine is best for you. For example, some subroutines perform multiple computations of a certain type. This might give you better performance than a subroutine that does each computation individually. In other cases, one subroutine may do scaling while another does not. If scaling is not necessary for your data, you get better performance by using the subroutine without scaling.
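The trade-off between scaling and no scaling can be illustrated with the Euclidean length of a vector. The following Python sketch is not ESSL's implementation; `norm2_unscaled` and `norm2_scaled` are hypothetical names chosen for this example. The unscaled version does less work per element but can overflow when the elements are very large, while the scaled version divides each element by the largest magnitude first, at the cost of an extra pass over the data.

```python
import math


def norm2_unscaled(x):
    """Euclidean length without scaling: one pass, but squares may overflow."""
    return math.sqrt(sum(v * v for v in x))


def norm2_scaled(x):
    """Euclidean length with scaling by the largest magnitude in x."""
    scale = max((abs(v) for v in x), default=0.0)
    if scale == 0.0:
        return 0.0
    total = 0.0
    for v in x:
        r = v / scale  # every ratio has magnitude at most 1
        total += r * r
    return scale * math.sqrt(total)


print(norm2_unscaled([3.0, 4.0]))      # 5.0
print(norm2_unscaled([3e200, 4e200]))  # inf, because the squares overflow
print(norm2_scaled([3e200, 4e200]))    # approximately 5e200
```

If your data is known to stay well within the floating-point range, the extra pass buys nothing, which is why the version without scaling is the faster choice in that case.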
