Engineering and Scientific Subroutine Library for AIX Version 3 Release 3: Guide and Reference

DSDCG--Sparse Positive Definite or Negative Definite Symmetric Matrix Iterative Solve Using Compressed-Diagonal Storage Mode

This subroutine solves a symmetric, positive definite or negative definite linear system, using the two-term conjugate gradient method, with or without preconditioning by an incomplete Cholesky factorization, for a sparse matrix stored in compressed-diagonal storage mode. Matrix A and vectors x and b are used:

Ax = b

where A, x, and b contain long-precision real numbers.

Syntax

Fortran	CALL DSDCG (`iopt`, `m`, `nd`, `ad`, `lda`, `la`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`)
C and C++	dsdcg (`iopt`, `m`, `nd`, `ad`, `lda`, `la`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);
PL/I	CALL DSDCG (`iopt`, `m`, `nd`, `ad`, `lda`, `la`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);

On Entry

iopt

indicates the type of storage used, where:

If iopt = 0, all the nonzero diagonals of the sparse matrix are stored in compressed-diagonal storage mode.

If iopt = 1, the sparse matrix, stored in compressed-diagonal storage mode, is symmetric. Only the main diagonal and one of each pair of identical diagonals are stored in array AD.

Specified as: a fullword integer; iopt = 0 or 1.

m

is the order of the linear system Ax = b and the number of rows in sparse matrix A. Specified as: a fullword integer; m >= 0.

nd

is the number of nonzero diagonals stored in the columns of array AD, the number of columns in the array AD, and the number of elements in array LA. Specified as: a fullword integer; it must have the following value, where:

If m > 0, then nd > 0.

If m = 0, then nd >= 0.

ad

is the array, referred to as AD, containing the values of the nonzero elements of the sparse matrix stored in compressed-diagonal storage mode. If iopt = 1, the main diagonal and one of each pair of identical diagonals is stored in this array.

Specified as: an lda by (at least) nd array, containing long-precision real numbers.

lda

is the leading dimension of the array specified for ad. Specified as: a fullword integer; lda > 0 and lda >= m.

la

is the array, referred to as LA, containing the diagonal numbers k for the diagonals stored in each corresponding column in array AD. For an explanation of how diagonal numbers are assigned, see Compressed-Diagonal Storage Mode.

Specified as: a one-dimensional array of (at least) length nd, containing fullword integers, where 1-m <= (elements of LA) <= m-1.

b

is the vector b of length m, containing the right-hand side of the matrix problem. Specified as: a one-dimensional array of (at least) length m, containing long-precision real numbers.

x

is the vector x of length m, containing your initial guess of the solution of the linear system. Specified as: a one-dimensional array of (at least) length m, containing long-precision real numbers. The elements can have any value, and if no guess is available, the value can be zero.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) controls the number of iterations.
If IPARM(1) > 0, IPARM(1) is the maximum number of iterations allowed.
If IPARM(1) = 0, the following default values are used:

IPARM(1) = 300
IPARM(2) = 1
IPARM(3) = 0
RPARM(1) = 10^-6
IPARM(2) is the flag used to select the stopping criterion.
If IPARM(2) = 0, the conjugate gradient iterative procedure is stopped when:

||r||₂ / ||x||₂ < epsilon

where r = b-Ax is the residual and epsilon is the desired relative accuracy. epsilon is stored in RPARM(1).
If IPARM(2) = 1, the conjugate gradient iterative procedure is stopped when:

||r||₂ / lambda||x||₂ < epsilon

where lambda is an estimate to the minimum eigenvalue of the iteration matrix. lambda is computed adaptively by this program and, on output, is stored in RPARM(2).
If IPARM(2) = 2, the conjugate gradient iterative procedure is stopped when:

||r||₂ / lambda||x||₂ < epsilon

where lambda is a predetermined estimate to the minimum eigenvalue of the iteration matrix. This eigenvalue estimate, on input, is stored in RPARM(2) and may be obtained by an earlier call to this subroutine with the same matrix.
IPARM(3) is the flag that determines whether the system is to be solved using the conjugate gradient method, preconditioned by an incomplete Cholesky factorization with no fill-in.
If IPARM(3) = 0, the system is not preconditioned.
If IPARM(3) = 10, the system is preconditioned by an incomplete Cholesky factorization.
If IPARM(3) = -10, the system is preconditioned by an incomplete Cholesky factorization, where the factorization matrix was computed in an earlier call to this subroutine and is stored in aux2.
IPARM(4), see On Return.

Specified as: an array of (at least) length 4, containing fullword integers, where:

IPARM(1) = 0

IPARM(2) = 0, 1, or 2

IPARM(3) = 0, 10, or -10

rparm

is an array of parameters, RPARM(i), where epsilon is stored in RPARM(1), and lambda is stored in RPARM(2).

RPARM(1) > 0, is the relative accuracy epsilon used in the stopping criterion.

RPARM(2) > 0, is the estimate of the smallest eigenvalue, lambda, of the iteration matrix. It is only used when IPARM(2) = 2.

RPARM(3), see On Return.

Specified as: a one-dimensional array of (at least) length 3, containing long-precision real numbers.

aux1

has the following meaning:

If naux1 = 0 and error 2015 is unrecoverable, aux1 is ignored.

Otherwise, it is a storage work area used by this subroutine, which is available for use by the calling program between calls to this subroutine. Its size is specified by naux1.

Specified as: an area of storage, containing long-precision real numbers.

naux1

is the size of the work area specified by aux1--that is, the number of elements in aux1.

Specified as: a fullword integer, where:

If naux = 0 and error 2015 is unrecoverable, DSDCG dynamically allocates the work area used by this subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, it must have at least the following value, where:

If IPARM(2) = 0 or 2, use naux1 >= 3m.

If IPARM(2) = 1 and IPARM(1) <> 0, use naux1 >= 3m+2(IPARM(1)).

If IPARM(2) = 1 and IPARM(1) = 0, use naux1 >= 3m+600.

aux2

is the storage work area used by this subroutine. If IPARM(3) = -10, aux2 must contain the incomplete Cholesky factorization of matrix A, computed in an earlier call to DSDCG. Its size is specified by naux2. Specified as: an area of storage, containing long-precision real numbers.

naux2

is the size of the work area specified by aux2--that is, the number of elements in aux2. Specified as: a fullword integer. When IPARM(3) = 10 or -10, naux2 must have at least the following value, where:

If iopt = 0, use naux2 >= m(1.5nd+2)1.5+2(m+6).

If iopt = 1, use naux2 >= m(3nd+2)+8.

On Return

x

is the vector x of length m, containing the solution of the system Ax = b. Returned as: a one-dimensional array, containing long-precision real numbers.

iparm

As an array of parameters, IPARM(i), where:

IPARM(1) is unchanged.

IPARM(2) is unchanged.

IPARM(3) is unchanged.

IPARM(4) contains the number of iterations performed by this subroutine.

Returned as: a one-dimensional array of length 4, containing fullword integers.

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) is unchanged.

RPARM(2) is unchanged if IPARM(2) = 0 or 2. If IPARM(2) = 1, RPARM(2) contains lambda, an estimate of the smallest eigenvalue of the iteration matrix.

RPARM(3) contains the estimate of the error of the solution. If the process converged, RPARM(3) <= epsilon.

Returned as: a one-dimensional array of length 3, containing long-precision real numbers; lambda > 0.

aux2

is the storage work area used by this subroutine.

If IPARM(3) = 10, aux2 contains the incomplete Cholesky factorization of matrix A.

If IPARM(3) = -10, aux2 is unchanged.

See Notes for additional information on aux2. Returned as: an area of storage, containing long-precision real numbers.

Notes

When IPARM(3) = -10, this subroutine uses the incomplete Cholesky factorization in aux2, computed in an earlier call to this subroutine. When IPARM(3) = 10, this subroutine computes the incomplete Cholesky factorization and stores it in aux2.
If you solve the same sparse linear system of equations several times with different right-hand sides using the preconditioned algorithm, specify IPARM(3) = 10 on the first invocation. The incomplete factorization is stored in aux2. You may save computing time on subsequent calls by setting IPARM(3) = -10. In this way, the algorithm reutilizes the incomplete factorization that was computed the first time. Therefore, you should not modify the contents of aux2 between calls.
Matrix A must have no common elements with vectors x and b; otherwise, results are unpredictable.
In the iterative solvers for sparse matrices, the relative accuracy epsilon (RPARM(1)) must be specified "reasonably" (10^-4 to 10^-8). The algorithm computes a sequence of approximate solution vectors x that converge to the solution. The iterative procedure is stopped when the norm of the residual is sufficiently small--that is, when:

||b-Ax||₂ / lambda||x||₂ < epsilon

where lambda is an estimate of the minimum eigenvalue of the iteration matrix, which is either estimated adaptively or given by the user. As a result, if you specify a larger epsilon, the algorithm takes fewer iterations to converge to a solution. If you specify a smaller epsilon, the algorithm requires more iterations and computer time, but converges to a more precise solution. If the value you specify is unreasonably small, the algorithm may fail to converge within the number of iterations it is allowed to perform.
For a description of how sparse matrices are stored in compressed-matrix storage mode, see Compressed-Matrix Storage Mode.
On output, array AD and vector b are not bitwise identical to what they were on input, because the matrix A and the right-hand side are scaled before starting the iterative process and are unscaled before returning control to the user. In addition, arrays AD and LA may be rearranged on output, but still contain a mathematically equivalent mapping of the elements in matrix A.
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see Using Auxiliary Storage in ESSL.

Function

The sparse positive definite or negative definite linear system:

Ax = b

is solved, where:

A is a symmetric, positive definite or negative definite sparse matrix of order m, stored in compressed-diagonal storage mode in arrays AD and LA.

x is a vector of length m.

b is a vector of length m.

The system is solved using the two-term conjugate gradient method, with or without preconditioning by an incomplete Cholesky factorization. In both cases, the matrix is scaled by the square root of the diagonal.

See references [62] and [68]. [36].

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux1 = 0, and unable to allocate work area.

Computational Errors

The following errors, with their corresponding return codes, can occur in this subroutine. Where a value of i is indicated, it can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for that particular error code in the ESSL error option table; otherwise, the default value causes your program to terminate when the error occurs. For details, see What Can You Do about ESSL Computational Errors?.

For error 2110, return code 1 indicates that the subroutine exceeded IPARM(1) iterations without converging. Vector x contains the approximate solution computed at the last iteration.
For error 2111, return code 2 indicates that aux2 contains an incorrect factorization. The subroutine has been called with IPARM(3) = -10, and aux2 contains an incomplete factorization of the input matrix A that was computed by a previous call to the subroutine when IPARM(3) = 10. This error indicates that aux2 has been modified since the last call to the subroutine, or that the input matrix is not the same as the one that was factored. If the default action has been overridden, the subroutine can be called again with the same parameters, with the exception of IPARM(3) = 0 or 10.
For error 2109, return code 3 indicates that the inner product (y,Ay) is negative in the iterative procedure after iteration i. This should not occur, because the input matrix is assumed to be positive or negative definite. Vector x contains the results of the last iteration. The value i is identified in the computational error message.
For error 2108, return code 4 indicates that the matrix is not positive definite. AC is partially modified and does not represent the same matrix as on entry.

Input-Argument Errors

iopt <> 0 or 1
m < 0
lda < 1
lda < m
nd < 0
nd = 0 and m > 0
|lambda(i)| > m-1 for i = 1, nd
IPARM(1) < 0
IPARM(2) <> 0, 1, or 2
IPARM(3) <> 0, 10, or -10
RPARM(1) < 0
RPARM(2) < 0
Error 2015 is recoverable or naux1<>0, and naux1 is too small--that is, less than the minimum required value. Return code 5 is returned if error 2015 is recoverable.
naux2 is too small--that is, less than the minimum required value. Return code 5 is returned if error 2015 is recoverable.

Example 1

This example finds the solution of the linear system Ax = b for sparse matrix A, which is stored in compressed-diagonal storage mode in arrays AD and LA. The system is solved using the two-term conjugate gradient method. In this example, IOPT = 0.. Matrix A is:

         *                                                      *
         |  2.0   0.0  -1.0   0.0   0.0   0.0   0.0   0.0   0.0 |
         |  0.0   2.0   0.0  -1.0   0.0   0.0   0.0   0.0   0.0 |
         | -1.0   0.0   2.0   0.0  -1.0   0.0   0.0   0.0   0.0 |
         |  0.0  -1.0   0.0   2.0   0.0  -1.0   0.0   0.0   0.0 |
         |  0.0   0.0  -1.0   0.0   2.0   0.0  -1.0   0.0   0.0 |
         |  0.0   0.0   0.0  -1.0   0.0   2.0   0.0  -1.0   0.0 |
         |  0.0   0.0   0.0   0.0  -1.0   0.0   2.0   0.0  -1.0 |
         |  0.0   0.0   0.0   0.0   0.0  -1.0   0.0   2.0   0.0 |
         |  0.0   0.0   0.0   0.0   0.0   0.0  -1.0   0.0   2.0 |
         *                                                      *

Call Statement and Input

           IOPT  M   ND  AD  LDA  LA   B   X   IPARM   RPARM   AUX1  NAUX1  AUX2  NAUX2
            |    |   |   |    |   |    |   |     |       |      |      |     |      |
CALL DSDCG( 0  , 9 , 3 , AD , 9 , LA , B , X , IPARM , RPARM , AUX1 , 283 , AUX2 ,  0  )

IPARM(1) =  20
IPARM(2) =  0
IPARM(3) =  0
RPARM(1) =  1.D-7

        *                 *
        | 2.0   0.0  -1.0 |
        | 2.0   0.0  -1.0 |
        | 2.0  -1.0  -1.0 |
        | 2.0  -1.0  -1.0 |
AD   =  | 2.0  -1.0  -1.0 |
        | 2.0  -1.0  -1.0 |
        | 2.0  -1.0  -1.0 |
        | 2.0  -1.0   0.0 |
        | 2.0  -1.0   0.0 |
        *                 *

LA       =  (0, -2, 2)
B        =  (1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0)
X        =  (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(4) =  5
RPARM(2) =  0
RPARM(3) =  0.46D-16

Example 2

This example finds the solution of the linear system Ax = b for the same sparse matrix A as in Example 1, which is stored in compressed-diagonal storage mode in arrays AD and LA. The system is solved using the two-term conjugate gradient method. In this example, IOPT = 1, indicating that the matrix is symmetric, and only the main diagonal and one of each pair of identical diagonals are stored in array AD.

Call Statement and Input

           IOPT  M   ND  AD  LDA  LA   B   X   IPARM   RPARM   AUX1  NAUX1  AUX2  NAUX2
            |    |   |   |    |   |    |   |     |       |      |      |     |      |
CALL DSDCG( 1  , 9 , 2 , AD , 9 , LA , B , X , IPARM , RPARM , AUX1 , 283 , AUX2 , 80  )

IPARM(1) =  20
IPARM(2) =  0
IPARM(3) =  10
RPARM(1) =  1.D-7

        *           *
        | 2.0   0.0 |
        | 2.0   0.0 |
        | 2.0  -1.0 |
        | 2.0  -1.0 |
AD   =  | 2.0  -1.0 |
        | 2.0  -1.0 |
        | 2.0  -1.0 |
        | 2.0  -1.0 |
        | 2.0  -1.0 |
        *           *

LA       =  (0, -2)
B        =  (1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0)
X        =  (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(4) =  1
RPARM(2) =  0
RPARM(3) =  0.89D-16

[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]