
Parallel Engineering and Scientific Subroutine Library for AIX Version 2 Release 3: Guide and Reference

PDPOTRF and PZPOTRF--Positive Definite Real Symmetric or Complex Hermitian Matrix Factorization

PDPOTRF uses Cholesky factorization to factor a positive definite real symmetric matrix A into one of the following forms:

A = LL^T if uplo = 'L'.
A = U^T U if uplo = 'U'.

PZPOTRF uses Cholesky factorization to factor a positive definite complex Hermitian matrix A into one of the following forms:

A = LL^H if uplo = 'L'.
A = U^H U if uplo = 'U'.

In the formulas above:

A represents the global positive definite real symmetric or complex Hermitian submatrix A(ia:ia+n-1, ja:ja+n-1) to be factored.
L is a lower triangular matrix.
U is an upper triangular matrix.

To solve the system of equations with any number of right-hand sides, follow the call to these subroutines with one or more calls to PDPOTRS or PZPOTRS, respectively.

If n = 0, no computation is performed and the subroutine returns after doing some parameter checking. See references [16], [18], [22], [36], and [37].
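
For reference, the lower triangular case can be written element by element. This is the standard textbook Cholesky recurrence, shown only to make the formulas above concrete; it is not a statement of how these subroutines are implemented internally. The lowercase symbols a_ij and l_ij (entries of the n × n submatrix A and of L) are notation introduced here.

```latex
l_{jj} = \sqrt{a_{jj} - \sum_{k=1}^{j-1} |l_{jk}|^2}, \qquad
l_{ij} = \frac{1}{l_{jj}} \left( a_{ij} - \sum_{k=1}^{j-1} l_{ik}\,\overline{l_{jk}} \right), \quad i > j
```

In the real symmetric case the conjugate has no effect. For uplo = 'U', the same recurrence yields U as the (conjugate) transpose of L.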

Table 66. Data Types

A                      | Subroutine
-----------------------|-----------
Long-precision real    | PDPOTRF
Long-precision complex | PZPOTRF

Syntax

Fortran CALL PDPOTRF | PZPOTRF (uplo, n, a, ia, ja, desc_a, info)
C and C++ pdpotrf | pzpotrf (uplo, n, a, ia, ja, desc_a, info);

On Entry

uplo
indicates whether the upper or lower triangular part of the global real symmetric or complex Hermitian submatrix A is referenced, where:

If uplo = 'U', the upper triangular part is referenced.

If uplo = 'L', the lower triangular part is referenced.

Scope: global

Specified as: a single character; uplo = 'U' or 'L'.

n
is the number of rows and columns in submatrix A used in the computation.

Scope: global

Specified as: a fullword integer; n >= 0.

a
is the local part of the global real symmetric or complex Hermitian matrix A, used in the system of equations. This identifies the first element of the local array A. This subroutine computes the location of the first element of the local subarray used, based on ia, ja, desc_a, p, q, myrow, and mycol; therefore, the leading LOCp(ia+n-1) by LOCq(ja+n-1) part of the local array A must contain the local pieces of the leading ia+n-1 by ja+n-1 part of the global matrix. Only the triangular part selected by uplo is referenced.

Scope: local

Specified as: an LLD_A by (at least) LOCq(N_A) array, containing numbers of the data type indicated in Table 66. Details about the square block-cyclic data distribution of global matrix A are stored in desc_a.

ia
is the row index of the global matrix A, identifying the first row of the submatrix A.

Scope: global

Specified as: a fullword integer; 1 <= ia <= M_A and ia+n-1 <= M_A.

ja
is the column index of the global matrix A, identifying the first column of the submatrix A.

Scope: global

Specified as: a fullword integer; 1 <= ja <= N_A and ja+n-1 <= N_A.

desc_a
is the array descriptor for global matrix A, described in the following table:
desc_a | Name    | Description                                                                                   | Limits                                                      | Scope
-------|---------|-----------------------------------------------------------------------------------------------|-------------------------------------------------------------|-------
1      | DTYPE_A | Descriptor type                                                                               | DTYPE_A = 1                                                 | Global
2      | CTXT_A  | BLACS context                                                                                 | Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP | Global
3      | M_A     | Number of rows in the global matrix                                                           | If n = 0: M_A >= 0; otherwise: M_A >= 1                     | Global
4      | N_A     | Number of columns in the global matrix                                                        | If n = 0: N_A >= 0; otherwise: N_A >= 1                     | Global
5      | MB_A    | Row block size                                                                                | MB_A >= 1                                                   | Global
6      | NB_A    | Column block size                                                                             | NB_A >= 1                                                   | Global
7      | RSRC_A  | The process row of the p × q grid over which the first row of the global matrix is distributed    | 0 <= RSRC_A < p                                             | Global
8      | CSRC_A  | The process column of the p × q grid over which the first column of the global matrix is distributed | 0 <= CSRC_A < q                                          | Global
9      | LLD_A   | The leading dimension of the local array                                                      | LLD_A >= max(1,LOCp(M_A))                                   | Local

Specified as: an array of (at least) length 9, containing fullword integers.
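
The limits in the descriptor table can be checked before the call. The following is a minimal sketch, not part of the library: `check_desc_limits`, the enum names, and the `loc_rows` parameter (standing for LOCp(M_A), which the caller must supply) are hypothetical names introduced here. CTXT_A is skipped because it cannot be validated without the BLACS.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based positions of the nine descriptor entries (the table is one-based). */
enum { DTYPE_ = 0, CTXT_, M_, N_, MB_, NB_, RSRC_, CSRC_, LLD_ };

/*
 * Hypothetical helper: returns 0 if desc_a satisfies the limits in the
 * descriptor table, otherwise the one-based number of the first entry
 * that violates them. p and q are the process grid dimensions.
 */
int check_desc_limits(const int desc_a[9], int n, int p, int q, int loc_rows)
{
    assert(desc_a != NULL && p >= 1 && q >= 1);
    int min_dim = (n == 0) ? 0 : 1;               /* M_A, N_A may be 0 only when n = 0 */
    int min_lld = (loc_rows > 1) ? loc_rows : 1;  /* max(1, LOCp(M_A)) */

    if (desc_a[DTYPE_] != 1) return 1;
    if (desc_a[M_] < min_dim) return 3;
    if (desc_a[N_] < min_dim) return 4;
    if (desc_a[MB_] < 1) return 5;
    if (desc_a[NB_] < 1) return 6;
    if (desc_a[RSRC_] < 0 || desc_a[RSRC_] >= p) return 7;
    if (desc_a[CSRC_] < 0 || desc_a[CSRC_] >= q) return 8;
    if (desc_a[LLD_] < min_lld) return 9;
    return 0;
}
```

A nonzero return value identifies the descriptor entry to inspect; the subroutine itself still performs its own checks as listed under Error Conditions.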

info
See On Return.

On Return

a
is the updated local part of the global matrix A, containing the results of the factorization.

Scope: local

Returned as: an LLD_A by (at least) LOCq(N_A) array, containing numbers of the data type indicated in Table 66.

info
has the following meaning:

If info = 0, global real symmetric or complex Hermitian submatrix A is positive definite, and the factorization completed normally.

If info > 0, the leading minor of order k of the global real symmetric or complex Hermitian submatrix A is not positive definite. info is set equal to k, where the leading minor was encountered at A(ia+k-1, ja+k-1). The factorization is not completed. A is overwritten with the partial factors.

Scope: global

Returned as: a fullword integer; info >= 0.

Notes and Coding Rules
  1. In your C program, argument info must be passed by reference.
  2. This subroutine accepts lowercase letters for the uplo argument.
  3. On input to PZPOTRF, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, they are set to zero.
  4. The scalar data specified for input argument n must be the same for both PDPOTRF/PZPOTRF and PDPOTRS/PZPOTRS.
  5. The global submatrix A input to PDPOTRS/PZPOTRS must be the same as for the corresponding output argument for PDPOTRF/PZPOTRF; and thus, the scalar data specified for ia, ja, and the contents of desc_a must also be the same.
  6. The NUMROC utility subroutine can be used to determine the values of LOCp(M_) and LOCq(N_) used in the argument descriptions above. For details, see Determining the Number of Rows and Columns in Your Local Arrays and NUMROC--Compute the Number of Rows or Columns of a Block-Cyclically Distributed Matrix Contained in a Process.
  7. The way these subroutines handle nonpositive definiteness differs from ScaLAPACK. Like ScaLAPACK, these subroutines use the info argument to provide information about the nonpositive definiteness of A, but they also provide an error message.
  8. On both input and output, matrix A conforms to ScaLAPACK format.
  9. The global real symmetric or complex Hermitian matrix A must be distributed using a square block-cyclic distribution; that is, MB_A = NB_A.
  10. The global real symmetric or complex Hermitian matrix A must be aligned on a block row boundary; that is, ia-1 must be a multiple of MB_A.
  11. The block row offset of A must be equal to the block column offset of A; that is, mod(ia-1, MB_A) = mod(ja-1, NB_A).
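
Rules 9 through 11 can be verified with a few integer comparisons before the call. This is a minimal sketch; the function name `submatrix_is_aligned` is hypothetical and not part of the library.

```c
#include <assert.h>

/*
 * Hypothetical helper: returns 1 if ia, ja, MB_A, and NB_A satisfy coding
 * rules 9 through 11, and 0 otherwise. ia and ja are one-based global indices.
 */
int submatrix_is_aligned(int ia, int ja, int mb_a, int nb_a)
{
    assert(ia >= 1 && ja >= 1 && mb_a >= 1 && nb_a >= 1);
    if (mb_a != nb_a) return 0;                          /* rule 9: square blocks */
    if ((ia - 1) % mb_a != 0) return 0;                  /* rule 10: block row boundary */
    if ((ia - 1) % mb_a != (ja - 1) % nb_a) return 0;    /* rule 11: equal offsets */
    return 1;
}
```

Taken together, rules 10 and 11 imply that ja-1 is also a multiple of NB_A, so the submatrix must start on a block boundary in both dimensions.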

Performance Considerations
  1. For suggested block sizes, see Coding Tips for Optimizing Parallel Performance.
  2. For optimal performance, you should use a square process grid to minimize the communication path length in both directions.
  3. For optimal performance, take the following items into consideration when choosing the NB_A (= MB_A) value:

Error Conditions

Computational Errors

Matrix A is not positive definite. For details, see the description of the info argument.

Resource Errors

Unable to allocate work space

Input-Argument and Miscellaneous Errors

Stage 1 

  1. DTYPE_A is invalid.

Stage 2 

  1. CTXT_A is invalid.

Stage 3 

  1. This subroutine was called from outside the process grid.

Stage 4 

  1. uplo <> 'U' or 'L'
  2. n < 0
  3. M_A < 0 and n = 0; M_A < 1 otherwise
  4. N_A < 0 and n = 0; N_A < 1 otherwise
  5. ia < 1
  6. ja < 1
  7. MB_A < 1
  8. NB_A < 1
  9. RSRC_A < 0 or RSRC_A >= p
  10. CSRC_A < 0 or CSRC_A >= q

Stage 5 

    If n <> 0:

  1. ia > M_A
  2. ja > N_A
  3. ia+n-1 > M_A
  4. ja+n-1 > N_A

    In all cases:

  5. MB_A <> NB_A
  6. mod(ia-1, MB_A) <> mod(ja-1, NB_A)
  7. mod(ia-1, MB_A) <> 0

Stage 6 

  1. LLD_A < max(1, LOCp(M_A))

    Each of the following global input arguments is checked to determine whether its value differs from the value specified on process P00:

  2. uplo differs.
  3. n differs.
  4. ia differs.
  5. ja differs.
  6. DTYPE_A differs.
  7. M_A differs.
  8. N_A differs.
  9. MB_A differs.
  10. NB_A differs.
  11. RSRC_A differs.
  12. CSRC_A differs.

Example 1

This example factors a 9 × 9 positive definite real symmetric matrix using a 2 × 2 process grid.

Call Statements and Input
ORDER = 'R'
NPROW = 2
NPCOL = 2
CALL BLACS_GET (0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
               UPLO   N    A  IA  JA    DESC_A   INFO
                |     |    |   |   |      |       |
CALL PDPOTRF(  'L'  , 9  , A , 1 , 1 ,  DESC_A , INFO )


Desc_A
DTYPE_ | 1
CTXT_  | icontxt (see note 1)
M_     | 9
N_     | 9
MB_    | 3
NB_    | 3
RSRC_  | 0
CSRC_  | 0
LLD_   | See below (see note 2)

Notes:

  1. icontxt is the output of the BLACS_GRIDINIT call.

  2. Each process should set the LLD_ as follows:
    LLD_A = MAX(1,NUMROC(M_A, MB_A, MYROW, RSRC_A, NPROW))
    

    In this example, LLD_A = 6 on P00 and P01, and LLD_A = 3 on P10 and P11.
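
The LLD_A values quoted above follow from the block-cyclic counting rule. The sketch below mirrors the arithmetic of the public ScaLAPACK NUMROC routine in C; it is an illustration under that assumption, not the library's own code, and the names `numroc_sketch` and `local_lld` are hypothetical.

```c
#include <assert.h>

/*
 * Number of rows (or columns) of a block-cyclically distributed dimension of
 * global length n, block size nb, that land on process iproc, when the first
 * block is held by process isrcproc out of nprocs processes.
 */
int numroc_sketch(int n, int nb, int iproc, int isrcproc, int nprocs)
{
    assert(n >= 0 && nb >= 1 && nprocs >= 1);
    assert(iproc >= 0 && iproc < nprocs && isrcproc >= 0 && isrcproc < nprocs);
    int mydist = (nprocs + iproc - isrcproc) % nprocs;  /* distance from source process */
    int nblocks = n / nb;                               /* number of whole blocks */
    int count = (nblocks / nprocs) * nb;                /* full rounds of the cycle */
    int extra = nblocks % nprocs;                       /* leftover whole blocks */
    if (mydist < extra)
        count += nb;                                    /* one extra whole block */
    else if (mydist == extra)
        count += n % nb;                                /* the trailing partial block */
    return count;
}

/* LLD_A = MAX(1, NUMROC(M_A, MB_A, MYROW, RSRC_A, NPROW)) */
int local_lld(int m_a, int mb_a, int myrow, int rsrc_a, int nprow)
{
    int rows = numroc_sketch(m_a, mb_a, myrow, rsrc_a, nprow);
    return rows > 1 ? rows : 1;
}
```

With M_A = 9, MB_A = 3, RSRC_A = 0, and NPROW = 2, process row 0 holds blocks 0 and 2 (6 rows) and process row 1 holds block 1 (3 rows), which matches the values in note 2.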

Global real symmetric matrix A of order 9 with block size 3 × 3:

B,D          0                  1                  2
     *                                                      *
     |  1.0   .    .   |    .    .    .   |    .    .    .  |
 0   |  1.0  2.0   .   |    .    .    .   |    .    .    .  |
     |  1.0  2.0  3.0  |    .    .    .   |    .    .    .  |
     | ----------------|------------------|---------------- |
     |  1.0  2.0  3.0  |   4.0   .    .   |    .    .    .  |
 1   |  1.0  2.0  3.0  |   4.0  5.0   .   |    .    .    .  |
     |  1.0  2.0  3.0  |   4.0  5.0  6.0  |    .    .    .  |
     | ----------------|------------------|---------------- |
     |  1.0  2.0  3.0  |   4.0  5.0  6.0  |   7.0   .    .  |
 2   |  1.0  2.0  3.0  |   4.0  5.0  6.0  |   7.0  8.0   .  |
     |  1.0  2.0  3.0  |   4.0  5.0  6.0  |   7.0  8.0  9.0 |
     *                                                      *

The following is the 2 × 2 process grid:

B,D  |   0 2   |   1 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11

Local arrays for A:

p,q  |               0                |        1
-----|--------------------------------|-----------------
     |  1.0   .    .    .    .    .   |    .    .    .
     |  1.0  2.0   .    .    .    .   |    .    .    .
     |  1.0  2.0  3.0   .    .    .   |    .    .    .
 0   |  1.0  2.0  3.0  7.0   .    .   |   4.0  5.0  6.0
     |  1.0  2.0  3.0  7.0  8.0   .   |   4.0  5.0  6.0
     |  1.0  2.0  3.0  7.0  8.0  9.0  |   4.0  5.0  6.0
-----|--------------------------------|-----------------
     |  1.0  2.0  3.0   .    .    .   |   4.0   .    .
 1   |  1.0  2.0  3.0   .    .    .   |   4.0  5.0   .
     |  1.0  2.0  3.0   .    .    .   |   4.0  5.0  6.0
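
The placement of global elements in the local arrays above follows the block-cyclic mapping. The sketch below shows that mapping for one dimension; the type `block_cyclic_pos` and the function `global_to_local` are hypothetical names introduced for illustration, and indices are one-based to match the argument descriptions.

```c
#include <assert.h>

/* Owning process coordinate and one-based local index along one dimension. */
typedef struct {
    int proc;
    int local;
} block_cyclic_pos;

/*
 * Maps one-based global index g to its owner and local index for block size
 * nb, with the first block on process isrcproc out of nprocs processes.
 */
block_cyclic_pos global_to_local(int g, int nb, int isrcproc, int nprocs)
{
    assert(g >= 1 && nb >= 1 && nprocs >= 1);
    assert(isrcproc >= 0 && isrcproc < nprocs);
    int g0 = g - 1;                 /* zero-based global index */
    int blk = g0 / nb;              /* global block number */
    block_cyclic_pos pos;
    pos.proc = (blk + isrcproc) % nprocs;
    pos.local = (blk / nprocs) * nb + g0 % nb + 1;
    return pos;
}
```

For this example (block size 3, two process rows and columns, source 0), global row 7 maps to process row 0, local row 4, and global column 8 maps to process column 0, local column 5. That is where the value 8.0 of global row 8 appears in the local array of P00.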

Output:

Global real symmetric matrix A of order 9 with block size 3 × 3:

B,D          0                  1                  2
     *                                                      *
     |  1.0   .    .   |    .    .    .   |    .    .    .  |
 0   |  1.0  1.0   .   |    .    .    .   |    .    .    .  |
     |  1.0  1.0  1.0  |    .    .    .   |    .    .    .  |
     | ----------------|------------------|---------------- |
     |  1.0  1.0  1.0  |   1.0   .    .   |    .    .    .  |
 1   |  1.0  1.0  1.0  |   1.0  1.0   .   |    .    .    .  |
     |  1.0  1.0  1.0  |   1.0  1.0  1.0  |    .    .    .  |
     | ----------------|------------------|---------------- |
     |  1.0  1.0  1.0  |   1.0  1.0  1.0  |   1.0   .    .  |
 2   |  1.0  1.0  1.0  |   1.0  1.0  1.0  |   1.0  1.0   .  |
     |  1.0  1.0  1.0  |   1.0  1.0  1.0  |   1.0  1.0  1.0 |
     *                                                      *

The following is the 2 × 2 process grid:

B,D  |   0 2   |   1 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11

Local arrays for A:

p,q  |               0                |        1
-----|--------------------------------|-----------------
     |  1.0   .    .    .    .    .   |    .    .    .
     |  1.0  1.0   .    .    .    .   |    .    .    .
     |  1.0  1.0  1.0   .    .    .   |    .    .    .
 0   |  1.0  1.0  1.0  1.0   .    .   |   1.0  1.0  1.0
     |  1.0  1.0  1.0  1.0  1.0   .   |   1.0  1.0  1.0
     |  1.0  1.0  1.0  1.0  1.0  1.0  |   1.0  1.0  1.0
-----|--------------------------------|-----------------
     |  1.0  1.0  1.0   .    .    .   |   1.0   .    .
 1   |  1.0  1.0  1.0   .    .    .   |   1.0  1.0   .
     |  1.0  1.0  1.0   .    .    .   |   1.0  1.0  1.0

The value of info is 0 on all processes.

Example 2

This example factors a 9 × 9 positive definite complex Hermitian matrix using a 2 × 2 process grid.

Call Statements and Input
ORDER = 'R'
NPROW = 2
NPCOL = 2
CALL BLACS_GET (0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
               UPLO   N    A  IA  JA    DESC_A   INFO
                |     |    |   |   |      |       |
CALL PZPOTRF(  'L'  , 9  , A , 1 , 1 ,  DESC_A , INFO )


Desc_A
DTYPE_ | 1
CTXT_  | icontxt (see note 1)
M_     | 9
N_     | 9
MB_    | 3
NB_    | 3
RSRC_  | 0
CSRC_  | 0
LLD_   | See below (see note 2)

Notes:

  1. icontxt is the output of the BLACS_GRIDINIT call.

  2. Each process should set the LLD_ as follows:
    LLD_A = MAX(1,NUMROC(M_A, MB_A, MYROW, RSRC_A, NPROW))
    

    In this example, LLD_A = 6 on P00 and P01, and LLD_A = 3 on P10 and P11.

Global complex Hermitian matrix A of order 9 with block size 3 × 3:


B,D                     0                                        1                                        2
     *                                                                                                                        *
     |  (18.0, 0.0)      .           .       |        .           .           .       |        .           .           .      |
 0   |   (1.0, 1.0) (18.0, 0.0)      .       |        .           .           .       |        .           .           .      |
     |   (1.0, 1.0)  (3.0, 1.0) (18.0, 0.0)  |        .           .           .       |        .           .           .      |
     | --------------------------------------|----------------------------------------|-------------------------------------- |
     |   (1.0, 1.0)  (3.0, 1.0)  (5.0, 1.0)  |   (18.0, 0.0)      .           .       |        .           .           .      |
 1   |   (1.0, 1.0)  (3.0, 1.0)  (5.0, 1.0)  |    (7.0, 1.0) (18.0, 0.0)      .       |        .           .           .      |
     |   (1.0, 1.0)  (3.0, 1.0)  (5.0, 1.0)  |    (7.0, 1.0)  (9.0, 1.0) (18.0, 0.0)  |        .           .           .      |
     | --------------------------------------|----------------------------------------|-------------------------------------- |
     |   (1.0, 1.0)  (3.0, 1.0)  (5.0, 1.0)  |    (7.0, 1.0)  (9.0, 1.0) (11.0, 1.0)  |   (18.0, 0.0)      .           .      |
 2   |   (1.0, 1.0)  (3.0, 1.0)  (5.0, 1.0)  |    (7.0, 1.0)  (9.0, 1.0) (11.0, 1.0)  |   (13.0, 1.0) (18.0, 0.0)      .      |
     |   (1.0, 1.0)  (3.0, 1.0)  (5.0, 1.0)  |    (7.0, 1.0)  (9.0, 1.0) (11.0, 1.0)  |   (13.0, 1.0) (15.0, 1.0) (18.0, 0.0) |
     *                                                                                                                        *
Note:
On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values.

The following is the 2 × 2 process grid:

B,D  |   0 2   |   1 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11

Local arrays for A:


p,q  |                                       0                                        |                    1
-----|--------------------------------------------------------------------------------|-----------------------------------------
     |  (18.0,  . )       .            .            .            .            .       |        .            .            .
     |   (1.0, 1.0)  (18.0,  . )       .            .            .            .       |        .            .            .
     |   (1.0, 1.0)   (3.0, 1.0)  (18.0,  . )       .            .            .       |        .            .            .
 0   |   (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)  (18.0,  . )       .            .       |    (7.0, 1.0)   (9.0, 1.0)  (11.0, 1.0)
     |   (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)  (13.0, 1.0)  (18.0,  . )       .       |    (7.0, 1.0)   (9.0, 1.0)  (11.0, 1.0)
     |   (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)  (13.0, 1.0)  (15.0, 1.0)  (18.0,  . )  |    (7.0, 1.0)   (9.0, 1.0)  (11.0, 1.0)
-----|--------------------------------------------------------------------------------|-----------------------------------------
     |   (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)       .            .            .       |   (18.0,  . )       .            .
 1   |   (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)       .            .            .       |    (7.0, 1.0)  (18.0,  . )       .
     |   (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)       .            .            .       |    (7.0, 1.0)   (9.0, 1.0)  (18.0,  . )

Output:

Global complex Hermitian matrix A of order 9 with block size 3 × 3:


B,D                       0                                        1                                      2
*                                                                                                                        *
     |    (4.2, 0.0)      .            .        |     .            .            .        |     .          .           .       |
 0   |  (0.24, 0.24)   (4.2, 0.0)      .        |     .            .            .        |     .          .           .       |
     |  (0.24, 0.24) (0.68, 0.24)   (4.2, 0.0)  |     .            .            .        |     .          .           .       |
     | -----------------------------------------|----------------------------------------|----------------------------------- |
     |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)  | (4.0, 0.0)       .            .        |     .          .           .       |
 1   |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)  | (1.3, 0.25)   (3.8, 0.0)      .        |     .          .           .       |
     |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)  | (1.3, 0.25)  (1.4, 0.26)   (3.5, 0.0)  |     .          .           .       |
     | -----------------------------------------|----------------------------------------|----------------------------------- |
     |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)  | (1.3, 0.25)  (1.4, 0.26)  (1.5, 0.28)  |  (3.2, 0.0)    .           .       |
 2   |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)  | (1.3, 0.25)  (1.4, 0.26)  (1.5, 0.28)  | (1.6, 0.32) (2.7, 0.0)     .       |
     |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)  | (1.3, 0.25)  (1.4, 0.26)  (1.5, 0.28)  | (1.6, 0.32) (1.6, 0.37) (2.2, 0.0) |
     *                                                                                                                   *
Note:
On output, the imaginary parts of the diagonal elements of the matrix are set to zero.

The following is the 2 × 2 process grid:

B,D  |   0 2   |   1 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11

Local arrays for A:


p,q  |                                       0                                         |                     1
-----|---------------------------------------------------------------------------------|------------------------------------------
     |    (4.2, 0.0)      .            .            .            .            .        |        .            .            .
     |  (0.24, 0.24)   (4.2, 0.0)      .            .            .            .        |        .            .            .
     |  (0.24, 0.24) (0.68, 0.24)   (4.2, 0.0)      .            .            .        |        .            .            .
 0   |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)   (3.2, 0.0)      .            .        |    (1.3, 0.25)  (1.4, 0.26)  (1.5, 0.28)
     |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)  (1.6, 0.32)   (2.7, 0.0)      .        |    (1.3, 0.25)  (1.4, 0.26)  (1.5, 0.28)
     |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)  (1.6, 0.32)  (1.6, 0.37)   (2.2, 0.0)  |    (1.3, 0.25)  (1.4, 0.26)  (1.5, 0.28)
-----|---------------------------------------------------------------------------------|------------------------------------------
     |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)      .            .            .        |     (4.0, 0.0)      .            .
 1   |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)      .            .            .        |    (1.3, 0.25)   (3.8, 0.0)      .
     |  (0.24, 0.24) (0.68, 0.24)  (1.1, 0.24)      .            .            .        |    (1.3, 0.25)  (1.4, 0.26)   (3.5, 0.0)

The value of info is 0 on all processes.

