IBM Books

Parallel Engineering and Scientific Subroutine Library for AIX Version 2 Release 3: Guide and Reference

PDGESV and PZGESV--General Matrix Factorization and Solve

These subroutines solve the following systems of equations for multiple right-hand sides:

AX = B

In the formula above:

A represents the global general submatrix A_{ia:ia+n-1, ja:ja+n-1}.
B represents the global general submatrix B_{ib:ib+n-1, jb:jb+nrhs-1} containing the right-hand sides in its columns.
X represents the global general submatrix B_{ib:ib+n-1, jb:jb+nrhs-1} containing the solution vectors in its columns.

If n = 0, no computation is performed and the subroutine returns after doing some parameter checking. See references [16], [18], [22], [36], and [37].

Table 58. Data Types

A, B                      ipvt       Subroutine
Long-precision real       Integer    PDGESV
Long-precision complex    Integer    PZGESV

Syntax

Fortran      CALL PDGESV | PZGESV (n, nrhs, a, ia, ja, desc_a, ipvt, b, ib, jb, desc_b, info)
C and C++    pdgesv | pzgesv (n, nrhs, a, ia, ja, desc_a, ipvt, b, ib, jb, desc_b, info);

On Entry

n
is the order of the submatrix A and the number of rows in submatrix B.

Scope: global

Specified as: a fullword integer; n >= 0.

nrhs
is the number of right-hand sides--that is, the number of columns in submatrix B used in the computation.

Scope: global

Specified as: a fullword integer; nrhs >= 0.

a
is the local part of the global general matrix A, used in the system of equations. This identifies the first element of the local array A. This subroutine computes the location of the first element of the local subarray used, based on ia, ja, desc_a, p, q, myrow, and mycol; therefore, the leading LOCp(ia+n-1) by LOCq(ja+n-1) part of the local array A must contain the local pieces of the leading ia+n-1 by ja+n-1 part of the global matrix.

Scope: local

Specified as: an LLD_A by (at least) LOCq(N_A) array, containing numbers of the data type indicated in Table 58. Details about the square block-cyclic data distribution of global matrix A are stored in desc_a.

ia
is the row index of the global matrix A, identifying the first row of the submatrix A.

Scope: global

Specified as: a fullword integer; 1 <= ia <= M_A and ia+n-1 <= M_A.

ja
is the column index of the global matrix A, identifying the first column of the submatrix A.

Scope: global

Specified as: a fullword integer; 1 <= ja <= N_A and ja+n-1 <= N_A.

desc_a
is the array descriptor for global matrix A, described in the following table:
desc_a  Name     Description                             Limits                           Scope
1       DTYPE_A  Descriptor type                         DTYPE_A=1                        Global
2       CTXT_A   BLACS context                           Valid value, as returned by      Global
                                                         BLACS_GRIDINIT or BLACS_GRIDMAP
3       M_A      Number of rows in the global matrix     If n = 0: M_A >= 0               Global
                                                         Otherwise: M_A >= 1
4       N_A      Number of columns in the global matrix  If n = 0: N_A >= 0               Global
                                                         Otherwise: N_A >= 1
5       MB_A     Row block size                          MB_A >= 1                        Global
6       NB_A     Column block size                       NB_A >= 1                        Global
7       RSRC_A   The process row of the p × q grid over  0 <= RSRC_A < p                  Global
                 which the first row of the global
                 matrix is distributed
8       CSRC_A   The process column of the p × q grid    0 <= CSRC_A < q                  Global
                 over which the first column of the
                 global matrix is distributed
9       LLD_A    The leading dimension of the local      LLD_A >= max(1,LOCp(M_A))        Local
                 array

Specified as: an array of (at least) length 9, containing fullword integers.
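
As a hedged illustration of the descriptor layout in the table above, the following Python sketch assembles the nine entries in table order and checks some of the listed limits. The helper name make_desc, its argument names, and the placeholder context value 0 are assumptions introduced for this sketch; they are not part of the library interface. The n-dependent limits on M_ and N_ and the full LLD_ limit are not enforced here.

```python
def make_desc(ctxt, m, n, mb, nb, rsrc, csrc, lld, p, q):
    """Build a 9-entry descriptor list in the order given in the table.

    Hypothetical helper for illustration; entry k of the table is element
    k-1 of the returned list because Python lists are 0-based.
    """
    if m < 0 or n < 0:
        raise ValueError("M_ and N_ must not be negative")
    if mb < 1 or nb < 1:
        raise ValueError("MB_ and NB_ must be at least 1")
    if not 0 <= rsrc < p:
        raise ValueError("RSRC_ must satisfy 0 <= RSRC_ < p")
    if not 0 <= csrc < q:
        raise ValueError("CSRC_ must satisfy 0 <= CSRC_ < q")
    if lld < 1:
        # The full limit is LLD_ >= max(1, LOCp(M_)); only the lower bound
        # of 1 is checked because LOCp is not computed in this sketch.
        raise ValueError("LLD_ must be at least 1")
    dtype = 1  # DTYPE_ = 1 per the table
    return [dtype, ctxt, m, n, mb, nb, rsrc, csrc, lld]


# A 9 x 9 matrix with 3 x 3 blocks on a 2 x 2 grid; ctxt=0 is a placeholder.
desc_a = make_desc(ctxt=0, m=9, n=9, mb=3, nb=3, rsrc=1, csrc=0, lld=6, p=2, q=2)
print(desc_a)  # prints [1, 0, 9, 9, 3, 3, 1, 0, 6]
```

In a real program the context entry would be the value returned by BLACS_GRIDINIT or BLACS_GRIDMAP, as the table states.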

ipvt
See On Return.

b
is the local part of the global general matrix B, containing the right-hand sides of the system. This identifies the first element of the local array B. This subroutine computes the location of the first element of the local subarray used, based on ib, jb, desc_b, p, q, myrow, and mycol; therefore, the leading LOCp(ib+n-1) by LOCq(jb+nrhs-1) part of the local array B must contain the local pieces of the leading ib+n-1 by jb+nrhs-1 part of the global matrix.

Scope: local

Specified as: an LLD_B by (at least) LOCq(N_B) array, containing numbers of the data type indicated in Table 58. Details about the block-cyclic data distribution of global matrix B are stored in desc_b.

ib
is the row index of the global matrix B, identifying the first row of the submatrix B.

Scope: global

Specified as: a fullword integer; 1 <= ib <= M_B and ib+n-1 <= M_B.

jb
is the column index of the global matrix B, identifying the first column of the submatrix B.

Scope: global

Specified as: a fullword integer; 1 <= jb <= N_B and jb+nrhs-1 <= N_B.

desc_b
is the array descriptor for global matrix B, described in the following table:
desc_b  Name     Description                             Limits                           Scope
1       DTYPE_B  Descriptor type                         DTYPE_B=1                        Global
2       CTXT_B   BLACS context                           Valid value, as returned by      Global
                                                         BLACS_GRIDINIT or BLACS_GRIDMAP
3       M_B      Number of rows in the global matrix     If n = 0 or nrhs = 0: M_B >= 0   Global
                                                         Otherwise: M_B >= 1
4       N_B      Number of columns in the global matrix  If n = 0 or nrhs = 0: N_B >= 0   Global
                                                         Otherwise: N_B >= 1
5       MB_B     Row block size                          MB_B >= 1                        Global
6       NB_B     Column block size                       NB_B >= 1                        Global
7       RSRC_B   The process row of the p × q grid over  0 <= RSRC_B < p                  Global
                 which the first row of the global
                 matrix is distributed
8       CSRC_B   The process column of the p × q grid    0 <= CSRC_B < q                  Global
                 over which the first column of the
                 global matrix is distributed
9       LLD_B    The leading dimension of the local      LLD_B >= max(1,LOCp(M_B))        Local
                 array

Specified as: an array of (at least) length 9, containing fullword integers.

info
See On Return.

On Return

a
is the updated local part of the global matrix A, containing the results of the factorization.

Scope: local

Returned as: an LLD_A by (at least) LOCq(N_A) array, containing numbers of the data type indicated in Table 58.

ipvt
is the local part of the global vector ipvt, containing the pivot indices. This identifies the first element of the local array IPVT. This subroutine computes the location of the first element of the local subarray used, based on ia, desc_a, p, and myrow; therefore, the leading LOCp(ia+n-1) part of the local array IPVT must contain the local pieces of the leading ia+n-1 part of the global vector.

A copy of the vector ipvt, with a block size of MB_A and global index ia, is returned to each column of the process grid. The process row over which the first row of ipvt is distributed is RSRC_A.

Scope: local

Returned as: an array of (at least) length LOCp(ia+n-1), containing fullword integers, where ia <= (pivoting indices) <= ia+n-1. Details about the block-cyclic data distribution of global vector ipvt are stored in desc_a.

b
is the updated local part of the global matrix B, containing the solution vectors.

Scope: local

Returned as: an LLD_B by (at least) LOCq(N_B) array, containing numbers of the data type indicated in Table 58.

info
has the following meaning:

If info = 0, global submatrix A is not singular, and the factorization and solve completed normally.

If info > 0, global submatrix A is singular; that is, one or more columns of L and the corresponding diagonal of U contain all zeros. All columns of L are checked. info is set equal to i, the first column of L whose corresponding diagonal element of U is zero, encountered at A_{ia+i-1, ja+i-1}. The factorization is completed; however, the solution submatrix B is not computed.

Scope: global

Returned as: a fullword integer; info >= 0.
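
The meaning of ipvt and info can be illustrated with a serial model. The Python sketch below is an assumption-laden illustration, not the library's parallel algorithm: it performs an LU factorization with partial pivoting in place, records 1-based pivot row indices, sets info to the first column whose pivot is exactly zero, and computes the solution only when info = 0. The function name gesv_serial and the list-of-lists storage are hypothetical.

```python
def gesv_serial(a, b):
    """Serial model of factor-and-solve; a and b are overwritten in place.

    Returns (ipvt, info) with 1-based pivot indices. When info > 0 the
    factorization still runs to completion but b is left untouched.
    """
    n = len(a)
    nrhs = len(b[0]) if n else 0
    ipvt = [0] * n
    info = 0
    for k in range(n):
        # Partial pivoting: largest magnitude on or below the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        ipvt[k] = p + 1
        if a[p][k] == 0:
            if info == 0:
                info = k + 1  # first zero diagonal element of U
            continue
        if p != k:
            a[k], a[p] = a[p], a[k]  # interchange full rows
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]  # multiplier, stored as an entry of L
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    if info != 0:
        return ipvt, info
    # Apply the recorded row interchanges to the right-hand sides.
    for k in range(n):
        p = ipvt[k] - 1
        if p != k:
            b[k], b[p] = b[p], b[k]
    for c in range(nrhs):
        for i in range(n):  # forward substitution with unit lower L
            for j in range(i):
                b[i][c] -= a[i][j] * b[j][c]
        for i in reversed(range(n)):  # back substitution with U
            for j in range(i + 1, n):
                b[i][c] -= a[i][j] * b[j][c]
            b[i][c] /= a[i][i]
    return ipvt, info


# A singular 2 x 2 matrix: the second diagonal element of U is zero.
ipvt, info = gesv_serial([[1.0, 2.0], [2.0, 4.0]], [[1.0], [2.0]])
print(info)  # prints 2
```

Because only abs and ordinary arithmetic are used, the same sketch accepts Python complex entries, which loosely corresponds to the long-precision complex case.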

Notes and Coding Rules
  1. In your C program, argument info must be passed by reference.
  2. If n > 0 and nrhs = 0, only the factorization is computed.
  3. The matrices and vector must have no common elements; otherwise, results are unpredictable.
  4. The way these subroutines handle singularity differs from ScaLAPACK. These subroutines use the info argument to provide information about the singularity of A, like ScaLAPACK, but also provide an error message.
  5. The NUMROC utility subroutine can be used to determine the values of LOCp(M_) and LOCq(N_) used in the argument descriptions above. For details, see Determining the Number of Rows and Columns in Your Local Arrays and NUMROC--Compute the Number of Rows or Columns of a Block-Cyclically Distributed Matrix Contained in a Process.
  6. For suggested block sizes, see Coding Tips for Optimizing Parallel Performance.
  7. On both input and output, matrices A and B conform to ScaLAPACK format.
  8. The following values must be equal: CTXT_A = CTXT_B.
  9. The global general matrix A must be distributed using a square block-cyclic distribution; that is, MB_A = NB_A.
  10. The following block sizes must be equal: MB_A = MB_B.
  11. The global general matrix A must be aligned on a block row boundary; that is, ia-1 must be a multiple of MB_A.
  12. The block row offset of A must be equal to the block column offset of A; that is, mod(ia-1, MB_A) = mod(ja-1, NB_A).
  13. The block row offset of A must be equal to the block row offset of B; that is, mod(ia-1, MB_A) = mod(ib-1, MB_B).
  14. In the process grid, the process row containing the first row of the submatrix A must also contain the first row of the submatrix B; that is, iarow = ibrow, where:
    iarow = mod((((ia-1)/MB_A)+RSRC_A), p)
    ibrow = mod((((ib-1)/MB_B)+RSRC_B), p)
  15. There is no array descriptor for ipvt. It is a column-distributed vector with block size MB_A, local arrays of dimension LOCp(ia+n-1) by 1, and global index ia. A copy of this vector exists on each column of the process grid, and the process row over which the first row of ipvt is distributed is RSRC_A.
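
Rules 8 through 14 above are simple integer relations that can be checked before the call. The following Python sketch is one way to do so, under stated assumptions: the function name check_alignment is hypothetical, and each descriptor is represented by a dict holding only the CTXT, MB, NB, and RSRC entries rather than the 9-element array the subroutine expects.

```python
def check_alignment(ia, ja, ib, a, b, p):
    """Return the numbers of coding rules 8 to 14 that are violated.

    a and b are dicts with keys "CTXT", "MB", "NB", "RSRC" (stand-ins for
    desc_a and desc_b); p is the number of process rows in the grid.
    """
    bad = []
    if a["CTXT"] != b["CTXT"]:
        bad.append(8)   # CTXT_A = CTXT_B
    if a["MB"] != a["NB"]:
        bad.append(9)   # square block-cyclic distribution of A
    if a["MB"] != b["MB"]:
        bad.append(10)  # MB_A = MB_B
    if (ia - 1) % a["MB"] != 0:
        bad.append(11)  # A aligned on a block row boundary
    if (ia - 1) % a["MB"] != (ja - 1) % a["NB"]:
        bad.append(12)  # row offset of A equals column offset of A
    if (ia - 1) % a["MB"] != (ib - 1) % b["MB"]:
        bad.append(13)  # row offset of A equals row offset of B
    # Integer division selects the block row; adding RSRC and reducing
    # modulo p gives the owning process row.
    iarow = ((ia - 1) // a["MB"] + a["RSRC"]) % p
    ibrow = ((ib - 1) // b["MB"] + b["RSRC"]) % p
    if iarow != ibrow:
        bad.append(14)  # first rows of A and B on the same process row
    return bad


da = {"CTXT": 0, "MB": 3, "NB": 3, "RSRC": 1}
db = {"CTXT": 0, "MB": 3, "NB": 2, "RSRC": 1}
print(check_alignment(1, 1, 1, da, db, 2))  # prints []
print(check_alignment(2, 1, 1, da, db, 2))  # prints [11, 12, 13]
```

An empty list means the arguments satisfy all seven rules; the second call shows how a misaligned ia trips three of them at once.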

Error Conditions

Computational Errors

Matrix A is a singular matrix. For details, see the description of the info argument.

Resource Errors

Unable to allocate work space

Input-Argument and Miscellaneous Errors

Stage 1 

  1. DTYPE_A is invalid.
  2. DTYPE_B is invalid.

Stage 2 

  1. CTXT_A is invalid.

Stage 3 

  1. This subroutine was called from outside the process grid.

Stage 4 

  1. n < 0
  2. nrhs < 0
  3. M_A < 0 and n = 0; M_A < 1 otherwise
  4. N_A < 0 and n = 0; N_A < 1 otherwise
  5. ia < 1
  6. ja < 1
  7. MB_A < 1
  8. NB_A < 1
  9. RSRC_A < 0 or RSRC_A >= p
  10. CSRC_A < 0 or CSRC_A >= q
  11. M_B < 0 and (n = 0 or nrhs = 0); M_B < 1 otherwise
  12. N_B < 0 and (n = 0 or nrhs = 0); N_B < 1 otherwise
  13. ib < 1
  14. jb < 1
  15. MB_B < 1
  16. NB_B < 1
  17. RSRC_B < 0 or RSRC_B >= p
  18. CSRC_B < 0 or CSRC_B >= q
  19. CTXT_A <> CTXT_B

Stage 5 

    If n <> 0:

  1. ia > M_A
  2. ja > N_A
  3. ia+n-1 > M_A
  4. ja+n-1 > N_A

    If n <> 0 and nrhs <> 0:

  5. ib > M_B
  6. jb > N_B
  7. ib+n-1 > M_B
  8. jb+nrhs-1 > N_B

    In all cases:

  9. MB_A <> NB_A
  10. mod(ia-1, MB_A) <> mod(ja-1, NB_A)
  11. MB_B <> MB_A
  12. mod(ia-1, MB_A) <> mod(ib-1, MB_B)
  13. mod(ia-1, MB_A) <> 0
  14. In the process grid, the process row containing the first row of the submatrix A does not contain the first row of the submatrix B; that is, iarow <> ibrow, where:
    iarow = mod((((ia-1)/MB_A)+RSRC_A), p)
    ibrow = mod((((ib-1)/MB_B)+RSRC_B), p)

Stage 6 

  1. LLD_A < max(1, LOCp(M_A))
  2. LLD_B < max(1, LOCp(M_B))

    Each of the following global input arguments is checked to determine whether its value differs from the value specified on process P00:

  3. n differs.
  4. nrhs differs.
  5. ia differs.
  6. ja differs.
  7. DTYPE_A differs.
  8. M_A differs.
  9. N_A differs.
  10. MB_A differs.
  11. NB_A differs.
  12. RSRC_A differs.
  13. CSRC_A differs.
  14. ib differs.
  15. jb differs.
  16. DTYPE_B differs.
  17. M_B differs.
  18. N_B differs.
  19. MB_B differs.
  20. NB_B differs.
  21. RSRC_B differs.
  22. CSRC_B differs.

Example 1

This example solves the real system AX = B where A is a 9 × 9 real general matrix and B contains 5 right-hand sides using a 2 × 2 process grid. By specifying RSRC_A = 1, the rows of global matrix A and the elements of global vector ipvt are distributed over the process grid starting in the second row of the process grid.

This example uses a global submatrix B within a global matrix B by specifying ib = 1 and jb = 2.

By specifying RSRC_B = 1, the rows of global matrix B are distributed over the process grid starting in the second row of the process grid. In addition, by specifying CSRC_B = 1, the columns of global matrix B are distributed over the process grid starting in the second column of the process grid.

Call Statements and Input


ORDER = 'R'
NPROW = 2
NPCOL = 2
CALL BLACS_GET (0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
             N  NRHS  A  IA  JA   DESC_A   IPVT   B  IB   JB  DESC_B   INFO
             |   |    |   |   |     |       |     |   |   |     |       |
CALL PDGESV (9 , 5  , A , 1 , 1 , DESC_A , IPVT , B , 1 , 2 , DESC_B , INFO)


          Desc_A           Desc_B
DTYPE_    1                1
CTXT_     icontxt (1)      icontxt (1)
M_        9                9
N_        9                6
MB_       3                3
NB_       3                2
RSRC_     1                1
CSRC_     0                1
LLD_      See below (2)    See below (2)

Notes:

  1. icontxt is the output of the BLACS_GRIDINIT call.

  2. Each process should set the LLD_ as follows:
    LLD_A = MAX(1,NUMROC(M_A, MB_A, MYROW, RSRC_A, NPROW))
    LLD_B = MAX(1,NUMROC(M_B, MB_B, MYROW, RSRC_B, NPROW))
    

    In this example, LLD_A = LLD_B = 3 on P00 and P01, and LLD_A = LLD_B = 6 on P10 and P11.
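
The LLD_ values quoted in the note can be reproduced with a small model of the NUMROC arithmetic. The Python function below is a re-implementation written for illustration only, not the library utility itself; its argument order follows the NUMROC calls shown in the note (global extent, block size, process coordinate, source process coordinate, number of processes).

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows or columns of a block-cyclic matrix held by one process."""
    # Distance of this process from the one holding the first block.
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb                      # number of whole blocks
    count = (nblocks // nprocs) * nb       # blocks every process receives
    extrablks = nblocks % nprocs           # leftover whole blocks
    if mydist < extrablks:
        count += nb                        # one extra whole block
    elif mydist == extrablks:
        count += n % nb                    # the trailing partial block, if any
    return count


# M_A = 9, MB_A = 3, RSRC_A = 1, NPROW = 2, as in this example.
for myrow in (0, 1):
    print(myrow, max(1, numroc(9, 3, myrow, 1, 2)))
# prints:
# 0 3
# 1 6
```

Process row 0 (P00 and P01) holds one block row and process row 1 (P10 and P11) holds two, which matches the values 3 and 6 stated above.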

Global general 9 × 9 matrix A with block size 3 × 3:

B,D          0                  1                  2
     *                                                      *
     |  1.0  1.2  1.4  |   1.6  1.8  2.0  |   2.2  2.4  2.6 |
 0   |  1.2  1.0  1.2  |   1.4  1.6  1.8  |   2.0  2.2  2.4 |
     |  1.4  1.2  1.0  |   1.2  1.4  1.6  |   1.8  2.0  2.2 |
     | ----------------|------------------|---------------- |
     |  1.6  1.4  1.2  |   1.0  1.2  1.4  |   1.6  1.8  2.0 |
 1   |  1.8  1.6  1.4  |   1.2  1.0  1.2  |   1.4  1.6  1.8 |
     |  2.0  1.8  1.6  |   1.4  1.2  1.0  |   1.2  1.4  1.6 |
     | ----------------|------------------|---------------- |
     |  2.2  2.0  1.8  |   1.6  1.4  1.2  |   1.0  1.2  1.4 |
 2   |  2.4  2.2  2.0  |   1.8  1.6  1.4  |   1.2  1.0  1.2 |
     |  2.6  2.4  2.2  |   2.0  1.8  1.6  |   1.4  1.2  1.0 |
     *                                                      *

The following is the 2 × 2 process grid:

B,D  |   0 2   |   1 
-----| ------- |-----
1    |   P00   |  P01
-----| ------- |-----
0    |   P10   |  P11
2    |         |                                  
Note:
The first row of A begins in the second row of the process grid.

Local arrays for A:

p,q  |               0                |        1
-----|--------------------------------|-----------------
     |  1.6  1.4  1.2  1.6  1.8  2.0  |   1.0  1.2  1.4
 0   |  1.8  1.6  1.4  1.4  1.6  1.8  |   1.2  1.0  1.2
     |  2.0  1.8  1.6  1.2  1.4  1.6  |   1.4  1.2  1.0
-----|--------------------------------|-----------------
     |  1.0  1.2  1.4  2.2  2.4  2.6  |   1.6  1.8  2.0
     |  1.2  1.0  1.2  2.0  2.2  2.4  |   1.4  1.6  1.8
     |  1.4  1.2  1.0  1.8  2.0  2.2  |   1.2  1.4  1.6
 1   |  2.2  2.0  1.8  1.0  1.2  1.4  |   1.6  1.4  1.2
     |  2.4  2.2  2.0  1.2  1.0  1.2  |   1.8  1.6  1.4
     |  2.6  2.4  2.2  1.4  1.2  1.0  |   2.0  1.8  1.6
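
The local arrays follow from the block-cyclic mapping: block row r of the global matrix is held by process row mod(r + RSRC_A, p), and block column c by process column mod(c + CSRC_A, q). The Python sketch below, an illustration only and not a library interface, extracts the local array for one process. The function name local_part and the 0-based indexing are assumptions of the sketch; the formula 1.0 + 0.2*|i-j| is simply an observation about the entries of the global matrix A printed in this example.

```python
def local_part(glob, mb, nb, myrow, mycol, rsrc, csrc, nprow, npcol):
    """Return the rows and columns of glob owned by process (myrow, mycol)."""
    # Global row i (0-based) lies in block row i // mb, which is owned by
    # process row (i // mb + rsrc) % nprow; columns are handled the same way.
    rows = [i for i in range(len(glob)) if (i // mb + rsrc) % nprow == myrow]
    cols = [j for j in range(len(glob[0])) if (j // nb + csrc) % npcol == mycol]
    return [[glob[i][j] for j in cols] for i in rows]


# Global matrix A of this example: a(i,j) = 1.0 + 0.2*|i-j| (observed pattern).
A = [[(10 + 2 * abs(i - j)) / 10 for j in range(9)] for i in range(9)]

# Local array on P00 with MB_A = NB_A = 3, RSRC_A = 1, CSRC_A = 0, 2 x 2 grid.
p00 = local_part(A, 3, 3, 0, 0, 1, 0, 2, 2)
print(p00[0])  # prints [1.6, 1.4, 1.2, 1.6, 1.8, 2.0]
```

P00 receives global rows 4 through 6 and global columns 1 through 3 and 7 through 9, so its local array is 3 by 6.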

After the global matrix B is distributed over the process grid, only a portion of the global data structure is used--that is, global submatrix B. Following is the global 9 × 5 submatrix B, starting at row 1 and column 2 in global general 9 × 6 matrix B with block size 3 × 2:

B,D         0                1                 2
     *                                                 *
     |    .   93.0  |   186.0  279.0  |   372.0  465.0 |
 0   |    .   84.4  |   168.8  253.2  |   337.6  422.0 |
     |    .   76.6  |   153.2  229.8  |   306.4  383.0 |
     | -------------|-----------------|--------------- |
     |    .   70.0  |   140.0  210.0  |   280.0  350.0 |
 1   |    .   65.0  |   130.0  195.0  |   260.0  325.0 |
     |    .   62.0  |   124.0  186.0  |   248.0  310.0 |
     | -------------|-----------------|--------------- |
     |    .   61.4  |   122.8  184.2  |   245.6  307.0 |
 2   |    .   63.6  |   127.2  190.8  |   254.4  318.0 |
     |    .   69.0  |   138.0  207.0  |   276.0  345.0 |
     *                                                 *

The following is the 2 × 2 process grid:

B,D  |    1    | 0 2 
-----| ------- |-----
1    |   P00   |  P01
-----| ------- |-----
0    |   P10   |  P11
2    |         |
Note:
The first row of B begins in the second row of the process grid, and the first column of B begins in the second column of the process grid.

Local arrays for B:

p,q  |       0        |              1
-----|----------------|----------------------------
     |  140.0  210.0  |     .   70.0  280.0  350.0
 0   |  130.0  195.0  |     .   65.0  260.0  325.0
     |  124.0  186.0  |     .   62.0  248.0  310.0
-----|----------------|----------------------------
     |  186.0  279.0  |     .   93.0  372.0  465.0
     |  168.8  253.2  |     .   84.4  337.6  422.0
     |  153.2  229.8  |     .   76.6  306.4  383.0
 1   |  122.8  184.2  |     .   61.4  245.6  307.0
     |  127.2  190.8  |     .   63.6  254.4  318.0
     |  138.0  207.0  |     .   69.0  276.0  345.0

Output:

Global general 9 × 9 transformed matrix A with block size 3 × 3:

B,D          0                  1                  2
     *                                                      *
     |  2.6  2.4  2.2  |   2.0  1.8  1.6  |   1.4  1.2  1.0 |
 0   |  0.4  0.3  0.6  |   0.8  1.1  1.4  |   1.7  1.9  2.2 |
     |  0.5 -0.4  0.4  |   0.8  1.2  1.6  |   2.0  2.4  2.8 |
     | ----------------|------------------|---------------- |
     |  0.5 -0.3  0.0  |   0.4  0.8  1.2  |   1.6  2.0  2.4 |
 1   |  0.6 -0.3  0.0  |   0.0  0.4  0.8  |   1.2  1.6  2.0 |
     |  0.7 -0.2  0.0  |   0.0  0.0  0.4  |   0.8  1.2  1.6 |
     | ----------------|------------------|---------------- |
     |  0.8 -0.2  0.0  |   0.0  0.0  0.0  |   0.4  0.8  1.2 |
 2   |  0.8 -0.1  0.0  |   0.0  0.0  0.0  |   0.0  0.4  0.8 |
     |  0.9 -0.1  0.0  |   0.0  0.0  0.0  |   0.0  0.0  0.4 |
     *                                                      *

The following is the 2 × 2 process grid:

B,D  |   0 2   |  1  
-----| ------- |-----
1    |   P00   |  P01
-----| ------- |-----
0    |   P10   |  P11
2    |         |
Note:
The first row of A begins in the second row of the process grid.

Local arrays for A:

p,q  |               0                |        1
-----|--------------------------------|-----------------
     |  0.5 -0.3  0.0  1.6  2.0  2.4  |   0.4  0.8  1.2
 0   |  0.6 -0.3  0.0  1.2  1.6  2.0  |   0.0  0.4  0.8
     |  0.7 -0.2  0.0  0.8  1.2  1.6  |   0.0  0.0  0.4
-----|--------------------------------|-----------------
     |  2.6  2.4  2.2  1.4  1.2  1.0  |   2.0  1.8  1.6
     |  0.4  0.3  0.6  1.7  1.9  2.2  |   0.8  1.1  1.4
     |  0.5 -0.4  0.4  2.0  2.4  2.8  |   0.8  1.2  1.6
 1   |  0.8 -0.2  0.0  0.4  0.8  1.2  |   0.0  0.0  0.0
     |  0.8 -0.1  0.0  0.0  0.4  0.8  |   0.0  0.0  0.0
     |  0.9 -0.1  0.0  0.0  0.0  0.4  |   0.0  0.0  0.0

Global vector ipvt of length 9 with block size 3:

B,D    0
     *    *
     |  9 |
 0   |  9 |
     |  9 |
     | -- |
     |  9 |
 1   |  9 |
     |  9 |
     | -- |
     |  9 |
 2   |  9 |
     |  9 |
     *    *

Note:
A copy of ipvt is distributed across each column of the process grid.

The following is the 2 × 2 process grid:

B,D  |         |     
-----| ------- |-----
1    |   P00   |  P01
-----| ------- |-----
0    |   P10   |  P11
2    |         |
Note:
The first row of ipvt begins in the second row of the process grid.

Local arrays for ipvt:

p,q  |  0  |   1
-----|-----|-----
     |  9  |   9
 0   |  9  |   9
     |  9  |   9
-----|-----|-----
     |  9  |   9
     |  9  |   9
     |  9  |   9
 1   |  9  |   9
     |  9  |   9
     |  9  |   9

After the global matrix B is distributed over the process grid, only a portion of the global data structure is used--that is, global submatrix B. Following is the global 9 × 5 submatrix B, starting at row 1 and column 2 in global general 9 × 6 matrix B with block size 3 × 2:

B,D        0              1               2
     *                                           *
     |   .   1.0  |    2.0   3.0  |    4.0   5.0 |
 0   |   .   2.0  |    4.0   6.0  |    8.0  10.0 |
     |   .   3.0  |    6.0   9.0  |   12.0  15.0 |
     | -----------|---------------|------------- |
     |   .   4.0  |    8.0  12.0  |   16.0  20.0 |
 1   |   .   5.0  |   10.0  15.0  |   20.0  25.0 |
     |   .   6.0  |   12.0  18.0  |   24.0  30.0 |
     | -----------|---------------|------------- |
     |   .   7.0  |   14.0  21.0  |   28.0  35.0 |
 2   |   .   8.0  |   16.0  24.0  |   32.0  40.0 |
     |   .   9.0  |   18.0  27.0  |   36.0  45.0 |
     *                                           *

The following is the 2 × 2 process grid:

B,D  |    1    | 0 2 
-----| ------- |-----
1    |   P00   |  P01
-----| ------- |-----
0    |   P10   |  P11
2    |         |
Note:
The first row of B begins in the second row of the process grid, and the first column of B begins in the second column of the process grid.

Local arrays for B:

p,q  |      0       |            1
-----|--------------|------------------------
     |   8.0  12.0  |    .   4.0  16.0  20.0
 0   |  10.0  15.0  |    .   5.0  20.0  25.0
     |  12.0  18.0  |    .   6.0  24.0  30.0
-----|--------------|------------------------
     |   2.0   3.0  |    .   1.0   4.0   5.0
     |   4.0   6.0  |    .   2.0   8.0  10.0
     |   6.0   9.0  |    .   3.0  12.0  15.0
 1   |  14.0  21.0  |    .   7.0  28.0  35.0
     |  16.0  24.0  |    .   8.0  32.0  40.0
     |  18.0  27.0  |    .   9.0  36.0  45.0

The value of info is 0 on all processes.
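
The printed solution can be checked against the printed input by forming the residual AX - B. The Python sketch below does this for the first right-hand side of this example. It is an illustrative check, not library code: the function name max_residual is hypothetical, and the formula 1.0 + 0.2*|i-j| is an observation about the entries of the input matrix A printed above.

```python
n = 9
# Input matrix A of this example: a(i,j) = 1.0 + 0.2*|i-j| (observed pattern).
A = [[(10 + 2 * abs(i - j)) / 10 for j in range(n)] for i in range(n)]

# Global column 2 of B on input (first right-hand side) and on output.
b1 = [93.0, 84.4, 76.6, 70.0, 65.0, 62.0, 61.4, 63.6, 69.0]
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]


def max_residual(a, x, b):
    """Largest absolute entry of a*x - b."""
    return max(
        abs(sum(a[i][j] * x[j] for j in range(len(x))) - b[i])
        for i in range(len(b))
    )


print(max_residual(A, x1, b1) < 1e-9)  # prints True
```

The same check applies to the other four columns, each of which is a multiple of the first in both the input and the output.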

Example 2

This example solves the complex system AX = B where A is a 9 × 9 complex general matrix and B contains 5 right-hand sides using a 2 × 2 process grid. By specifying RSRC_A = 1, the rows of global matrix A and the elements of global vector ipvt are distributed over the process grid starting in the second row of the process grid.

This example uses a global submatrix B within a global matrix B by specifying ib = 1 and jb = 2.

By specifying RSRC_B = 1, the rows of global matrix B are distributed over the process grid starting in the second row of the process grid. In addition, by specifying CSRC_B = 1, the columns of global matrix B are distributed over the process grid starting in the second column of the process grid.

Call Statements and Input


ORDER = 'R'
NPROW = 2
NPCOL = 2
CALL BLACS_GET (0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
             N  NRHS  A  IA  JA   DESC_A   IPVT   B  IB  JB   DESC_B   INFO
             |   |    |   |   |     |       |     |   |   |     |       |
CALL PZGESV (9 , 5  , A , 1 , 1 , DESC_A , IPVT , B , 1 , 2 , DESC_B , INFO)


          Desc_A           Desc_B
DTYPE_    1                1
CTXT_     icontxt (1)      icontxt (1)
M_        9                9
N_        9                6
MB_       3                3
NB_       3                2
RSRC_     1                1
CSRC_     0                1
LLD_      See below (2)    See below (2)

Notes:

  1. icontxt is the output of the BLACS_GRIDINIT call.

  2. Each process should set the LLD_ as follows:
    LLD_A = MAX(1,NUMROC(M_A, MB_A, MYROW, RSRC_A, NPROW))
    LLD_B = MAX(1,NUMROC(M_B, MB_B, MYROW, RSRC_B, NPROW))
    

    In this example, LLD_A = LLD_B = 3 on P00 and P01, and LLD_A = LLD_B = 6 on P10 and P11.

Global general 9 × 9 matrix A with block size 3 × 3:


B,D                     0                                       1                                       2
     *                                                                                                                     *
     |  (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0)  |   (3.2,-1.0)  (3.6,-1.0)  (4.0,-1.0)  |   (4.4,-1.0)  (4.8,-1.0)  (5.2,-1.0) |
 0   |  (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0)  |   (2.8,-1.0)  (3.2,-1.0)  (3.6,-1.0)  |   (4.0,-1.0)  (4.4,-1.0)  (4.8,-1.0) |
     |  (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)  |   (2.4,-1.0)  (2.8,-1.0)  (3.2,-1.0)  |   (3.6,-1.0)  (4.0,-1.0)  (4.4,-1.0) |
     | -------------------------------------|---------------------------------------|------------------------------------- |
     |  (3.2, 1.0)  (2.8, 1.0)  (2.4, 1.0)  |   (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0)  |   (3.2,-1.0)  (3.6,-1.0)  (4.0,-1.0) |
 1   |  (3.6, 1.0)  (3.2, 1.0)  (2.8, 1.0)  |   (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0)  |   (2.8,-1.0)  (3.2,-1.0)  (3.6,-1.0) |
     |  (4.0, 1.0)  (3.6, 1.0)  (3.2, 1.0)  |   (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)  |   (2.4,-1.0)  (2.8,-1.0)  (3.2,-1.0) |
     | -------------------------------------|---------------------------------------|------------------------------------- |
     |  (4.4, 1.0)  (4.0, 1.0)  (3.6, 1.0)  |   (3.2, 1.0)  (2.8, 1.0)  (2.4, 1.0)  |   (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0) |
 2   |  (4.8, 1.0)  (4.4, 1.0)  (4.0, 1.0)  |   (3.6, 1.0)  (3.2, 1.0)  (2.8, 1.0)  |   (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0) |
     |  (5.2, 1.0)  (4.8, 1.0)  (4.4, 1.0)  |   (4.0, 1.0)  (3.6, 1.0)  (3.2, 1.0)  |   (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0) |
     *                                                                                                                     *

The following is the 2 × 2 process grid:

B,D  |   0 2   |   1 
-----| ------- |-----
1    |   P00   |  P01
-----| ------- |-----
0    |   P10   |  P11
2    |         |
Note:
The first row of A begins in the second row of the process grid.

Local arrays for A:


p,q  |                                    0                                     |                   1
-----|--------------------------------------------------------------------------|--------------------------------------
     |  (3.2, 1.0)  (2.8, 1.0)  (2.4, 1.0)  (3.2,-1.0)  (3.6,-1.0)  (4.0,-1.0)  |   (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0)
 0   |  (3.6, 1.0)  (3.2, 1.0)  (2.8, 1.0)  (2.8,-1.0)  (3.2,-1.0)  (3.6,-1.0)  |   (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0)
     |  (4.0, 1.0)  (3.6, 1.0)  (3.2, 1.0)  (2.4,-1.0)  (2.8,-1.0)  (3.2,-1.0)  |   (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)
-----|--------------------------------------------------------------------------|--------------------------------------
     |  (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0)  (4.4,-1.0)  (4.8,-1.0)  (5.2,-1.0)  |   (3.2,-1.0)  (3.6,-1.0)  (4.0,-1.0)
     |  (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0)  (4.0,-1.0)  (4.4,-1.0)  (4.8,-1.0)  |   (2.8,-1.0)  (3.2,-1.0)  (3.6,-1.0)
     |  (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)  (3.6,-1.0)  (4.0,-1.0)  (4.4,-1.0)  |   (2.4,-1.0)  (2.8,-1.0)  (3.2,-1.0)
 1   |  (4.4, 1.0)  (4.0, 1.0)  (3.6, 1.0)  (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0)  |   (3.2, 1.0)  (2.8, 1.0)  (2.4, 1.0)
     |  (4.8, 1.0)  (4.4, 1.0)  (4.0, 1.0)  (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0)  |   (3.6, 1.0)  (3.2, 1.0)  (2.8, 1.0)
     |  (5.2, 1.0)  (4.8, 1.0)  (4.4, 1.0)  (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)  |   (4.0, 1.0)  (3.6, 1.0)  (3.2, 1.0)

After the global matrix B is distributed over the process grid, only a portion of the global data structure is used--that is, global submatrix B. Following is the global 9 × 5 submatrix B, starting at row 1 and column 2 in global general 9 × 6 matrix B with block size 3 × 2:


B,D             0                             1                                 2
     *                                                                                          *
     |    .   (193.0,-10.6)  |   (200.0, 21.8)  (207.0, 54.2)  |   (214.0, 86.6)  (221.0,119.0) |
 0   |    .   (173.8, -9.4)  |   (178.8, 20.2)  (183.8, 49.8)  |   (188.8, 79.4)  (193.8,109.0) |
     |    .   (156.2, -5.4)  |   (159.2, 22.2)  (162.2, 49.8)  |   (165.2, 77.4)  (168.2,105.0) |
     | ----------------------|---------------------------------|------------------------------- |
     |    .   (141.0,  1.4)  |   (142.0, 27.8)  (143.0, 54.2)  |   (144.0, 80.6)  (145.0,107.0) |
 1   |    .   (129.0, 11.0)  |   (128.0, 37.0)  (127.0, 63.0)  |   (126.0, 89.0)  (125.0,115.0) |
     |    .   (121.0, 23.4)  |   (118.0, 49.8)  (115.0, 76.2)  |   (112.0,102.6)  (109.0,129.0) |
     | ----------------------|---------------------------------|------------------------------- |
     |    .   (117.8, 38.6)  |   (112.8, 66.2)  (107.8, 93.8)  |   (102.8,121.4)   (97.8,149.0) |
 2   |    .   (120.2, 56.6)  |   (113.2, 86.2)  (106.2,115.8)  |    (99.2,145.4)   (92.2,175.0) |
     |    .   (129.0, 77.4)  |   (120.0,109.8)  (111.0,142.2)  |   (102.0,174.6)   (93.0,207.0) |
     *                                                                                          *

The following is the 2 × 2 process grid:

B,D  |    1    | 0 2 
-----| ------- |-----
1    |   P00   |  P01
-----| ------- |-----
0    |   P10   |  P11
2    |         |
Note:
The first row of B begins in the second row of the process grid, and the first column of B begins in the second column of the process grid.

Local arrays for B:


p,q  |               0                |                          1
-----|--------------------------------|-----------------------------------------------------
     |  (142.0, 27.8)  (143.0, 54.2)  |     .   (141.0,  1.4)  (144.0, 80.6)  (145.0,107.0)
 0   |  (128.0, 37.0)  (127.0, 63.0)  |     .   (129.0, 11.0)  (126.0, 89.0)  (125.0,115.0)
     |  (118.0, 49.8)  (115.0, 76.2)  |     .   (121.0, 23.4)  (112.0,102.6)  (109.0,129.0)
-----|--------------------------------|-----------------------------------------------------
     |  (200.0, 21.8)  (207.0, 54.2)  |     .   (193.0,-10.6)  (214.0, 86.6)  (221.0,119.0)
     |  (178.8, 20.2)  (183.8, 49.8)  |     .   (173.8, -9.4)  (188.8, 79.4)  (193.8,109.0)
     |  (159.2, 22.2)  (162.2, 49.8)  |     .   (156.2, -5.4)  (165.2, 77.4)  (168.2,105.0)
 1   |  (112.8, 66.2)  (107.8, 93.8)  |     .   (117.8, 38.6)  (102.8,121.4)   (97.8,149.0)
     |  (113.2, 86.2)  (106.2,115.8)  |     .   (120.2, 56.6)   (99.2,145.4)   (92.2,175.0)
     |  (120.0,109.8)  (111.0,142.2)  |     .   (129.0, 77.4)  (102.0,174.6)   (93.0,207.0)

Output:

Global general 9 × 9 transformed matrix A with block size 3 × 3:


B,D                     0                                         1                                        2
     *                                                                                                                        *
     |  (5.2, 1.0)  (4.8, 1.0)   (4.4, 1.0)  |    (4.0, 1.0)   (3.6, 1.0)  (3.2, 1.0)  |   (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0) |
 0   |  (0.4, 0.1)  (0.6,-2.0)   (1.1,-1.9)  |    (1.7,-1.9)   (2.3,-1.8)  (2.8,-1.8)  |   (3.4,-1.7)  (3.9,-1.7)  (4.5,-1.6) |
     |  (0.5, 0.1)  (0.0,-0.1)   (0.6,-1.9)  |    (1.2,-1.8)   (1.8,-1.7)  (2.5,-1.6)  |   (3.1,-1.5)  (3.7,-1.4)  (4.3,-1.3) |
     | --------------------------------------|-----------------------------------------|------------------------------------- |
     |  (0.6, 0.1)  (0.0,-0.1)  (-0.1,-0.1)  |    (0.7,-1.9)   (1.3,-1.7)  (2.0,-1.6)  |   (2.7,-1.5)  (3.4,-1.4)  (4.0,-1.2) |
 1   |  (0.6, 0.1)  (0.0,-0.1)  (-0.1,-0.1)  |   (-0.1, 0.0)   (0.7,-1.9)  (1.5,-1.7)  |   (2.2,-1.6)  (2.9,-1.5)  (3.7,-1.3) |
     |  (0.7, 0.1)  (0.0,-0.1)   (0.0, 0.0)  |   (-0.1, 0.0)  (-0.1, 0.0)  (0.8,-1.9)  |   (1.6,-1.8)  (2.4,-1.6)  (3.2,-1.5) |
     | --------------------------------------|-----------------------------------------|------------------------------------- |
     |  (0.8, 0.0)  (0.0, 0.0)   (0.0, 0.0)  |    (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0)  |   (0.8,-1.9)  (1.7,-1.8)  (2.5,-1.8) |
 2   |  (0.9, 0.0)  (0.0, 0.0)   (0.0, 0.0)  |    (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0)  |   (0.0, 0.0)  (0.8,-2.0)  (1.7,-1.9) |
     |  (0.9, 0.0)  (0.0, 0.0)   (0.0, 0.0)  |    (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0)  |   (0.0, 0.0)  (0.0, 0.0)  (0.8,-2.0) |
     *                                                                                                                        *

The following is the 2 × 2 process grid:

B,D  |   0 2   |   1 
-----| ------- |-----
1    |   P00   |  P01
-----| ------- |-----
0    |   P10   |  P11
2    |         |
Note:
The first row of A begins in the second row of the process grid.

Local arrays for A:


p,q  |                                    0                                      |                    1
-----|---------------------------------------------------------------------------|----------------------------------------
     |  (0.6, 0.1)  (0.0,-0.1)  (-0.1,-0.1)  (2.7,-1.5)  (3.4,-1.4)  (4.0,-1.2)  |    (0.7,-1.9)   (1.3,-1.7)  (2.0,-1.6)
 0   |  (0.6, 0.1)  (0.0,-0.1)  (-0.1,-0.1)  (2.2,-1.6)  (2.9,-1.5)  (3.7,-1.3)  |   (-0.1, 0.0)   (0.7,-1.9)  (1.5,-1.7)
     |  (0.7, 0.1)  (0.0,-0.1)   (0.0, 0.0)  (1.6,-1.8)  (2.4,-1.6)  (3.2,-1.5)  |   (-0.1, 0.0)  (-0.1, 0.0)  (0.8,-1.9)
-----|---------------------------------------------------------------------------|----------------------------------------
     |  (5.2, 1.0)  (4.8, 1.0)   (4.4, 1.0)  (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)  |    (4.0, 1.0)   (3.6, 1.0)  (3.2, 1.0)
     |  (0.4, 0.1)  (0.6,-2.0)   (1.1,-1.9)  (3.4,-1.7)  (3.9,-1.7)  (4.5,-1.6)  |    (1.7,-1.9)   (2.3,-1.8)  (2.8,-1.8)
     |  (0.5, 0.1)  (0.0,-0.1)   (0.6,-1.9)  (3.1,-1.5)  (3.7,-1.4)  (4.3,-1.3)  |    (1.2,-1.8)   (1.8,-1.7)  (2.5,-1.6)
 1   |  (0.8, 0.0)  (0.0, 0.0)   (0.0, 0.0)  (0.8,-1.9)  (1.7,-1.8)  (2.5,-1.8)  |    (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0)
     |  (0.9, 0.0)  (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0)  (0.8,-2.0)  (1.7,-1.9)  |    (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0)
     |  (0.9, 0.0)  (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0)  (0.0, 0.0)  (0.8,-2.0)  |    (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0)

Global vector ipvt of length 9 with block size 3:

B,D     0
     *    *
     |  9 |
 0   |  9 |
     |  9 |
     | -- |
     |  9 |
 1   |  9 |
     |  9 |
     | -- |
     |  9 |
 2   |  9 |
     |  9 |
     *    *

Note:
A copy of ipvt is distributed across each column of the process grid.

The following is the 2 × 2 process grid:

B,D  |   0 2   |   1 
-----| ------- |-----
1    |   P00   |  P01
-----| ------- |-----
0    |   P10   |  P11
2    |         |
Note:
The first row of ipvt begins in the second row of the process grid.

Local arrays for ipvt:

p,q  |  0  |   1
-----|-----|-----
     |  9  |   9
 0   |  9  |   9
     |  9  |   9
-----|-----|-----
     |  9  |   9
     |  9  |   9
     |  9  |   9
 1   |  9  |   9
     |  9  |   9
     |  9  |   9

After the global matrix B is distributed over the process grid, only a portion of the global data structure is used--that is, global submatrix B. Following is the global 9 × 5 submatrix B, starting at row 1 and column 2 in global general 9 × 6 matrix B with block size 3 × 2:


B,D           0                        1                           2
     *                                                                          *
     |   .   (1.0, 1.0)  |   (1.0, 2.0)  (1.0, 3.0)  |   (1.0, 4.0)  (1.0, 5.0) |
 0   |   .   (2.0, 1.0)  |   (2.0, 2.0)  (2.0, 3.0)  |   (2.0, 4.0)  (2.0, 5.0) |
     |   .   (3.0, 1.0)  |   (3.0, 2.0)  (3.0, 3.0)  |   (3.0, 4.0)  (3.0, 5.0) |
     | ------------------|---------------------------|------------------------- |
     |   .   (4.0, 1.0)  |   (4.0, 2.0)  (4.0, 3.0)  |   (4.0, 4.0)  (4.0, 5.0) |
 1   |   .   (5.0, 1.0)  |   (5.0, 2.0)  (5.0, 3.0)  |   (5.0, 4.0)  (5.0, 5.0) |
     |   .   (6.0, 1.0)  |   (6.0, 2.0)  (6.0, 3.0)  |   (6.0, 4.0)  (6.0, 5.0) |
     | ------------------|---------------------------|------------------------- |
     |   .   (7.0, 1.0)  |   (7.0, 2.0)  (7.0, 3.0)  |   (7.0, 4.0)  (7.0, 5.0) |
 2   |   .   (8.0, 1.0)  |   (8.0, 2.0)  (8.0, 3.0)  |   (8.0, 4.0)  (8.0, 5.0) |
     |   .   (9.0, 1.0)  |   (9.0, 2.0)  (9.0, 3.0)  |   (9.0, 4.0)  (9.0, 5.0) |
     *                                                                          *

The following is the 2 × 2 process grid:

B,D  |    1    | 0 2 
-----| ------- |-----
1    |   P00   |  P01
-----| ------- |-----
0    |   P10   |  P11
2    |         |
Note:
The first row of B begins in the second row of the process grid, and the first column of B begins in the second column of the process grid.

Local arrays for B:


p,q  |            0             |                     1
-----|--------------------------|-------------------------------------------
     |  (4.0, 2.0)  (4.0, 3.0)  |    .   (4.0, 1.0)  (4.0, 4.0)  (4.0, 5.0)
 0   |  (5.0, 2.0)  (5.0, 3.0)  |    .   (5.0, 1.0)  (5.0, 4.0)  (5.0, 5.0)
     |  (6.0, 2.0)  (6.0, 3.0)  |    .   (6.0, 1.0)  (6.0, 4.0)  (6.0, 5.0)
-----|--------------------------|-------------------------------------------
     |  (1.0, 2.0)  (1.0, 3.0)  |    .   (1.0, 1.0)  (1.0, 4.0)  (1.0, 5.0)
     |  (2.0, 2.0)  (2.0, 3.0)  |    .   (2.0, 1.0)  (2.0, 4.0)  (2.0, 5.0)
     |  (3.0, 2.0)  (3.0, 3.0)  |    .   (3.0, 1.0)  (3.0, 4.0)  (3.0, 5.0)
 1   |  (7.0, 2.0)  (7.0, 3.0)  |    .   (7.0, 1.0)  (7.0, 4.0)  (7.0, 5.0)
     |  (8.0, 2.0)  (8.0, 3.0)  |    .   (8.0, 1.0)  (8.0, 4.0)  (8.0, 5.0)
     |  (9.0, 2.0)  (9.0, 3.0)  |    .   (9.0, 1.0)  (9.0, 4.0)  (9.0, 5.0)

The value of info is 0 on all processes.

