Parallel Engineering and Scientific Subroutine Library for AIX Version 2 Release 3: Guide and Reference

PDSYMM, PZSYMM, and PZHEMM--Matrix-Matrix Product Where One Matrix is Real or Complex Symmetric or Complex Hermitian

These subroutines compute one of the following matrix-matrix products:

1. C <-- alpha*A*B + beta*C
2. C <-- alpha*B*A + beta*C

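where, in the formulas above:

A represents the global submatrix Aia:ia+numa-1, ja:ja+numa-1, which is real symmetric for PDSYMM, complex symmetric for PZSYMM, and complex Hermitian for PZHEMM.
B represents the global general submatrix Bib:ib+m-1, jb:jb+n-1.
C represents the global general submatrix Cic:ic+m-1, jc:jc+n-1.
alpha and beta are scalars.

and:

If side = 'L', numa = m, so A is of order m.
If side = 'R', numa = n, so A is of order n.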

In the following two cases, no computation is performed and the subroutine returns after doing some parameter checking:

See references [14] and [15].
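The two products can be illustrated with a serial reference routine. The following is a minimal sketch, not the Parallel ESSL implementation: it assumes undistributed, column-major arrays with 0-based indexing, and the names symm_ref and sym_at are hypothetical. It reads A only from the triangle named by uplo and does not read C on input when beta is zero.

```c
#include <assert.h>
#include <stddef.h>

/* Element (i, j) of symmetric A, read only from the triangle named by uplo.
   Indices are 0-based; storage is column-major with leading dimension lda. */
static double sym_at(char uplo, const double *a, int lda, int i, int j)
{
    int upper = (uplo == 'U' || uplo == 'u');
    if ((upper && i > j) || (!upper && i < j)) {
        int t = i; i = j; j = t;      /* mirror into the stored triangle */
    }
    return a[(size_t)j * lda + i];
}

/* Serial reference (hypothetical helper):
   side 'L': C <-- alpha*A*B + beta*C, A of order m
   side 'R': C <-- alpha*B*A + beta*C, A of order n
   B and C are m by n. */
void symm_ref(char side, char uplo, int m, int n, double alpha,
              const double *a, int lda, const double *b, int ldb,
              double beta, double *c, int ldc)
{
    int left = (side == 'L' || side == 'l');
    int numa = left ? m : n;          /* order of A */
    assert(m >= 0 && n >= 0);
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int k = 0; k < numa; k++) {
                sum += left
                    ? sym_at(uplo, a, lda, i, k) * b[(size_t)j * ldb + k]
                    : b[(size_t)k * ldb + i] * sym_at(uplo, a, lda, k, j);
            }
            double *cij = &c[(size_t)j * ldc + i];
            /* when beta is zero, the input value of C is never read */
            *cij = (beta == 0.0) ? alpha * sum : alpha * sum + beta * (*cij);
        }
    }
}
```

A routine of this kind is useful mainly for checking small distributed results after gathering them onto one process; it makes no attempt at performance.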

Table 48. Data Types

alpha, beta, A, B, C    | Subprogram
------------------------|------------------
Long-precision real     | PDSYMM
Long-precision complex  | PZSYMM and PZHEMM

Syntax

Fortran CALL PDSYMM | PZSYMM | PZHEMM (side, uplo, m, n, alpha, a, ia, ja, desc_a, b, ib, jb, desc_b, beta, c, ic, jc, desc_c)
C and C++ pdsymm | pzsymm | pzhemm (side, uplo, m, n, alpha, a, ia, ja, desc_a, b, ib, jb, desc_b, beta, c, ic, jc, desc_c);

On Entry

side
indicates whether A is located to the left or right of B in the equation used for this computation, where:

If side = 'L', A is to the left of B, resulting in equation 1.

If side = 'R', A is to the right of B, resulting in equation 2.

Scope: global

Specified as: a single character; side = 'L' or 'R'.

uplo
indicates whether the upper or lower triangular part of the global submatrix A is referenced, where:

If uplo = 'U', the upper triangular part is referenced.

If uplo = 'L', the lower triangular part is referenced.

Scope: global

Specified as: a single character; uplo = 'U' or 'L'.

m
is the number of rows in submatrices B and C used in the computation, and:

If side = 'L', it is the number of rows and columns in submatrix A used in the computation.

Scope: global

Specified as: a fullword integer; m >= 0.

n
is the number of columns in submatrices B and C used in the computation, and:

If side = 'R', it is the number of rows and columns in submatrix A used in the computation.

Scope: global

Specified as: a fullword integer; n >= 0.

alpha
is the scalar alpha.

Scope: global

Specified as: a number of the data type indicated in Table 48.

a
is the local part of the global real symmetric, complex symmetric, or complex Hermitian matrix A. This identifies the first element of the local array A. This subroutine computes the location of the first element of the local subarray used, based on ia, ja, desc_a, p, q, myrow, and mycol; therefore, assuming the following:

If side = 'L', numa = m

If side = 'R', numa = n

the leading LOCp(ia+numa-1) by LOCq(ja+numa-1) part of the local array A must contain the local pieces of the leading ia+numa-1 by ja+numa-1 part of the global matrix. Only the triangular part of the submatrix A selected by uplo is referenced.

Scope: local

Specified as: an LLD_A by (at least) LOCq(N_A) array, containing numbers of the data type indicated in Table 48. Details about the block-cyclic data distribution of global matrix A are stored in desc_a.

ia
is the row index of the global matrix A, identifying the first row of the submatrix A.

Scope: global

Specified as: a fullword integer; 1 <= ia <= M_A and ia+numa-1 <= M_A.

ja
is the column index of the global matrix A, identifying the first column of the submatrix A.

Scope: global

Specified as: a fullword integer; 1 <= ja <= N_A and ja+numa-1 <= N_A.

desc_a
is the array descriptor for global matrix A, described in the following table:
desc_a | Name    | Description | Limits | Scope
-------|---------|-------------|--------|-------
1      | DTYPE_A | Descriptor type | DTYPE_A = 1 | Global
2      | CTXT_A  | BLACS context | Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP | Global
3      | M_A     | Number of rows in the global matrix | If m = 0 and side = 'L', or n = 0 and side = 'R': M_A >= 0. Otherwise: M_A >= 1 | Global
4      | N_A     | Number of columns in the global matrix | If m = 0 and side = 'L', or n = 0 and side = 'R': N_A >= 0. Otherwise: N_A >= 1 | Global
5      | MB_A    | Row block size | MB_A >= 1 | Global
6      | NB_A    | Column block size | NB_A >= 1 | Global
7      | RSRC_A  | The process row of the p × q grid over which the first row of the global matrix is distributed | 0 <= RSRC_A < p | Global
8      | CSRC_A  | The process column of the p × q grid over which the first column of the global matrix is distributed | 0 <= CSRC_A < q | Global
9      | LLD_A   | The leading dimension of the local array | LLD_A >= max(1,LOCp(M_A)) | Local

Specified as: an array of (at least) length 9, containing fullword integers.
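The descriptor layout and its limits can be sketched in C as follows. This is an illustrative sketch only: desc_fill and desc_check are hypothetical helpers, not Parallel ESSL routines, the array is indexed from 0 (so table entry 3, M_A, is desc[2]), and the caller is assumed to supply LOCp(M_A) as locp_m. The BLACS context in entry 2 is not validated, because only the BLACS can do that.

```c
#include <assert.h>

/* 0-based positions of the nine descriptor entries (the table numbers them 1 to 9) */
enum { DTYPE_ = 0, CTXT_, M_, N_, MB_, NB_, RSRC_, CSRC_, LLD_ };

/* Fill a nine-element descriptor (hypothetical helper). */
void desc_fill(int desc[9], int ctxt, int m, int n, int mb, int nb,
               int rsrc, int csrc, int lld)
{
    assert(desc != 0);
    desc[DTYPE_] = 1;     /* descriptor type must be 1 */
    desc[CTXT_]  = ctxt;  /* BLACS context */
    desc[M_]     = m;     /* rows in the global matrix */
    desc[N_]     = n;     /* columns in the global matrix */
    desc[MB_]    = mb;    /* row block size */
    desc[NB_]    = nb;    /* column block size */
    desc[RSRC_]  = rsrc;  /* process row holding the first global row */
    desc[CSRC_]  = csrc;  /* process column holding the first global column */
    desc[LLD_]   = lld;   /* leading dimension of the local array */
}

/* Check the limits from the table for a p by q grid.
   empty_ok is nonzero when the table allows M_ and N_ to be 0.
   Returns 0 if all limits hold, else the 1-based number of the first
   entry that violates its limit. Entry 2 (CTXT_) is not checked. */
int desc_check(const int desc[9], int p, int q, int empty_ok, int locp_m)
{
    int min_dim = empty_ok ? 0 : 1;
    int min_lld = locp_m > 1 ? locp_m : 1;   /* max(1, LOCp(M_)) */
    if (desc[DTYPE_] != 1)                   return 1;
    if (desc[M_] < min_dim)                  return 3;
    if (desc[N_] < min_dim)                  return 4;
    if (desc[MB_] < 1)                       return 5;
    if (desc[NB_] < 1)                       return 6;
    if (desc[RSRC_] < 0 || desc[RSRC_] >= p) return 7;
    if (desc[CSRC_] < 0 || desc[CSRC_] >= q) return 8;
    if (desc[LLD_] < min_lld)                return 9;
    return 0;
}
```

The same sketch applies unchanged to desc_b and desc_c, with empty_ok set when m = 0 or n = 0.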

b
is the local part of the global general matrix B. This identifies the first element of the local array B. This subroutine computes the location of the first element of the local subarray used, based on ib, jb, desc_b, p, q, myrow, and mycol; therefore, the leading LOCp(ib+m-1) by LOCq(jb+n-1) part of the local array B must contain the local pieces of the leading ib+m-1 by jb+n-1 part of the global matrix.

Scope: local

Specified as: an LLD_B by (at least) LOCq(N_B) array, containing numbers of the data type indicated in Table 48. Details about the block-cyclic data distribution of global matrix B are stored in desc_b.

ib
is the row index of the global matrix B, identifying the first row of the submatrix B.

Scope: global

Specified as: a fullword integer; 1 <= ib <= M_B and ib+m-1 <= M_B.

jb
is the column index of the global matrix B, identifying the first column of the submatrix B.

Scope: global

Specified as: a fullword integer; 1 <= jb <= N_B and jb+n-1 <= N_B.

desc_b
is the array descriptor for global matrix B, described in the following table:
desc_b | Name    | Description | Limits | Scope
-------|---------|-------------|--------|-------
1      | DTYPE_B | Descriptor type | DTYPE_B = 1 | Global
2      | CTXT_B  | BLACS context | Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP | Global
3      | M_B     | Number of rows in the global matrix | If m = 0 or n = 0: M_B >= 0. Otherwise: M_B >= 1 | Global
4      | N_B     | Number of columns in the global matrix | If m = 0 or n = 0: N_B >= 0. Otherwise: N_B >= 1 | Global
5      | MB_B    | Row block size | MB_B >= 1 | Global
6      | NB_B    | Column block size | NB_B >= 1 | Global
7      | RSRC_B  | The process row of the p × q grid over which the first row of the global matrix is distributed | 0 <= RSRC_B < p | Global
8      | CSRC_B  | The process column of the p × q grid over which the first column of the global matrix is distributed | 0 <= CSRC_B < q | Global
9      | LLD_B   | The leading dimension of the local array | LLD_B >= max(1,LOCp(M_B)) | Local

Specified as: an array of (at least) length 9, containing fullword integers.

beta
is the scalar beta.

Scope: global

Specified as: a number of the data type indicated in Table 48.

c
is the local part of the global general matrix C. This identifies the first element of the local array C. This subroutine computes the location of the first element of the local subarray used, based on ic, jc, desc_c, p, q, myrow, and mycol; therefore, the leading LOCp(ic+m-1) by LOCq(jc+n-1) part of the local array C must contain the local pieces of the leading ic+m-1 by jc+n-1 part of the global matrix.

When beta is zero, C need not be set on input.

Scope: local

Specified as: an LLD_C by (at least) LOCq(N_C) array, containing numbers of the data type indicated in Table 48. Details about the block-cyclic data distribution of global matrix C are stored in desc_c.

ic
is the row index of the global matrix C, identifying the first row of the submatrix C.

Scope: global

Specified as: a fullword integer; 1 <= ic <= M_C and ic+m-1 <= M_C.

jc
is the column index of the global matrix C, identifying the first column of the submatrix C.

Scope: global

Specified as: a fullword integer; 1 <= jc <= N_C and jc+n-1 <= N_C.

desc_c
is the array descriptor for global matrix C, described in the following table:
desc_c | Name    | Description | Limits | Scope
-------|---------|-------------|--------|-------
1      | DTYPE_C | Descriptor type | DTYPE_C = 1 | Global
2      | CTXT_C  | BLACS context | Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP | Global
3      | M_C     | Number of rows in the global matrix | If m = 0 or n = 0: M_C >= 0. Otherwise: M_C >= 1 | Global
4      | N_C     | Number of columns in the global matrix | If m = 0 or n = 0: N_C >= 0. Otherwise: N_C >= 1 | Global
5      | MB_C    | Row block size | MB_C >= 1 | Global
6      | NB_C    | Column block size | NB_C >= 1 | Global
7      | RSRC_C  | The process row of the p × q grid over which the first row of the global matrix is distributed | 0 <= RSRC_C < p | Global
8      | CSRC_C  | The process column of the p × q grid over which the first column of the global matrix is distributed | 0 <= CSRC_C < q | Global
9      | LLD_C   | The leading dimension of the local array | LLD_C >= max(1,LOCp(M_C)) | Local

Specified as: an array of (at least) length 9, containing fullword integers.

On Return

c
is the updated local part of the global matrix C, containing the results of the computation.

Scope: local

Returned as: an LLD_C by (at least) LOCq(N_C) array, containing numbers of the data type indicated in Table 48.

Notes and Coding Rules
  1. These subroutines accept lowercase letters for the side and uplo arguments.
  2. The matrices must have no common elements; otherwise, results are unpredictable.
  3. The imaginary parts of the diagonal elements of a complex Hermitian matrix A are assumed to be zero, so you do not have to set these values.
  4. The NUMROC utility subroutine can be used to determine the values of LOCp(M_) and LOCq(N_) used in the argument descriptions above. For details, see Determining the Number of Rows and Columns in Your Local Arrays and NUMROC--Compute the Number of Rows or Columns of a Block-Cyclically Distributed Matrix Contained in a Process.
  5. For suggested block sizes, see Coding Tips for Optimizing Parallel Performance.
  6. The following values must be equal: CTXT_A = CTXT_B = CTXT_C.
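  7. If side = 'L': in the process grid, the process row containing the first row of the submatrix A must also contain the first row of the submatrices B and C; that is, iarow = ibrow = icrow (see Stage 6 under Error Conditions).
  8. If side = 'R': in the process grid, the process column containing the first column of the submatrix A must also contain the first column of the submatrices B and C; that is, iacol = ibcol = iccol (see Stage 6 under Error Conditions).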
  9. If all the following are true:

    then you must follow these rules:

  10. If the following is true:

    or if all the following are true:

    then you must follow these rules:

Error Conditions

Computational Errors

None

Resource Errors

Unable to allocate work space

Input-Argument and Miscellaneous Errors

Stage 1 

  1. DTYPE_A is invalid.
  2. DTYPE_B is invalid.
  3. DTYPE_C is invalid.

Stage 2 

  1. CTXT_A is invalid.

Stage 3 

  1. This subroutine was called from outside the process grid.

Stage 4 

  1. side <> 'L' or 'R'
  2. uplo <> 'U' or 'L'
  3. m < 0
  4. n < 0
  5. M_A < 0 and m = 0 and side = 'L'; M_A < 0 and n = 0 and side = 'R'; M_A < 1 otherwise
  6. N_A < 0 and m = 0 and side = 'L'; N_A < 0 and n = 0 and side = 'R'; N_A < 1 otherwise
  7. MB_A < 1
  8. NB_A < 1
  9. RSRC_A < 0 or RSRC_A >= p
  10. CSRC_A < 0 or CSRC_A >= q
  11. ia < 1
  12. ja < 1
  13. M_B < 0 and (m = 0 or n = 0); M_B < 1 otherwise
  14. N_B < 0 and (m = 0 or n = 0); N_B < 1 otherwise
  15. MB_B < 1
  16. NB_B < 1
  17. RSRC_B < 0 or RSRC_B >= p
  18. CSRC_B < 0 or CSRC_B >= q
  19. ib < 1
  20. jb < 1
  21. M_C < 0 and (m = 0 or n = 0); M_C < 1 otherwise
  22. N_C < 0 and (m = 0 or n = 0); N_C < 1 otherwise
  23. MB_C < 1
  24. NB_C < 1
  25. RSRC_C < 0 or RSRC_C >= p
  26. CSRC_C < 0 or CSRC_C >= q
  27. ic < 1
  28. jc < 1
  29. CTXT_A <> CTXT_B
  30. CTXT_A <> CTXT_C

Stage 5 

If (m <> 0 or side <> 'L') and (n <> 0 or side <> 'R'):

  1. ia > M_A
  2. ja > N_A
  3. ia+numa-1 > M_A
  4. ja+numa-1 > N_A

    where numa = m if side = 'L' and numa = n if side = 'R'.

If m <> 0 and n <> 0:

  1. ib > M_B
  2. jb > N_B
  3. ib+m-1 > M_B
  4. jb+n-1 > N_B
  5. ic > M_C
  6. jc > N_C
  7. ic+m-1 > M_C
  8. jc+n-1 > N_C
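The Stage 5 index checks can be sketched in C as shown below. This is a hedged illustration of the conditions listed above, not the library's checking code: symm_bounds_check and sub_fits are hypothetical names, descriptors are 0-based arrays (so M_ is desc[2] and N_ is desc[3]), and the Stage 4 checks (for example ia < 1) are assumed to have been made already.

```c
#include <assert.h>

/* 1 if the rows-by-cols submatrix starting at global (i, j) fits in an
   M by N global matrix, using the upper-bound tests of Stage 5. */
static int sub_fits(int i, int j, int rows, int cols, int M, int N)
{
    return i <= M && j <= N && i + rows - 1 <= M && j + cols - 1 <= N;
}

/* Returns 0 if every applicable Stage 5 test passes, otherwise 1, 2, or 3
   to indicate that submatrix A, B, or C fails (hypothetical helper). */
int symm_bounds_check(char side, int m, int n,
                      int ia, int ja, const int *desc_a,
                      int ib, int jb, const int *desc_b,
                      int ic, int jc, const int *desc_c)
{
    /* numa = m if side = 'L' and numa = n if side = 'R' */
    int numa = (side == 'L' || side == 'l') ? m : n;
    assert(m >= 0 && n >= 0);

    /* (m <> 0 or side <> 'L') and (n <> 0 or side <> 'R') means numa <> 0 */
    if (numa != 0 &&
        !sub_fits(ia, ja, numa, numa, desc_a[2], desc_a[3]))
        return 1;

    /* B and C are tested only if m <> 0 and n <> 0 */
    if (m != 0 && n != 0) {
        if (!sub_fits(ib, jb, m, n, desc_b[2], desc_b[3])) return 2;
        if (!sub_fits(ic, jc, m, n, desc_c[2], desc_c[3])) return 3;
    }
    return 0;
}
```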

Stage 6 

If A is contained within a single block, that is:

numa+mod(ia-1, MB_A) <= MB_A
numa+mod(ja-1, NB_A) <= NB_A
where:
If side = 'L', numa = m
If side = 'R', numa = n

and:

then:

If A is not contained within a single block, or if A is contained within a single block and:

then:

  1. MB_A <> NB_A
  2. mod(ia-1, MB_A) <> 0
  3. mod(ja-1, NB_A) <> 0

    If side = 'L':

  4. MB_B <> NB_A
  5. MB_C <> NB_A
  6. mod(ib-1, MB_B) <> 0
  7. mod(ic-1, MB_C) <> 0

    If side = 'R':

  8. NB_B <> MB_A
  9. NB_C <> MB_A
  10. mod(jb-1, NB_B) <> 0
  11. mod(jc-1, NB_C) <> 0

In all cases:

  1. LLD_A < max(1, LOCp(M_A))
  2. LLD_B < max(1, LOCp(M_B))
  3. LLD_C < max(1, LOCp(M_C))

    If side = 'L' and looping is required--that is, either of the following is true:

    n+mod(jb-1, NB_B) > NB_B
    n+mod(jc-1, NB_C) > NB_C

    then:

  4. NB_B <> NB_C
  5. mod(jb-1, NB_B) <> mod(jc-1, NB_C).

    If side = 'L':

  6. In the process grid, the process row containing the first row of the submatrix A does not contain the first row of the submatrix B; that is, iarow <> ibrow, where:
    iarow = mod((((ia-1)/MB_A)+RSRC_A), p)
    ibrow = mod((((ib-1)/MB_B)+RSRC_B), p)
  7. In the process grid, the process row containing the first row of the submatrix A does not contain the first row of the submatrix C; that is, iarow <> icrow, where:
    iarow = mod((((ia-1)/MB_A)+RSRC_A), p)
    icrow = mod((((ic-1)/MB_C)+RSRC_C), p)

    If side = 'R' and looping is required--that is, either of the following is true:

    m+mod(ib-1, MB_B) > MB_B
    m+mod(ic-1, MB_C) > MB_C

    then:

  8. MB_B <> MB_C
  9. mod(ib-1, MB_B) <> mod(ic-1, MB_C).

    If side = 'R':

  10. In the process grid, the process column containing the first column of the submatrix A does not contain the first column of the submatrix B; that is, iacol <> ibcol, where:
    iacol = mod((((ja-1)/NB_A)+CSRC_A), q)
    ibcol = mod((((jb-1)/NB_B)+CSRC_B), q)
  11. In the process grid, the process column containing the first column of the submatrix A does not contain the first column of the submatrix C; that is, iacol <> iccol, where:
    iacol = mod((((ja-1)/NB_A)+CSRC_A), q)
    iccol = mod((((jc-1)/NB_C)+CSRC_C), q)
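The owner-process formulas and the single-block test used in Stage 6 can be written out in C. The sketch below is illustrative only and assumes 0-based descriptor arrays (MB_ is desc[4], NB_ is desc[5], RSRC_ is desc[6], CSRC_ is desc[7]); the function names owner, within_one_block, and symm_aligned are hypothetical. Integer division truncates, as in the Fortran expressions above.

```c
/* Process row (or column) that owns global index g (1-based):
   mod(((g-1)/blk) + src, nprocs) */
int owner(int g, int blk, int src, int nprocs)
{
    return ((g - 1) / blk + src) % nprocs;
}

/* 1 if len consecutive global indices starting at g lie in one block,
   that is, len + mod(g-1, blk) <= blk. When this is 0 for the B or C
   dimension named in Stage 6, looping is required. */
int within_one_block(int len, int g, int blk)
{
    return len + (g - 1) % blk <= blk;
}

/* 1 if the process-row test (side 'L') or process-column test (side 'R')
   of Stage 6 is satisfied for a p by q grid, else 0. */
int symm_aligned(char side,
                 int ia, int ja, const int *da,
                 int ib, int jb, const int *db,
                 int ic, int jc, const int *dc,
                 int p, int q)
{
    if (side == 'L' || side == 'l') {
        int iarow = owner(ia, da[4], da[6], p);
        int ibrow = owner(ib, db[4], db[6], p);
        int icrow = owner(ic, dc[4], dc[6], p);
        return iarow == ibrow && iarow == icrow;
    } else {
        int iacol = owner(ja, da[5], da[7], q);
        int ibcol = owner(jb, db[5], db[7], q);
        int iccol = owner(jc, dc[5], dc[7], q);
        return iacol == ibcol && iacol == iccol;
    }
}
```

With the descriptors of Example 1 (ia = ja = ib = jb = ic = jc = 1, all source processes 0), every owner is process 0, so the alignment test passes for either side.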

Example 1

This example computes C = beta*C + alpha*B*A (side = 'R') using a 2 × 2 process grid. Because beta is zero, C need not be set on input.

Call Statements and Input


 ORDER = 'R'
 NPROW = 2
 NPCOL = 2
 CALL BLACS_GET(0, 0, ICONTXT)
 CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
 CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
              SIDE  UPLO   M   N    ALPHA    A  IA  JA   DESC_A   B  IB   JB
               |     |     |   |      |      |   |   |     |      |   |   |
 CALL PDSYMM( 'R' , 'U' , 16 , 8 ,  1.0D0  , A , 1 , 1 , DESC_A , B , 1 , 1 ,
 
              DESC_B   BETA    C  IC  JC   DESC_C
                |        |     |   |   |     |
              DESC_B , 0.0D0 , C , 1 , 1 , DESC_C )


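       | Desc_A               | Desc_B               | Desc_C
-------|----------------------|----------------------|----------------------
DTYPE_ | 1                    | 1                    | 1
CTXT_  | icontxt (see Note 1) | icontxt (see Note 1) | icontxt (see Note 1)
M_     | 8                    | 16                   | 16
N_     | 8                    | 8                    | 8
MB_    | 2                    | 4                    | 4
NB_    | 2                    | 2                    | 2
RSRC_  | 0                    | 0                    | 0
CSRC_  | 0                    | 0                    | 0
LLD_   | See Note 2           | See Note 2           | See Note 2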

Notes:

  1. icontxt is the output of the BLACS_GRIDINIT call.

  2. Each process should set the LLD_ as follows:
    LLD_A = MAX(1,NUMROC(M_A, MB_A, MYROW, RSRC_A, NPROW))
    LLD_B = MAX(1,NUMROC(M_B, MB_B, MYROW, RSRC_B, NPROW))
    LLD_C = MAX(1,NUMROC(M_C, MB_C, MYROW, RSRC_C, NPROW))
    

    In this example, LLD_A = 4 on all processes, and LLD_B = LLD_C = 8 on all processes.
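The row counts returned by NUMROC follow from the block-cyclic distribution and can be reproduced with a few lines of C. The sketch below is written under the assumption that it mirrors the usual NUMROC algorithm; numroc_sketch and lld_min are hypothetical names, and in a real program you would call the NUMROC utility itself.

```c
/* Number of rows (or columns) of an n-long dimension, distributed in
   blocks of nb over nprocs processes starting at process isrcproc, that
   fall on process iproc. */
int numroc_sketch(int n, int nb, int iproc, int isrcproc, int nprocs)
{
    int mydist  = (nprocs + iproc - isrcproc) % nprocs; /* distance from source process */
    int nblocks = n / nb;                               /* number of whole blocks */
    int count   = (nblocks / nprocs) * nb;              /* share every process receives */
    int extra   = nblocks % nprocs;                     /* leftover whole blocks */

    if (mydist < extra)
        count += nb;          /* this process gets one more whole block */
    else if (mydist == extra)
        count += n % nb;      /* this process gets the trailing partial block */
    return count;
}

/* Smallest valid leading dimension: MAX(1, NUMROC(...)) */
int lld_min(int m, int mb, int myrow, int rsrc, int nprow)
{
    int rows = numroc_sketch(m, mb, myrow, rsrc, nprow);
    return rows > 1 ? rows : 1;
}
```

For the descriptors above, lld_min(8, 2, myrow, 0, 2) gives 4 and lld_min(16, 4, myrow, 0, 2) gives 8 for myrow = 0 and myrow = 1, which agrees with the values stated in Note 2.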

Global real symmetric matrix A of order 8 with block size 2 × 2:

B,D        0             1             2             3
     *                                                     *
 0   |  0.0 -1.0  |  -1.0  0.0  |   0.0  0.0  |   0.0  0.0 |
     |   .   1.0  |   0.0  1.0  |   0.0  1.0  |   0.0  1.0 |
     | -----------|-------------|-------------|----------- |
 1   |   .   .    |  -1.0 -1.0  |   0.0  0.0  |   1.0  0.0 |
     |   .   .    |    .  -1.0  |   1.0  1.0  |   0.0  1.0 |
     | -----------|-------------|-------------|----------- |
 2   |   .    .   |    .    .   |  -1.0  0.0  |   0.0  0.0 |
     |   .    .   |    .    .   |    .   1.0  |   0.0  0.0 |
     | -----------|-------------|-------------|----------- |
 3   |   .    .   |    .    .   |    .    .   |   0.0  0.0 |
     |   .    .   |    .    .   |    .    .   |    .   0.0 |
     *                                                     *

The following is the 2 × 2 process grid:

B,D  |   0 2   | 1 3 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11
3    |         |

Local arrays for A:

p,q  |          0           |           1
-----|----------------------|----------------------
     |  0.0 -1.0  0.0  0.0  |  -1.0  0.0  0.0  0.0
     |   .   1.0  0.0  1.0  |   0.0  1.0  0.0  1.0
 0   |   .    .  -1.0  0.0  |    .    .   0.0  0.0
     |   .    .    .   1.0  |    .    .   0.0  0.0
-----|----------------------|----------------------
     |   .    .   0.0  0.0  |  -1.0 -1.0  1.0  0.0
     |   .    .   1.0  1.0  |    .  -1.0  0.0  1.0
 1   |   .    .    .    .   |    .    .   0.0  0.0
     |   .    .    .    .   |    .    .    .   0.0
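The mapping from the global matrix to the local arrays shown above can be computed directly. The following C sketch is an illustration under the block-cyclic rules used in this example; locate is a hypothetical helper, and all indices are 1-based as in the Fortran call statements.

```c
/* For global index g (1-based) along one dimension, distributed in blocks
   of size blk over nprocs processes starting at process src, compute the
   owning process coordinate and the 1-based index within its local array. */
void locate(int g, int blk, int src, int nprocs, int *proc, int *loc)
{
    int g0 = g - 1;                                     /* 0-based global index */
    *proc = (g0 / blk + src) % nprocs;                  /* owning process */
    *loc  = (g0 / (blk * nprocs)) * blk + g0 % blk + 1; /* position in the local array */
}
```

For matrix A (MB_A = NB_A = 2 on the 2 × 2 grid), global row 5 maps to process row 0, local row 3, and global element (8,8) maps to local element (4,4) on P11. Apply the function once with the row parameters (MB_, RSRC_, p) and once with the column parameters (NB_, CSRC_, q).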

Global general 16 × 8 matrix B with block size 4 × 2:

B,D        0             1             2             3
     *                                                     *
     | -1.0  0.0  |   1.0 -1.0  |   1.0  1.0  |  -1.0 -1.0 |
     | -1.0 -1.0  |   1.0  0.0  |   1.0 -1.0  |  -1.0  1.0 |
 0   |  1.0  1.0  |  -1.0  0.0  |  -1.0  0.0  |   1.0  0.0 |
     |  0.0 -1.0  |   0.0  0.0  |   0.0  0.0  |   0.0 -1.0 |
     | -----------|-------------|-------------|----------- |
     |  0.0  1.0  |   0.0  1.0  |   0.0  1.0  |   1.0  0.0 |
     |  0.0  0.0  |   1.0  0.0  |  -1.0 -1.0  |   0.0  0.0 |
 1   |  1.0  1.0  |   0.0  0.0  |   1.0  1.0  |   0.0 -1.0 |
     |  0.0  0.0  |  -1.0  0.0  |   0.0  1.0  |   0.0  1.0 |
     | -----------|-------------|-------------|----------- |
     |  0.0  0.0  |   0.0 -1.0  |   1.0  1.0  |   0.0  1.0 |
     | -1.0 -1.0  |   1.0  0.0  |   0.0 -1.0  |   0.0  1.0 |
 2   |  0.0  0.0  |   0.0  1.0  |   1.0  0.0  |   0.0  0.0 |
     |  0.0  0.0  |   1.0  1.0  |   0.0 -1.0  |   0.0  0.0 |
     | -----------|-------------|-------------|----------- |
     |  1.0  1.0  |  -1.0  0.0  |  -1.0 -1.0  |   1.0  1.0 |
     |  0.0  0.0  |   0.0  0.0  |   1.0  0.0  |   0.0 -1.0 |
 3   |  0.0  1.0  |   0.0  0.0  |   0.0  0.0  |   0.0  0.0 |
     | -1.0  0.0  |  -1.0  0.0  |   0.0  1.0  |   1.0  0.0 |
     *                                                     *

The following is the 2 × 2 process grid:

B,D  |   0 2   | 1 3 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11
3    |         |

Local arrays for B:

p,q  |          0           |           1
-----|----------------------|----------------------
     | -1.0  0.0  1.0  1.0  |   1.0 -1.0 -1.0 -1.0
     | -1.0 -1.0  1.0 -1.0  |   1.0  0.0 -1.0  1.0
     |  1.0  1.0 -1.0  0.0  |  -1.0  0.0  1.0  0.0
     |  0.0 -1.0  0.0  0.0  |   0.0  0.0  0.0 -1.0
 0   |  0.0  0.0  1.0  1.0  |   0.0 -1.0  0.0  1.0
     | -1.0 -1.0  0.0 -1.0  |   1.0  0.0  0.0  1.0
     |  0.0  0.0  1.0  0.0  |   0.0  1.0  0.0  0.0
     |  0.0  0.0  0.0 -1.0  |   1.0  1.0  0.0  0.0
-----|----------------------|----------------------
     |  0.0  1.0  0.0  1.0  |   0.0  1.0  1.0  0.0
     |  0.0  0.0 -1.0 -1.0  |   1.0  0.0  0.0  0.0
     |  1.0  1.0  1.0  1.0  |   0.0  0.0  0.0 -1.0
     |  0.0  0.0  0.0  1.0  |  -1.0  0.0  0.0  1.0
 1   |  1.0  1.0 -1.0 -1.0  |  -1.0  0.0  1.0  1.0
     |  0.0  0.0  1.0  0.0  |   0.0  0.0  0.0 -1.0
     |  0.0  1.0  0.0  0.0  |   0.0  0.0  0.0  0.0
     | -1.0  0.0  0.0  1.0  |  -1.0  0.0  1.0  0.0

Output:

Global general 16 × 8 matrix C with block size 4 × 2:

B,D        0             1             2             3
     *                                                     *
     | -1.0  0.0  |   0.0  1.0  |  -2.0  0.0  |   1.0 -1.0 |
     |  0.0  0.0  |  -1.0 -1.0  |  -1.0 -2.0  |   1.0 -1.0 |
 0   |  0.0  0.0  |   1.0  1.0  |   1.0  1.0  |  -1.0  1.0 |
     |  1.0 -2.0  |   0.0 -2.0  |   0.0 -1.0  |   0.0 -1.0 |
     | -----------|-------------|-------------|----------- |
     | -1.0  3.0  |   0.0  1.0  |   1.0  3.0  |   0.0  2.0 |
     | -1.0 -1.0  |  -1.0 -3.0  |   1.0 -1.0  |   1.0  0.0 |
 1   | -1.0  0.0  |  -1.0  2.0  |  -1.0  2.0  |   0.0  1.0 |
     |  1.0  2.0  |   1.0  3.0  |   0.0  1.0  |  -1.0  0.0 |
     | -----------|-------------|-------------|----------- |
     |  0.0  1.0  |   1.0  4.0  |  -2.0  0.0  |   0.0 -1.0 |
     |  0.0  0.0  |   0.0 -2.0  |   0.0 -2.0  |   1.0 -1.0 |
 2   |  0.0  1.0  |  -1.0  0.0  |   0.0  1.0  |   0.0  1.0 |
     | -1.0  0.0  |  -2.0 -3.0  |   1.0  0.0  |   1.0  1.0 |
     | -----------|-------------|-------------|----------- |
     |  0.0  0.0  |   1.0  1.0  |   1.0  0.0  |  -1.0  1.0 |
     |  0.0 -1.0  |   0.0  0.0  |  -1.0  0.0  |   0.0  0.0 |
 3   | -1.0  1.0  |   0.0  1.0  |   0.0  1.0  |   0.0  1.0 |
     |  1.0  2.0  |   3.0  2.0  |   0.0  1.0  |  -1.0  0.0 |
     *                                                     *

The following is the 2 × 2 process grid:

B,D  |   0 2   | 1 3 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11
3    |         |

Local arrays for C:

p,q  |          0           |           1
-----|----------------------|----------------------
     | -1.0  0.0 -2.0  0.0  |   0.0  1.0  1.0 -1.0
     |  0.0  0.0 -1.0 -2.0  |  -1.0 -1.0  1.0 -1.0
     |  0.0  0.0  1.0  1.0  |   1.0  1.0 -1.0  1.0
     |  1.0 -2.0  0.0 -1.0  |   0.0 -2.0  0.0 -1.0
 0   |  0.0  1.0 -2.0  0.0  |   1.0  4.0  0.0 -1.0
     |  0.0  0.0  0.0 -2.0  |   0.0 -2.0  1.0 -1.0
     |  0.0  1.0  0.0  1.0  |  -1.0  0.0  0.0  1.0
     | -1.0  0.0  1.0  0.0  |  -2.0 -3.0  1.0  1.0
-----|----------------------|----------------------
     | -1.0  3.0  1.0  3.0  |   0.0  1.0  0.0  2.0
     | -1.0 -1.0  1.0 -1.0  |  -1.0 -3.0  1.0  0.0
     | -1.0  0.0 -1.0  2.0  |  -1.0  2.0  0.0  1.0
     |  1.0  2.0  0.0  1.0  |   1.0  3.0 -1.0  0.0
 1   |  0.0  0.0  1.0  0.0  |   1.0  1.0 -1.0  1.0
     |  0.0 -1.0 -1.0  0.0  |   0.0  0.0  0.0  0.0
     | -1.0  1.0  0.0  1.0  |   0.0  1.0  0.0  1.0
     |  1.0  2.0  0.0  1.0  |   3.0  2.0 -1.0  0.0

Example 2

This example computes C = beta*C + alpha*B*A (side = 'R') for complex symmetric A using a 2 × 2 process grid. Because beta is zero, C need not be set on input.

Call Statements and Input


 ORDER = 'R'
 NPROW = 2
 NPCOL = 2
 CALL BLACS_GET(0, 0, ICONTXT)
 CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
 CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
              SIDE  UPLO   M   N     ALPHA     A  IA  JA   DESC_A   B  IB   JB
               |     |     |   |       |       |   |   |    |       |   |   |
 CALL PZSYMM( 'R' , 'U' , 16 , 8 ,   ALPHA   , A , 1 , 1 , DESC_A , B , 1 , 1 ,
 
              DESC_B    BETA     C  IC  JC   DESC_C
                |         |      |   |   |    |
              DESC_B ,  BETA   , C , 1 , 1 , DESC_C )
 
              ALPHA  =  (1.0, 2.0)
 
              BETA   =  (0.0, 0.0) 


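       | Desc_A               | Desc_B               | Desc_C
-------|----------------------|----------------------|----------------------
DTYPE_ | 1                    | 1                    | 1
CTXT_  | icontxt (see Note 1) | icontxt (see Note 1) | icontxt (see Note 1)
M_     | 8                    | 16                   | 16
N_     | 8                    | 8                    | 8
MB_    | 2                    | 4                    | 4
NB_    | 2                    | 2                    | 2
RSRC_  | 0                    | 0                    | 0
CSRC_  | 0                    | 0                    | 0
LLD_   | See Note 2           | See Note 2           | See Note 2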

Notes:

  1. icontxt is the output of the BLACS_GRIDINIT call.

  2. Each process should set the LLD_ as follows:
    LLD_A = MAX(1,NUMROC(M_A, MB_A, MYROW, RSRC_A, NPROW))
    LLD_B = MAX(1,NUMROC(M_B, MB_B, MYROW, RSRC_B, NPROW))
    LLD_C = MAX(1,NUMROC(M_C, MB_C, MYROW, RSRC_C, NPROW))
    

    In this example, LLD_A = 4 on all processes, and LLD_B = LLD_C = 8 on all processes.
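The difference between the complex symmetric case (PZSYMM) and the complex Hermitian case (PZHEMM) lies in how an element outside the stored triangle is obtained. The C sketch below illustrates this for uplo = 'U' and side = 'R'. It is a serial illustration only: the type zc and the functions zc_mul, zupper_at, and zright_elem are hypothetical, and arrays are column-major with 0-based indices.

```c
typedef struct { double re, im; } zc;   /* hypothetical long-precision complex type */

static zc zc_mul(zc x, zc y)
{
    zc r = { x.re * y.re - x.im * y.im, x.re * y.im + x.im * y.re };
    return r;
}

/* Element (i, j) of A when only the upper triangle is stored.
   Symmetric:  A(i,j) = A(j,i).
   Hermitian:  A(i,j) = conjugate of A(j,i), and the imaginary part of a
               diagonal element is taken to be zero. */
zc zupper_at(int hermitian, const zc *a, int lda, int i, int j)
{
    zc v;
    if (i <= j) {
        v = a[j * lda + i];                   /* stored element */
        if (hermitian && i == j) v.im = 0.0;  /* diagonal treated as real */
    } else {
        v = a[i * lda + j];                   /* mirror from the upper triangle */
        if (hermitian) v.im = -v.im;          /* conjugate */
    }
    return v;
}

/* One element of alpha*B*A with beta = 0: alpha * sum over k of brow[k]*A(k,j),
   where brow holds one row of B (n elements). */
zc zright_elem(int hermitian, int n, zc alpha,
               const zc *a, int lda, const zc *brow, int j)
{
    zc sum = { 0.0, 0.0 };
    for (int k = 0; k < n; k++) {
        zc t = zc_mul(brow[k], zupper_at(hermitian, a, lda, k, j));
        sum.re += t.re;
        sum.im += t.im;
    }
    return zc_mul(alpha, sum);
}
```

With hermitian = 0, the first row of B and the first column of A from this example, and alpha = (1.0, 2.0), the result is (11.0, 27.0), which is the first element of the output matrix C.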

Global complex symmetric matrix A of order 8 with block size 2 × 2:



B,D          0                         1                        2                         3
     *                                                                                                      *
 0   | ( 0.0, 1.0) (-1.0, 0.0) | (-1.0, 0.0) ( 0.0, 1.0) |( 0.0, 1.0) ( 0.0, 1.0) | ( 0.0, 1.0) ( 0.0, 1.0) |
     |      .      ( 1.0, 2.0) | ( 0.0, 1.0) ( 1.0, 2.0) |( 0.0, 1.0) ( 1.0, 2.0) | ( 0.0, 1.0) ( 1.0, 2.0) |
     | ------------------------|-------------------------|------------------------|------------------------ |
 1   |      .           .      | (-1.0, 0.0) (-1.0, 0.0) |( 0.0, 1.0) ( 0.0, 1.0) | ( 1.0, 2.0) ( 0.0, 1.0) |
     |      .           .      |      .      (-1.0, 0.0) |( 1.0, 2.0) ( 1.0, 2.0) | ( 0.0, 1.0) ( 1.0, 2.0) |
     | ------------------------|-------------------------|------------------------|------------------------ |
 2   |      .           .      |      .           .      |(-1.0, 0.0) ( 0.0, 1.0) | ( 0.0, 1.0) ( 0.0, 1.0) |
     |      .           .      |      .           .      |     .      ( 1.0, 2.0) | ( 0.0, 1.0) ( 0.0, 1.0) |
     | ------------------------|-------------------------|------------------------|------------------------ |
 3   |      .           .      |      .           .      |     .           .      | ( 0.0, 1.0) ( 0.0, 1.0) |
     |      .           .      |      .           .      |     .           .      |      .      ( 0.0, 1.0) |
     *                                                                                                      *

The following is the 2 × 2 process grid:

B,D  |   0 2   | 1 3 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11
3    |         |

Local arrays for A:



p,q  |                        0                        |                        1
-----|-------------------------------------------------|-------------------------------------------------
     | ( 0.0, 1.0) (-1.0, 0.0) ( 0.0, 1.0) ( 0.0, 1.0) | (-1.0, 0.0) ( 0.0, 1.0) ( 0.0, 1.0) ( 0.0, 1.0)
     |      .      ( 1.0, 2.0) ( 0.0, 1.0) ( 1.0, 2.0) | ( 0.0, 1.0) ( 1.0, 2.0) ( 0.0, 1.0) ( 1.0, 2.0)
 0   |      .           .      (-1.0, 0.0) ( 0.0, 1.0) |      .           .      ( 0.0, 1.0) ( 0.0, 1.0)
     |      .           .           .      ( 1.0, 2.0) |      .           .      ( 0.0, 1.0) ( 0.0, 1.0)
-----|-------------------------------------------------|-------------------------------------------------
     |      .           .      ( 0.0, 1.0) ( 0.0, 1.0) | (-1.0, 0.0) (-1.0, 0.0) ( 1.0, 2.0) ( 0.0, 1.0)
     |      .           .      ( 1.0, 2.0) ( 1.0, 2.0) |      .      (-1.0, 0.0) ( 0.0, 1.0) ( 1.0, 2.0)
 1   |      .           .           .           .      |      .           .      ( 0.0, 1.0) ( 0.0, 1.0)
     |      .           .           .           .      |      .           .           .      ( 0.0, 1.0)

Global general 16 × 8 matrix B with block size 4 × 2:



B,D               0                         1                         2                         3
     *                                                                                                       *
     | (-1.0,-3.0) ( 0.0,-2.0) | ( 1.0,-1.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 1.0,-1.0) | (-1.0,-3.0) (-1.0,-3.0) |
     | (-1.0,-3.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 0.0,-2.0) | ( 1.0,-1.0) (-1.0,-3.0) | (-1.0,-3.0) ( 1.0,-1.0) |
 0   | ( 1.0,-1.0) ( 1.0,-1.0) | (-1.0,-3.0) ( 0.0,-2.0) | (-1.0,-3.0) ( 0.0,-2.0) | ( 1.0,-1.0) ( 0.0,-2.0) |
     | ( 0.0,-2.0) (-1.0,-3.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) (-1.0,-3.0) |
     | ------------------------|-------------------------|-------------------------|------------------------ |
     | ( 0.0,-2.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 1.0,-1.0) | ( 1.0,-1.0) ( 0.0,-2.0) |
     | ( 0.0,-2.0) ( 0.0,-2.0) | ( 1.0,-1.0) ( 0.0,-2.0) | (-1.0,-3.0) (-1.0,-3.0) | ( 0.0,-2.0) ( 0.0,-2.0) |
 1   | ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) (-1.0,-3.0) |
     | ( 0.0,-2.0) ( 0.0,-2.0) | (-1.0,-3.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 1.0,-1.0) |
     | ------------------------|-------------------------|-------------------------|------------------------ |
     | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 1.0,-1.0) |
     | (-1.0,-3.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 0.0,-2.0) | ( 0.0,-2.0) (-1.0,-3.0) | ( 0.0,-2.0) ( 1.0,-1.0) |
 2   | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 1.0,-1.0) | ( 1.0,-1.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) |
     | ( 0.0,-2.0) ( 0.0,-2.0) | ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) (-1.0,-3.0) | ( 0.0,-2.0) ( 0.0,-2.0) |
     | ------------------------|-------------------------|-------------------------|------------------------ |
     | ( 1.0,-1.0) ( 1.0,-1.0) | (-1.0,-3.0) ( 0.0,-2.0) | (-1.0,-3.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 1.0,-1.0) |
     | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 1.0,-1.0) ( 0.0,-2.0) | ( 0.0,-2.0) (-1.0,-3.0) |
 3   | ( 0.0,-2.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) |
     | (-1.0,-3.0) ( 0.0,-2.0) | (-1.0,-3.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 1.0,-1.0) | ( 1.0,-1.0) ( 0.0,-2.0) |
     *                                                                                                       *

The following is the 2 × 2 process grid:

B,D  |   0 2   | 1 3 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11
3    |         |

Local arrays for B:



p,q  |                        0                        |                        1
-----|-------------------------------------------------|-------------------------------------------------
     | (-1.0,-3.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 1.0,-1.0) | ( 1.0,-1.0) (-1.0,-3.0) (-1.0,-3.0) (-1.0,-3.0)
     | (-1.0,-3.0) (-1.0,-3.0) ( 1.0,-1.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 0.0,-2.0) (-1.0,-3.0) ( 1.0,-1.0)
     | ( 1.0,-1.0) ( 1.0,-1.0) (-1.0,-3.0) ( 0.0,-2.0) | (-1.0,-3.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0)
     | ( 0.0,-2.0) (-1.0,-3.0) ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) (-1.0,-3.0)
 0   | ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) (-1.0,-3.0) ( 0.0,-2.0) ( 1.0,-1.0)
     | (-1.0,-3.0) (-1.0,-3.0) ( 0.0,-2.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0)
     | ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0) ( 0.0,-2.0)
     | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 1.0,-1.0) ( 0.0,-2.0) ( 0.0,-2.0)
-----|-------------------------------------------------|-------------------------------------------------
     | ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 1.0,-1.0) ( 1.0,-1.0) ( 0.0,-2.0)
     | ( 0.0,-2.0) ( 0.0,-2.0) (-1.0,-3.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0)
     | ( 1.0,-1.0) ( 1.0,-1.0) ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) (-1.0,-3.0)
     | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0) | (-1.0,-3.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0)
 1   | ( 1.0,-1.0) ( 1.0,-1.0) (-1.0,-3.0) (-1.0,-3.0) | (-1.0,-3.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 1.0,-1.0)
     | ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) (-1.0,-3.0)
     | ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0)
     | (-1.0,-3.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0) | (-1.0,-3.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0)

Output:

Global general 16 × 8 matrix C with block size 4 × 2:



B,D               0                         1                         2                         3
     *                                                                                                       *
     | (11.0,27.0) (37.0,39.0) | ( 7.0,29.0) (27.0,39.0) | (27.0,29.0) (37.0,39.0) | (21.0,37.0) (35.0,35.0) |
     | ( 7.0,29.0) (37.0,39.0) | (11.0,27.0) (35.0,35.0) | (23.0,31.0) (45.0,35.0) | (21.0,37.0) (35.0,35.0) |
 0   | ( 1.0,27.0) (31.0,37.0) | (-3.0,29.0) (21.0,37.0) | ( 9.0,33.0) (27.0,39.0) | (23.0,31.0) (21.0,37.0) |
     | ( 6.0,32.0) (48.0,36.0) | (10.0,30.0) (42.0,34.0) | (22.0,34.0) (44.0,38.0) | (28.0,36.0) (38.0,36.0) |
     | ------------------------|-------------------------|-------------------------|------------------------ |
     | (-4.0,22.0) (10.0,40.0) | (-8.0,24.0) (12.0,34.0) | ( 0.0,30.0) (10.0,40.0) | (10.0,30.0) ( 8.0,36.0) |
     | (11.0,27.0) (41.0,37.0) | (11.0,27.0) (43.0,31.0) | (15.0,35.0) (41.0,37.0) | (21.0,37.0) (31.0,37.0) |
 1   | (-1.0,23.0) (25.0,35.0) | (-1.0,23.0) (11.0,37.0) | (11.0,27.0) (17.0,39.0) | (13.0,31.0) (15.0,35.0) |
     | (-3.0,29.0) (23.0,41.0) | (-3.0,29.0) (13.0,41.0) | (13.0,31.0) (27.0,39.0) | (23.0,31.0) (25.0,35.0) |
     | ------------------------|-------------------------|-------------------------|------------------------ |
     | (-2.0,26.0) (24.0,38.0) | (-6.0,28.0) ( 6.0,42.0) | (18.0,26.0) (28.0,36.0) | (16.0,32.0) (26.0,32.0) |
     | ( 7.0,29.0) (37.0,39.0) | ( 7.0,29.0) (39.0,33.0) | (19.0,33.0) (45.0,35.0) | (21.0,37.0) (35.0,35.0) |
 2   | (-2.0,26.0) (24.0,38.0) | ( 2.0,24.0) (22.0,34.0) | (10.0,30.0) (24.0,38.0) | (16.0,32.0) (18.0,36.0) |
     | ( 5.0,25.0) (31.0,37.0) | ( 9.0,23.0) (37.0,29.0) | ( 9.0,33.0) (31.0,37.0) | (15.0,35.0) (21.0,37.0) |
     | ------------------------|-------------------------|-------------------------|------------------------ |
     | ( 1.0,27.0) (31.0,37.0) | (-3.0,29.0) (21.0,37.0) | ( 9.0,33.0) (31.0,37.0) | (23.0,31.0) (21.0,37.0) |
     | ( 4.0,28.0) (38.0,36.0) | ( 4.0,28.0) (28.0,36.0) | (20.0,30.0) (34.0,38.0) | (22.0,34.0) (28.0,36.0) |
 3   | ( 5.0,25.0) (27.0,39.0) | ( 1.0,27.0) (21.0,37.0) | (13.0,31.0) (27.0,39.0) | (19.0,33.0) (21.0,37.0) |
     | ( 0.0,30.0) (26.0,42.0) | (-8.0,34.0) (20.0,40.0) | (16.0,32.0) (30.0,40.0) | (26.0,32.0) (28.0,36.0) |
     *                                                                                                       *

The following is the 2 × 2 process grid:

B,D  |   0 2   | 1 3 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11
3    |         |

Local arrays for C:



p,q  |                        0                        |                        1
-----|-------------------------------------------------|-------------------------------------------------
     | (11.0,27.0) (37.0,39.0) (27.0,29.0) (37.0,39.0) | ( 7.0,29.0) (27.0,39.0) (21.0,37.0) (35.0,35.0)
     | ( 7.0,29.0) (37.0,39.0) (23.0,31.0) (45.0,35.0) | (11.0,27.0) (35.0,35.0) (21.0,37.0) (35.0,35.0)
     | ( 1.0,27.0) (31.0,37.0) ( 9.0,33.0) (27.0,39.0) | (-3.0,29.0) (21.0,37.0) (23.0,31.0) (21.0,37.0)
     | ( 6.0,32.0) (48.0,36.0) (22.0,34.0) (44.0,38.0) | (10.0,30.0) (42.0,34.0) (28.0,36.0) (38.0,36.0)
 0   | (-2.0,26.0) (24.0,38.0) (18.0,26.0) (28.0,36.0) | (-6.0,28.0) ( 6.0,42.0) (16.0,32.0) (26.0,32.0)
     | ( 7.0,29.0) (37.0,39.0) (19.0,33.0) (45.0,35.0) | ( 7.0,29.0) (39.0,33.0) (21.0,37.0) (35.0,35.0)
     | (-2.0,26.0) (24.0,38.0) (10.0,30.0) (24.0,38.0) | ( 2.0,24.0) (22.0,34.0) (16.0,32.0) (18.0,36.0)
     | ( 5.0,25.0) (31.0,37.0) ( 9.0,33.0) (31.0,37.0) | ( 9.0,23.0) (37.0,29.0) (15.0,35.0) (21.0,37.0)
-----|-------------------------------------------------|-------------------------------------------------
     | (-4.0,22.0) (10.0,40.0) ( 0.0,30.0) (10.0,40.0) | (-8.0,24.0) (12.0,34.0) (10.0,30.0) ( 8.0,36.0)
     | (11.0,27.0) (41.0,37.0) (15.0,35.0) (41.0,37.0) | (11.0,27.0) (43.0,31.0) (21.0,37.0) (31.0,37.0)
     | (-1.0,23.0) (25.0,35.0) (11.0,27.0) (17.0,39.0) | (-1.0,23.0) (11.0,37.0) (13.0,31.0) (15.0,35.0)
     | (-3.0,29.0) (23.0,41.0) (13.0,31.0) (27.0,39.0) | (-3.0,29.0) (13.0,41.0) (23.0,31.0) (25.0,35.0)
 1   | ( 1.0,27.0) (31.0,37.0) ( 9.0,33.0) (31.0,37.0) | (-3.0,29.0) (21.0,37.0) (23.0,31.0) (21.0,37.0)
     | ( 4.0,28.0) (38.0,36.0) (20.0,30.0) (34.0,38.0) | ( 4.0,28.0) (28.0,36.0) (22.0,34.0) (28.0,36.0)
     | ( 5.0,25.0) (27.0,39.0) (13.0,31.0) (27.0,39.0) | ( 1.0,27.0) (21.0,37.0) (19.0,33.0) (21.0,37.0)
     | ( 0.0,30.0) (26.0,42.0) (16.0,32.0) (30.0,40.0) | (-8.0,34.0) (20.0,40.0) (26.0,32.0) (28.0,36.0)

Example 3

This example computes C = betaC+alphaBA using a 2 × 2 process grid.

Note:
The imaginary parts of the diagonal elements of a complex Hermitian matrix are assumed to be zero, so you do not have to set these values.

Call Statements and Input


 ORDER = 'R'
 NPROW = 2
 NPCOL = 2
 CALL BLACS_GET(0, 0, ICONTXT)
 CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
 CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
              SIDE  UPLO   M   N     ALPHA     A  IA  JA   DESC_A   B  IB   JB
               |     |     |   |       |       |   |   |    |       |   |   |
 CALL PZHEMM( 'R' , 'U' , 16 , 8 ,   ALPHA   , A , 1 , 1 , DESC_A , B , 1 , 1 ,
 
              DESC_B    BETA     C  IC  JC   DESC_C
                |         |      |   |   |    |
              DESC_B ,  BETA   , C , 1 , 1 , DESC_C )
 
              ALPHA  =  (1.0, 0.0)
 
              BETA   =  (0.0, 0.0) 


         Desc_A            Desc_B            Desc_C
DTYPE_   1                 1                 1
CTXT_    icontxt (1)       icontxt (1)       icontxt (1)
M_       8                 16                16
N_       8                 8                 8
MB_      2                 4                 4
NB_      2                 2                 2
RSRC_    0                 0                 0
CSRC_    0                 0                 0
LLD_     See below (2)     See below (2)     See below (2)

Notes:

  1. icontxt is the output of the BLACS_GRIDINIT call.

  2. Each process should set the LLD_ as follows:
    LLD_A = MAX(1,NUMROC(M_A, MB_A, MYROW, RSRC_A, NPROW))
    LLD_B = MAX(1,NUMROC(M_B, MB_B, MYROW, RSRC_B, NPROW))
    LLD_C = MAX(1,NUMROC(M_C, MB_C, MYROW, RSRC_C, NPROW))
    

    In this example, LLD_A = 4 on all processes, and LLD_B = LLD_C = 8 on all processes.
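The LLD_ values quoted above can be checked by hand. The following is a minimal Python sketch of the block-cyclic counting rule behind the NUMROC calls in note 2. The function `numroc` here is an illustrative re-implementation with the same argument order as the call shown above; it is not the library routine.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Count the rows (or columns) of an n-long dimension, split into blocks
    of size nb, that land on process coordinate iproc when the blocks are
    dealt out cyclically starting at process isrcproc."""
    # Distance of this process from the process that owns the first block.
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb                      # number of complete blocks
    count = (nblocks // nprocs) * nb       # every process gets this many
    extra = nblocks % nprocs               # leftover complete blocks
    if mydist < extra:
        count += nb                        # one more complete block
    elif mydist == extra:
        count += n % nb                    # the trailing partial block, if any
    return count


# Values from this example: NPROW = 2, RSRC_ = 0 for all three matrices.
for myrow in range(2):
    lld_a = max(1, numroc(8, 2, myrow, 0, 2))    # M_A = 8,  MB_A = 2
    lld_b = max(1, numroc(16, 4, myrow, 0, 2))   # M_B = 16, MB_B = 4
    lld_c = max(1, numroc(16, 4, myrow, 0, 2))   # M_C = 16, MB_C = 4
    print(myrow, lld_a, lld_b, lld_c)
```

For both process rows the sketch gives LLD_A = 4 and LLD_B = LLD_C = 8, which agrees with the values stated above.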

Global Hermitian matrix A of order 8 with block size 2 × 2:



B,D               0                         1                        2                         3
     *                                                                                                      *
 0   | ( 0.0, 0.0) (-1.0, 0.0) | (-1.0, 0.0) ( 0.0, 1.0) |( 0.0, 1.0) ( 0.0, 1.0) | ( 0.0, 1.0) ( 0.0, 1.0) |
     |      .      ( 1.0, 0.0) | ( 0.0, 1.0) ( 1.0, 2.0) |( 0.0, 1.0) ( 1.0, 2.0) | ( 0.0, 1.0) ( 1.0, 2.0) |
     | ------------------------|-------------------------|------------------------|------------------------ |
 1   |      .           .      | (-1.0, 0.0) (-1.0, 0.0) |( 0.0, 1.0) ( 0.0, 1.0) | ( 1.0, 2.0) ( 0.0, 1.0) |
     |      .           .      |      .      (-1.0, 0.0) |( 1.0, 2.0) ( 1.0, 2.0) | ( 0.0, 1.0) ( 1.0, 2.0) |
     | ------------------------|-------------------------|------------------------|------------------------ |
 2   |      .           .      |      .           .      |(-1.0, 0.0) ( 0.0, 1.0) | ( 0.0, 1.0) ( 0.0, 1.0) |
     |      .           .      |      .           .      |     .      ( 1.0, 0.0) | ( 0.0, 1.0) ( 0.0, 1.0) |
     | ------------------------|-------------------------|------------------------|------------------------ |
 3   |      .           .      |      .           .      |     .           .      | ( 0.0, 0.0) ( 0.0, 1.0) |
     |      .           .      |      .           .      |     .           .      |      .      ( 0.0, 0.0) |
     *                                                                                                      *

The following is the 2 × 2 process grid:

B,D  |   0 2   | 1 3 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11
3    |         |

Local arrays for A:



p,q  |                        0                        |                        1
-----|-------------------------------------------------|-------------------------------------------------
     | ( 0.0,  . ) (-1.0, 0.0) ( 0.0, 1.0) ( 0.0, 1.0) | (-1.0, 0.0) ( 0.0, 1.0) ( 0.0, 1.0) ( 0.0, 1.0)
     |      .      ( 1.0,  . ) ( 0.0, 1.0) ( 1.0, 2.0) | ( 0.0, 1.0) ( 1.0, 2.0) ( 0.0, 1.0) ( 1.0, 2.0)
 0   |      .           .      (-1.0,  . ) ( 0.0, 1.0) |      .           .      ( 0.0, 1.0) ( 0.0, 1.0)
     |      .           .           .      ( 1.0,  . ) |      .           .      ( 0.0, 1.0) ( 0.0, 1.0)
-----|-------------------------------------------------|-------------------------------------------------
     |      .           .      ( 0.0, 1.0) ( 0.0, 1.0) | (-1.0,  . ) (-1.0, 0.0) ( 1.0, 2.0) ( 0.0, 1.0)
     |      .           .      ( 1.0, 2.0) ( 1.0, 2.0) |      .      (-1.0,  . ) ( 0.0, 1.0) ( 1.0, 2.0)
 1   |      .           .           .           .      |      .           .      ( 0.0,  . ) ( 0.0, 1.0)
     |      .           .           .           .      |      .           .           .      ( 0.0,  . )
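
The relationship between the global matrix A and its local arrays can be sketched as an index mapping. The Python fragment below is an illustration only: `global_to_local` is a hypothetical helper name, indices are 0-based (the tables above are 1-based), and the source process coordinate is assumed to be 0, as RSRC_ and CSRC_ are in this example.

```python
def global_to_local(g, nb, nprocs, src=0):
    """Map a 0-based global index g to (process coordinate, 0-based local
    index) under a block-cyclic distribution with block size nb."""
    block = g // nb                          # which block g falls in
    proc = (block + src) % nprocs            # blocks are dealt out cyclically
    local = (block // nprocs) * nb + g % nb  # earlier local blocks + offset
    return proc, local


# Global element A(3,7) in the 1-based numbering of the tables,
# that is, (2, 6) when counting from 0. MB_ = NB_ = 2 on a 2 x 2 grid.
prow, lrow = global_to_local(2, nb=2, nprocs=2)
pcol, lcol = global_to_local(6, nb=2, nprocs=2)
print("process", (prow, pcol), "local index", (lrow, lcol))
```

The sketch places this element on P11 at local position (0, 2), counting from 0. That is the first row, third column of the p = 1, q = 1 local array above, which holds (1.0, 2.0), the same value as in the global matrix.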

Global general 16 × 8 matrix B with block size 4 × 2:



B,D               0                         1                         2                         3
     *                                                                                                       *
     | (-1.0,-3.0) ( 0.0,-2.0) | ( 1.0,-1.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 1.0,-1.0) | (-1.0,-3.0) (-1.0,-3.0) |
     | (-1.0,-3.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 0.0,-2.0) | ( 1.0,-1.0) (-1.0,-3.0) | (-1.0,-3.0) ( 1.0,-1.0) |
 0   | ( 1.0,-1.0) ( 1.0,-1.0) | (-1.0,-3.0) ( 0.0,-2.0) | (-1.0,-3.0) ( 0.0,-2.0) | ( 1.0,-1.0) ( 0.0,-2.0) |
     | ( 0.0,-2.0) (-1.0,-3.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) (-1.0,-3.0) |
     | ------------------------|-------------------------|-------------------------|------------------------ |
     | ( 0.0,-2.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 1.0,-1.0) | ( 1.0,-1.0) ( 0.0,-2.0) |
     | ( 0.0,-2.0) ( 0.0,-2.0) | ( 1.0,-1.0) ( 0.0,-2.0) | (-1.0,-3.0) (-1.0,-3.0) | ( 0.0,-2.0) ( 0.0,-2.0) |
 1   | ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) (-1.0,-3.0) |
     | ( 0.0,-2.0) ( 0.0,-2.0) | (-1.0,-3.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 1.0,-1.0) |
     | ------------------------|-------------------------|-------------------------|------------------------ |
     | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 1.0,-1.0) |
     | (-1.0,-3.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 0.0,-2.0) | ( 0.0,-2.0) (-1.0,-3.0) | ( 0.0,-2.0) ( 1.0,-1.0) |
 2   | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 1.0,-1.0) | ( 1.0,-1.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) |
     | ( 0.0,-2.0) ( 0.0,-2.0) | ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) (-1.0,-3.0) | ( 0.0,-2.0) ( 0.0,-2.0) |
     | ------------------------|-------------------------|-------------------------|------------------------ |
     | ( 1.0,-1.0) ( 1.0,-1.0) | (-1.0,-3.0) ( 0.0,-2.0) | (-1.0,-3.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 1.0,-1.0) |
     | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 1.0,-1.0) ( 0.0,-2.0) | ( 0.0,-2.0) (-1.0,-3.0) |
 3   | ( 0.0,-2.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) |
     | (-1.0,-3.0) ( 0.0,-2.0) | (-1.0,-3.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 1.0,-1.0) | ( 1.0,-1.0) ( 0.0,-2.0) |
     *                                                                                                       *

The following is the 2 × 2 process grid:

B,D  |   0 2   | 1 3 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11
3    |         |

Local arrays for B:



p,q  |                        0                        |                        1
-----|-------------------------------------------------|-------------------------------------------------
     | (-1.0,-3.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 1.0,-1.0) | ( 1.0,-1.0) (-1.0,-3.0) (-1.0,-3.0) (-1.0,-3.0)
     | (-1.0,-3.0) (-1.0,-3.0) ( 1.0,-1.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 0.0,-2.0) (-1.0,-3.0) ( 1.0,-1.0)
     | ( 1.0,-1.0) ( 1.0,-1.0) (-1.0,-3.0) ( 0.0,-2.0) | (-1.0,-3.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0)
     | ( 0.0,-2.0) (-1.0,-3.0) ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) (-1.0,-3.0)
 0   | ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) (-1.0,-3.0) ( 0.0,-2.0) ( 1.0,-1.0)
     | (-1.0,-3.0) (-1.0,-3.0) ( 0.0,-2.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0)
     | ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0) ( 0.0,-2.0)
     | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 1.0,-1.0) ( 0.0,-2.0) ( 0.0,-2.0)
-----|-------------------------------------------------|-------------------------------------------------
     | ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 1.0,-1.0) ( 1.0,-1.0) ( 0.0,-2.0)
     | ( 0.0,-2.0) ( 0.0,-2.0) (-1.0,-3.0) (-1.0,-3.0) | ( 1.0,-1.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0)
     | ( 1.0,-1.0) ( 1.0,-1.0) ( 1.0,-1.0) ( 1.0,-1.0) | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) (-1.0,-3.0)
     | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0) | (-1.0,-3.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0)
 1   | ( 1.0,-1.0) ( 1.0,-1.0) (-1.0,-3.0) (-1.0,-3.0) | (-1.0,-3.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 1.0,-1.0)
     | ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) (-1.0,-3.0)
     | ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0) ( 0.0,-2.0) | ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 0.0,-2.0)
     | (-1.0,-3.0) ( 0.0,-2.0) ( 0.0,-2.0) ( 1.0,-1.0) | (-1.0,-3.0) ( 0.0,-2.0) ( 1.0,-1.0) ( 0.0,-2.0)
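
The local arrays for B can be derived from the global matrix by listing, for each process, the global rows and columns it owns. The following Python sketch shows one way to do that. The helper names `owned_indices` and `local_part` are hypothetical, indices are 0-based, and the source process coordinates are assumed to be 0 as in this example. A matrix of (row, column) labels stands in for the values of B.

```python
def owned_indices(n, nb, iproc, nprocs, src=0):
    """Return the 0-based global indices held by process coordinate iproc,
    in the order they appear in its local array."""
    return [g for g in range(n) if ((g // nb) + src) % nprocs == iproc]


def local_part(global_matrix, mb, nb, myrow, mycol, nprow, npcol):
    """Extract the local array of process (myrow, mycol) from a global
    matrix given as a list of rows."""
    rows = owned_indices(len(global_matrix), mb, myrow, nprow)
    cols = owned_indices(len(global_matrix[0]), nb, mycol, npcol)
    return [[global_matrix[r][c] for c in cols] for r in rows]


# A 16 x 8 matrix of (row, column) labels, distributed like B:
# block size 4 x 2 on a 2 x 2 process grid.
labels = [[(r, c) for c in range(8)] for r in range(16)]
p10 = local_part(labels, mb=4, nb=2, myrow=1, mycol=0, nprow=2, npcol=2)
print(owned_indices(16, 4, 1, 2))  # global rows held by process row 1
print(owned_indices(8, 2, 0, 2))   # global columns held by process column 0
print(p10[0])                      # first local row on P10
```

Under these assumptions, process row 1 holds global rows 4 to 7 and 12 to 15, and process column 0 holds global columns 0, 1, 4, and 5 (all 0-based), which is the layout of the p = 1, q = 0 local array shown above.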

Output:

Global general 16 × 8 matrix C with block size 4 × 2:



B,D                 0                             1                             2                             3
     *                                                                                                                       *
     | (-12.0,  4.0) (-19.0, -5.0) | ( -9.0,  5.0) ( -5.0, -5.0) | (  3.0, -3.0) (  9.0, -5.0) | ( 10.0,  2.0) ( 18.0, -6.0) |
     | (-10.0,  4.0) (-17.0, -7.0) | ( -9.0,  3.0) ( -5.0, -9.0) | (  3.0, -1.0) (  9.0, -9.0) | ( 14.0, -2.0) ( 20.0, -8.0) |
 0   | (-10.0,  4.0) (-19.0, -5.0) | ( -7.0,  5.0) (-11.0,  1.0) | (  5.0,  1.0) ( 11.0, -5.0) | ( 12.0, -4.0) ( 17.0, -1.0) |
     | (-10.0,  6.0) (-22.0, -6.0) | ( -8.0,  4.0) (-10.0, -6.0) | (  4.0,  0.0) ( 10.0, -8.0) | ( 12.0, -2.0) ( 19.0, -7.0) |
     | ----------------------------|-----------------------------|-----------------------------|---------------------------- |
     | ( -8.0,  0.0) (-10.0, -8.0) | ( -6.0,  2.0) ( -6.0, -4.0) | (  4.0,  2.0) ( 10.0,  0.0) | (  9.0,  1.0) ( 14.0,  4.0) |
     | (-13.0,  5.0) (-21.0, -5.0) | (-11.0,  5.0) (-15.0, -3.0) | (  3.0,  3.0) (  9.0, -7.0) | ( 13.0, -1.0) ( 19.0, -5.0) |
 1   | (-10.0,  2.0) (-17.0, -7.0) | ( -9.0,  3.0) ( -7.0, -1.0) | (  1.0,  1.0) (  7.0,  1.0) | (  7.0,  3.0) ( 14.0,  2.0) |
     | ( -7.0,  3.0) (-13.0, -7.0) | ( -5.0,  3.0) ( -1.0, -5.0) | (  7.0, -3.0) ( 13.0, -7.0) | ( 13.0, -5.0) ( 18.0, -4.0) |
     | ----------------------------|-----------------------------|-----------------------------|---------------------------- |
     | ( -8.0,  2.0) (-14.0, -8.0) | ( -4.0,  2.0) (  2.0, -6.0) | (  6.0, -6.0) ( 12.0, -8.0) | ( 12.0, -2.0) ( 17.0, -5.0) |
     | (-10.0,  4.0) (-17.0, -7.0) | ( -7.0,  3.0) ( -7.0, -9.0) | (  5.0, -1.0) ( 11.0,-11.0) | ( 15.0, -3.0) ( 20.0, -8.0) |
 2   | ( -8.0,  2.0) (-14.0, -8.0) | ( -8.0,  2.0) ( -6.0, -6.0) | (  2.0,  2.0) (  8.0, -2.0) | ( 10.0,  0.0) ( 16.0,  0.0) |
     | (-11.0,  3.0) (-17.0, -7.0) | (-11.0,  3.0) (-13.0, -5.0) | (  1.0,  5.0) (  7.0, -3.0) | ( 11.0,  1.0) ( 17.0, -1.0) |
     | ----------------------------|-----------------------------|-----------------------------|---------------------------- |
     | (-10.0,  4.0) (-19.0, -5.0) | ( -7.0,  5.0) (-11.0,  1.0) | (  5.0,  1.0) ( 11.0, -7.0) | ( 14.0, -6.0) ( 18.0, -2.0) |
     | (-10.0,  4.0) (-20.0, -6.0) | ( -8.0,  4.0) ( -8.0, -4.0) | (  2.0,  0.0) (  8.0, -4.0) | ( 10.0,  0.0) ( 17.0, -3.0) |
 3   | (-11.0,  3.0) (-17.0, -5.0) | ( -9.0,  5.0) ( -9.0, -1.0) | (  3.0,  1.0) (  9.0, -3.0) | ( 11.0, -1.0) ( 17.0, -1.0) |
     | ( -7.0,  3.0) (-14.0, -6.0) | ( -2.0,  4.0) ( -2.0, -6.0) | (  8.0, -4.0) ( 14.0, -8.0) | ( 13.0, -5.0) ( 18.0, -4.0) |
     *                                                                                                                       *

The following is the 2 × 2 process grid:

B,D  |   0 2   | 1 3 
-----| ------- |-----
0    |   P00   |  P01
2    |         |
-----| ------- |-----
1    |   P10   |  P11
3    |         |

Local arrays for C:


p,q  |                            0                            |                            1
-----|---------------------------------------------------------|---------------------------------------------------------
     | (-12.0,  4.0) (-19.0, -5.0) (  3.0, -3.0) (  9.0, -5.0) | ( -9.0,  5.0) ( -5.0, -5.0) ( 10.0,  2.0) ( 18.0, -6.0)
     | (-10.0,  4.0) (-17.0, -7.0) (  3.0, -1.0) (  9.0, -9.0) | ( -9.0,  3.0) ( -5.0, -9.0) ( 14.0, -2.0) ( 20.0, -8.0)
     | (-10.0,  4.0) (-19.0, -5.0) (  5.0,  1.0) ( 11.0, -5.0) | ( -7.0,  5.0) (-11.0,  1.0) ( 12.0, -4.0) ( 17.0, -1.0)
     | (-10.0,  6.0) (-22.0, -6.0) (  4.0,  0.0) ( 10.0, -8.0) | ( -8.0,  4.0) (-10.0, -6.0) ( 12.0, -2.0) ( 19.0, -7.0)
 0   | ( -8.0,  2.0) (-14.0, -8.0) (  6.0, -6.0) ( 12.0, -8.0) | ( -4.0,  2.0) (  2.0, -6.0) ( 12.0, -2.0) ( 17.0, -5.0)
     | (-10.0,  4.0) (-17.0, -7.0) (  5.0, -1.0) ( 11.0,-11.0) | ( -7.0,  3.0) ( -7.0, -9.0) ( 15.0, -3.0) ( 20.0, -8.0)
     | ( -8.0,  2.0) (-14.0, -8.0) (  2.0,  2.0) (  8.0, -2.0) | ( -8.0,  2.0) ( -6.0, -6.0) ( 10.0,  0.0) ( 16.0,  0.0)
     | (-11.0,  3.0) (-17.0, -7.0) (  1.0,  5.0) (  7.0, -3.0) | (-11.0,  3.0) (-13.0, -5.0) ( 11.0,  1.0) ( 17.0, -1.0)
-----|---------------------------------------------------------|---------------------------------------------------------
     | ( -8.0,  0.0) (-10.0, -8.0) (  4.0,  2.0) ( 10.0,  0.0) | ( -6.0,  2.0) ( -6.0, -4.0) (  9.0,  1.0) ( 14.0,  4.0)
     | (-13.0,  5.0) (-21.0, -5.0) (  3.0,  3.0) (  9.0, -7.0) | (-11.0,  5.0) (-15.0, -3.0) ( 13.0, -1.0) ( 19.0, -5.0)
     | (-10.0,  2.0) (-17.0, -7.0) (  1.0,  1.0) (  7.0,  1.0) | ( -9.0,  3.0) ( -7.0, -1.0) (  7.0,  3.0) ( 14.0,  2.0)
     | ( -7.0,  3.0) (-13.0, -7.0) (  7.0, -3.0) ( 13.0, -7.0) | ( -5.0,  3.0) ( -1.0, -5.0) ( 13.0, -5.0) ( 18.0, -4.0)
 1   | (-10.0,  4.0) (-19.0, -5.0) (  5.0,  1.0) ( 11.0, -7.0) | ( -7.0,  5.0) (-11.0,  1.0) ( 14.0, -6.0) ( 18.0, -2.0)
     | (-10.0,  4.0) (-20.0, -6.0) (  2.0,  0.0) (  8.0, -4.0) | ( -8.0,  4.0) ( -8.0, -4.0) ( 10.0,  0.0) ( 17.0, -3.0)
     | (-11.0,  3.0) (-17.0, -5.0) (  3.0,  1.0) (  9.0, -3.0) | ( -9.0,  5.0) ( -9.0, -1.0) ( 11.0, -1.0) ( 17.0, -1.0)
     | ( -7.0,  3.0) (-14.0, -6.0) (  8.0, -4.0) ( 14.0, -8.0) | ( -2.0,  4.0) ( -2.0, -6.0) ( 13.0, -5.0) ( 18.0, -4.0)
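
The first row of the output can be cross-checked with a small serial computation. The Python sketch below is an illustration under stated assumptions, not a description of how PZHEMM is implemented. It forms one row of BA from the upper triangle of A only, takes ALPHA = (1.0, 0.0) and BETA = (0.0, 0.0) as in this example, and uses the hypothetical helper name `hemm_right_row`. The imaginary parts on the diagonal of the stored triangle are deliberately set to a junk value (9.0) to show that they are not used.

```python
_ = None  # entries below the diagonal are never read


def hemm_right_row(b_row, a_upper):
    """Return one row of B*A, where A is Hermitian and only its upper
    triangle (including the diagonal) is stored in a_upper."""
    n = len(b_row)

    def a(k, j):
        if k == j:
            # Diagonal of a Hermitian matrix: imaginary part taken as zero.
            return complex(a_upper[k][k].real, 0.0)
        if k < j:
            return a_upper[k][j]
        # Lower triangle is the conjugate of the mirrored upper entry.
        return a_upper[j][k].conjugate()

    return [sum(b_row[k] * a(k, j) for k in range(n)) for j in range(n)]


# Upper triangle of the global matrix A from this example.
A_UPPER = [
    [0+9j, -1+0j, -1+0j, 1j,    1j,    1j,   1j,   1j],
    [_,    1+9j,  1j,    1+2j,  1j,    1+2j, 1j,   1+2j],
    [_,    _,     -1+9j, -1+0j, 1j,    1j,   1+2j, 1j],
    [_,    _,     _,     -1+9j, 1+2j,  1+2j, 1j,   1+2j],
    [_,    _,     _,     _,     -1+9j, 1j,   1j,   1j],
    [_,    _,     _,     _,     _,     1+9j, 1j,   1j],
    [_,    _,     _,     _,     _,     _,    0+9j, 1j],
    [_,    _,     _,     _,     _,     _,    _,    0+9j],
]

# First row of the global matrix B from this example.
B_ROW0 = [-1-3j, -2j, 1-1j, -1-3j, 1-1j, 1-1j, -1-3j, -1-3j]

c_row0 = hemm_right_row(B_ROW0, A_UPPER)
print(c_row0)
```

The values computed this way are (-12.0, 4.0), (-19.0, -5.0), (-9.0, 5.0), (-5.0, -5.0), (3.0, -3.0), (9.0, -5.0), (10.0, 2.0), and (18.0, -6.0), which match the first row of the global matrix C shown in the output.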

