PDGTTRF factors the general tridiagonal matrix A, stored in tridiagonal storage mode, using Gaussian elimination with partial pivoting.
PDDTTRF factors the diagonally dominant general tridiagonal matrix A, stored in tridiagonal storage mode, using Gaussian elimination.
In these subroutine descriptions, A represents the global square general tridiagonal submatrix A(ia:ia+n-1, ia:ia+n-1).
To solve a tridiagonal system of linear equations with multiple right-hand sides, follow the call to PDGTTRF or PDDTTRF with one or more calls to PDGTTRS or PDDTTRS, respectively. The output from these factorization subroutines should be used only as input to the solve subroutines PDGTTRS and PDDTTRS, respectively.
If n = 0, no computation is performed and the subroutine returns after doing some parameter checking. See reference [51].
Table 76. Data Types
dl, d, du, du2, af, work | ipiv | Subroutine |
---|---|---|
Long-precision real | Integer | PDGTTRF and PDDTTRF |
Fortran | CALL PDGTTRF (n, dl, d, du, du2, ia, desc_a, ipiv, af, laf, work, lwork, info) |
Fortran | CALL PDDTTRF (n, dl, d, du, ia, desc_a, af, laf, work, lwork, info) |
C and C++ | pdgttrf (n, dl, d, du, du2, ia, desc_a, ipiv, af, laf, work, lwork, info); |
C and C++ | pddttrf (n, dl, d, du, ia, desc_a, af, laf, work, lwork, info); |
Scope: global
Specified as: a fullword integer, where:
where p is the number of processes in a process grid.
The global vector dl contains the subdiagonal of the global general tridiagonal submatrix A in elements ia+1 through ia+n-1.
Scope: local
Specified as: a one-dimensional array of (at least) length LOCp(ia+n-1), containing numbers of the data type indicated in Table 76. Details about block-cyclic data distribution of global matrix A are stored in desc_a.
On output, DL is overwritten; that is, the original input is not preserved.
The global vector d contains the main diagonal of the global general tridiagonal submatrix A in elements ia through ia+n-1.
Scope: local
Specified as: a one-dimensional array of (at least) length LOCp(ia+n-1), containing numbers of the data type indicated in Table 76. Details about block-cyclic data distribution of global matrix A are stored in desc_a.
On output, D is overwritten; that is, the original input is not preserved.
The global vector du contains the superdiagonal of the global general tridiagonal submatrix A in elements ia through ia+n-2.
Scope: local
Specified as: a one-dimensional array of (at least) length LOCp(ia+n-1), containing numbers of the data type indicated in Table 76. Details about block-cyclic data distribution of global matrix A are stored in desc_a.
On output, DU is overwritten; that is, the original input is not preserved.
Scope: global
Specified as: a fullword integer, where:
Table 77. Type-502 Array Descriptor
desc_a | Name | Description | Limits | Scope |
---|---|---|---|---|
1 | DTYPE_A | Descriptor type | DTYPE_A = 502 for p × 1 or 1 × p, where p is the number of processes in a process grid | Global |
2 | CTXT_A | BLACS context | Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP | Global |
3 | M_A | Number of rows in the global matrix | If n = 0: M_A >= 0. Otherwise: M_A >= 1 | Global |
4 | MB_A | Row block size | MB_A >= 1 and 0 <= n <= (MB_A)(p)-mod(ia-1,MB_A) | Global |
5 | RSRC_A | The process row over which the first row of the global matrix is distributed | 0 <= RSRC_A < p | Global |
6 | -- | Not used by these subroutines. | -- | -- |
7 | -- | Reserved | -- | -- |
Specified as: an array of (at least) length 7, containing fullword integers.
Table 78. Type-1 Array Descriptor (p × 1 Process Grid)
desc_a | Name | Description | Limits | Scope |
---|---|---|---|---|
1 | DTYPE_A | Descriptor type | DTYPE_A = 1 for p × 1, where p is the number of processes in a process grid | Global |
2 | CTXT_A | BLACS context | Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP | Global |
3 | M_A | Number of rows in the global matrix | If n = 0: M_A >= 0. Otherwise: M_A >= 1 | Global |
4 | N_A | Number of columns in the global matrix | N_A = 1 | |
5 | MB_A | Row block size | MB_A >= 1 and 0 <= n <= (MB_A)(p)-mod(ia-1,MB_A) | Global |
6 | NB_A | Column block size | NB_A >= 1 | Global |
7 | RSRC_A | The process row over which the first row of the global matrix is distributed | 0 <= RSRC_A < p | Global |
8 | CSRC_A | The process column over which the first column of the global matrix is distributed | CSRC_A = 0 | Global |
9 | -- | Not used by these subroutines. | -- | -- |
Specified as: an array of (at least) length 9, containing fullword integers.
Table 79. Type-501 Array Descriptor
desc_a | Name | Description | Limits | Scope |
---|---|---|---|---|
1 | DTYPE_A | Descriptor type | DTYPE_A = 501 for 1 × p or p × 1, where p is the number of processes in a process grid | Global |
2 | CTXT_A | BLACS context | Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP | Global |
3 | N_A | Number of columns in the global matrix | If n = 0: N_A >= 0. Otherwise: N_A >= 1 | Global |
4 | NB_A | Column block size | NB_A >= 1 and 0 <= n <= (NB_A)(p)-mod(ia-1,NB_A) | Global |
5 | CSRC_A | The process column over which the first column of the global matrix is distributed | 0 <= CSRC_A < p | Global |
6 | -- | Not used by these subroutines. | -- | -- |
7 | -- | Reserved | -- | -- |
Specified as: an array of (at least) length 7, containing fullword integers.
Table 80. Type-1 Array Descriptor (1 × p Process Grid)
desc_a | Name | Description | Limits | Scope |
---|---|---|---|---|
1 | DTYPE_A | Descriptor type | DTYPE_A = 1 for 1 × p, where p is the number of processes in a process grid | Global |
2 | CTXT_A | BLACS context | Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP | Global |
3 | M_A | Number of rows in the global matrix | M_A = 1 | Global |
4 | N_A | Number of columns in the global matrix | If n = 0: N_A >= 0. Otherwise: N_A >= 1 | Global |
5 | MB_A | Row block size | MB_A >= 1 | Global |
6 | NB_A | Column block size | NB_A >= 1 and 0 <= n <= (NB_A)(p)-mod(ia-1,NB_A) | Global |
7 | RSRC_A | The process row over which the first row of the global matrix is distributed | RSRC_A = 0 | Global |
8 | CSRC_A | The process column over which the first column of the global matrix is distributed | 0 <= CSRC_A < p | Global |
9 | -- | Not used by these subroutines. | -- | -- |
Specified as: an array of (at least) length 9, containing fullword integers.
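The block-size limit that appears in each descriptor table can be checked before the call. The following is a minimal sketch in C, not part of the library: the function name block_size_limit_ok is hypothetical, the argument block stands for MB_A or NB_A (whichever the descriptor type uses), and the requirement ia >= 1 is an assumption, because ia is a 1-based global index.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not a Parallel ESSL routine).
 * Returns true when the arguments satisfy the limit quoted in the
 * descriptor tables:
 *     block >= 1  and  0 <= n <= block * p - mod(ia - 1, block)
 * block is MB_A or NB_A, p is the number of processes in the grid,
 * and ia is assumed to be a 1-based global index. */
bool block_size_limit_ok(int n, int ia, int block, int p)
{
    assert(p >= 1 && ia >= 1);   /* assumptions of this sketch */
    if (block < 1 || n < 0) {
        return false;
    }
    return n <= block * p - (ia - 1) % block;
}
```

With the values used in the examples on this page (n = 12, ia = 1, block size 4, p = 3), the limit is met with equality, so the check passes. Moving ia to 2 with the same n would violate it.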
Scope: local
Specified as: a fullword integer, where:
If (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502:
where, in the above formulas, P is the actual number of processes containing data.
If (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501, you would substitute NB_A in place of MB_A in the formulas above.
If lwork = 0, work is ignored.
If lwork <> 0, work is the work area used by this subroutine, where:
Scope: local
Specified as: an area of storage containing numbers of data type indicated in Table 76.
Scope:
Specified as: a fullword integer; where:
For PDGTTRF, lwork >= 10P
For PDDTTRF, lwork >= 8P
where, in the above formulas, P is the actual number of processes containing data.
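The two minimum sizes above can be captured in a small helper. This is only a sketch: the enum and the function name min_lwork are hypothetical, and the caller is assumed to already know P, the actual number of processes containing data.

```c
#include <assert.h>

/* Hypothetical names, introduced only for this illustration. */
enum factor_routine { ROUTINE_PDGTTRF, ROUTINE_PDDTTRF };

/* Smallest lwork that satisfies the bounds quoted above:
 * 10P for PDGTTRF and 8P for PDDTTRF, where P is the actual
 * number of processes containing data. */
int min_lwork(enum factor_routine routine, int P)
{
    assert(P >= 1);   /* assumption of this sketch */
    return (routine == ROUTINE_PDGTTRF ? 10 : 8) * P;
}
```

For a case in which all three processes of a 3 × 1 grid hold data, this gives 30 for PDGTTRF and 24 for PDDTTRF.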
Scope: local
Returned as: a one-dimensional array of (at least) length LOCp(ia+n-1), containing numbers of the data type indicated in Table 76.
On output, DL is overwritten; that is, the original input is not preserved.
Scope: local
Returned as: a one-dimensional array of (at least) length LOCp(ia+n-1), containing numbers of the data type indicated in Table 76.
On output, D is overwritten; that is, the original input is not preserved.
Scope: local
Returned as: a one-dimensional array of (at least) length LOCp(ia+n-1), containing numbers of the data type indicated in Table 76.
On output, DU is overwritten; that is, the original input is not preserved.
Scope: local
Returned as: a one-dimensional array of (at least) length LOCp(ia+n-1), containing numbers of the data type indicated in Table 76.
Scope: local
Returned as: an array of (at least) length LOCp(ia+n-1), containing fullword integers. There is no array descriptor for ipiv. The details about the block data distribution of global vector ipiv are stored in desc_a.
Scope: local
Returned as: a one-dimensional array of (at least) length laf, containing numbers of the data type indicated in Table 76.
If lwork <> 0 and lwork <> -1, the size of work is (at least) of length lwork.
If lwork = -1, the size of work is (at least) of length 1.
Scope: local
Returned as: an area of storage, containing numbers of data type indicated in Table 76, where:
Except for work(1), the contents of work are overwritten on return.
If info = 0, the factorization or work area query completed successfully.
If 1 <= info <= p, the portion of the global submatrix A that is stored on process info-1 and factored locally is singular or reducible (for PDGTTRF), or not diagonally dominant (for PDDTTRF). The magnitude of a pivot element was zero or too small.
If info > p, the portion of the global submatrix A that is stored on process info-p-1 and represents interactions with other processes is singular or reducible (for PDGTTRF), or not diagonally dominant (for PDDTTRF). The magnitude of a pivot element was zero or too small.
If info > 0, the factorization is completed; however, if you call PDGTTRS/PDDTTRS with these factors, results are unpredictable.
Scope: global
Returned as: a fullword integer; info >= 0.
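The three ranges of info described above can be decoded mechanically. The sketch below is illustrative only; the type and function names are hypothetical, and p is the number of processes in the process grid.

```c
#include <assert.h>

/* Hypothetical types and names, introduced only for this illustration. */
enum info_kind {
    INFO_OK,                  /* info = 0                              */
    INFO_LOCAL_FACTORIZATION, /* 1 <= info <= p: locally factored part */
    INFO_INTERPROCESS         /* info > p: interactions between processes */
};

struct info_report {
    enum info_kind kind;
    int process;              /* 0-based process index, or -1 if info = 0 */
};

/* Maps a returned info value to the category and process described in
 * the text: process info-1 when 1 <= info <= p, and process info-p-1
 * when info > p. */
struct info_report decode_info(int info, int p)
{
    struct info_report r = { INFO_OK, -1 };
    assert(info >= 0 && p >= 1);   /* info >= 0 on return, per the text */
    if (info == 0) {
        return r;
    }
    if (info <= p) {
        r.kind = INFO_LOCAL_FACTORIZATION;
        r.process = info - 1;
    } else {
        r.kind = INFO_INTERPROCESS;
        r.process = info - p - 1;
    }
    return r;
}
```

For example, with p = 3, info = 2 points at the locally factored portion on process 1, and info = 5 points at the interprocess portion associated with process 1.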
The factored matrix A is stored in an internal format that depends on the number of processes.
The format of the output from PDDTTRF has changed. Therefore, the factorization and solve must be performed using Parallel ESSL Version 2 Release 1.2, or later.
The scalar data specified for input argument n must be the same for both PDGTTRF/PDDTTRF and PDGTTRS/PDDTTRS.
The global vectors for dl, d, du, du2, and af input to PDGTTRS/PDDTTRS must be the same as the corresponding output arguments for PDGTTRF/PDDTTRF; and thus, the scalar data specified for ia, desc_a, and laf must also be the same.
DTYPE_A | Process Grid |
---|---|
501 | p × 1 or 1 × p |
502 | p × 1 or 1 × p |
1 | p × 1 or 1 × p |
For more information on using block-cyclic data distribution, see Specifying Block-Cyclically-Distributed Matrices for the Banded Linear Algebraic Equations.
Matrix A is a singular or reducible matrix (for PDGTTRF), or not diagonally dominant (for PDDTTRF). For details, see the description of the info argument.
Unable to allocate workspace
If (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501:
If the process grid is 1 × p and DTYPE_A = 1:
If (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502:
If the process grid is p × 1 and DTYPE_A = 1:
In all cases:
Each of the following global input arguments is checked to determine whether its value is the same on all processes in the process grid:
If DTYPE_A = 1 on all processes:
If DTYPE_A = 501 on all processes:
If DTYPE_A = 502 on all processes:
Also:
This example shows a factorization of the general tridiagonal matrix A of order 12.
*                                                            *
| 2.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
| 1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
| 0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
| 0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
| 0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0 |
| 0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0 |
| 0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0 |
| 0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0 |
| 0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0 |
| 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0 |
| 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0 |
| 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0 |
*                                                            *
Matrix A is stored in tridiagonal storage mode and is distributed over a 3 × 1 process grid using block-cyclic distribution.
Notes:
ORDER = 'R'
NPROW = 3
NPCOL = 1
CALL BLACS_GET (0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)

              N    DL   D   DU   DU2  IA   DESC_A   IPIV   AF  LAF   WORK   LWORK   INFO
              |    |    |   |     |    |     |       |     |    |     |       |      |
CALL PDGTTRF( 12 , DL , D , DU , DU2 , 1 , DESC_A , IPIV , AF , 48 , WORK ,   0   , INFO )
| Desc_A |
---|---|
DTYPE_ | 502 |
CTXT_ | icontxt |
M_ | 12 |
MB_ | 4 |
RSRC_ | 0 |
Not used | -- |
Reserved | -- |
Notes: |
Global vector dl with block size of 4:
B,D       0
      *        *
      |   .    |
      |  1.0   |
  0   |  1.0   |
      |  1.0   |
      |  ----  |
      |  1.0   |
      |  1.0   |
  1   |  1.0   |
      |  1.0   |
      |  ----  |
      |  1.0   |
      |  1.0   |
  2   |  1.0   |
      |  1.0   |
      *        *
Global vector d with block size of 4:
B,D       0
      *        *
      |  2.0   |
      |  3.0   |
  0   |  3.0   |
      |  3.0   |
      |  ----  |
      |  3.0   |
      |  3.0   |
  1   |  3.0   |
      |  3.0   |
      |  ----  |
      |  3.0   |
      |  3.0   |
  2   |  3.0   |
      |  3.0   |
      *        *
Global vector du with block size of 4:
B,D       0
      *        *
      |  2.0   |
      |  2.0   |
  0   |  2.0   |
      |  2.0   |
      |  ----  |
      |  2.0   |
      |  2.0   |
  1   |  2.0   |
      |  2.0   |
      |  ----  |
      |  2.0   |
      |  2.0   |
  2   |  2.0   |
      |   .    |
      *        *
The following is the 3 × 1 process grid:
B,D  |    0
-----| -------
0    |   P00
-----| -------
1    |   P10
-----| -------
2    |   P20
-----| -------
Local array DL with block size of 4:
p,q  |  0
-----|-----
     |  .
     | 1.0
 0   | 1.0
     | 1.0
-----|-----
     | 1.0
     | 1.0
 1   | 1.0
     | 1.0
-----|-----
     | 1.0
     | 1.0
 2   | 1.0
     | 1.0
Local array D with block size of 4:
p,q  |  0
-----|-----
     | 2.0
     | 3.0
 0   | 3.0
     | 3.0
-----|-----
     | 3.0
     | 3.0
 1   | 3.0
     | 3.0
-----|-----
     | 3.0
     | 3.0
 2   | 3.0
     | 3.0
Local array DU with block size of 4:
p,q  |  0
-----|-----
     | 2.0
     | 2.0
 0   | 2.0
     | 2.0
-----|-----
     | 2.0
     | 2.0
 1   | 2.0
     | 2.0
-----|-----
     | 2.0
     | 2.0
 2   | 2.0
     |  .
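The correspondence between the global vectors and the local arrays above follows the usual one-dimensional block-cyclic rule. The following C sketch is an illustration of that rule rather than a library routine: the names location and locate are hypothetical, global and local indices are 1-based, and rsrc is the process that holds the first block (RSRC_A = 0 in this example).

```c
#include <assert.h>

/* Hypothetical type and function, introduced only for this illustration. */
struct location {
    int process;      /* 0-based process that owns the element  */
    int local_index;  /* 1-based position in that local array   */
};

/* One-dimensional block-cyclic mapping of the 1-based global index g,
 * with block size mb, p processes, and the first block on process rsrc. */
struct location locate(int g, int mb, int p, int rsrc)
{
    struct location loc;
    int block;

    assert(g >= 1 && mb >= 1 && p >= 1 && rsrc >= 0 && rsrc < p);
    block = (g - 1) / mb;                  /* 0-based global block number */
    loc.process = (rsrc + block) % p;      /* blocks are dealt out cyclically */
    loc.local_index = (block / p) * mb     /* full local blocks before it */
                    + (g - 1) % mb + 1;    /* offset inside the block     */
    return loc;
}
```

With a block size of 4 on 3 processes, global elements 1 through 4 land on process 0, elements 5 through 8 on process 1, and elements 9 through 12 on process 2, each at local positions 1 through 4, which is the layout shown in the local arrays.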
Output:
Global vector dl with block size of 4:
B,D       0
      *        *
      |   .    |
      |  0.5   |
  0   |  0.5   |
      |  0.5   |
      |  ----  |
      |  1.0   |
      |  0.33  |
  1   |  0.43  |
      |  0.47  |
      |  ----  |
      |  1.0   |
      |  1.0   |
  2   |  1.0   |
      |  1.0   |
      *        *
Global vector d with block size of 4:
B,D       0
      *        *
      |  0.5   |
      |  0.5   |
  0   |  0.5   |
      |  2.0   |
      |  ----  |
      |  0.33  |
      |  0.43  |
  1   |  0.47  |
      |  2.07  |
      |  ----  |
      |  2.07  |
      |  0.47  |
  2   |  0.43  |
      |  0.33  |
      *        *
Global vector du with block size of 4:
B,D       0
      *        *
      |  2.0   |
      |  2.0   |
  0   |  2.0   |
      |  2.0   |
      |  ----  |
      |  2.0   |
      |  2.0   |
  1   |  2.0   |
      |  2.0   |
      |  ----  |
      |  0.93  |
      |  0.86  |
  2   |  0.67  |
      |   .    |
      *        *
Global vector ipiv with block size of 4:
B,D       0
      *        *
      |   0    |
      |   0    |
  0   |   0    |
      |   0    |
      |   -    |
      |   0    |
      |   0    |
  1   |   0    |
      |   0    |
      |   -    |
      |   0    |
      |   0    |
  2   |   0    |
      |   0    |
      *        *
The following is the 3 × 1 process grid:
B,D  |    0
-----| -------
0    |   P00
-----| -------
1    |   P10
-----| -------
2    |   P20
Local array DL with block size of 4:
p,q  |  0
-----|------
     |  .
     | 0.5
 0   | 0.5
     | 0.5
-----|------
     | 1.0
     | 0.33
 1   | 0.43
     | 0.47
-----|------
     | 1.0
     | 1.0
 2   | 1.0
     | 1.0
Local array D with block size of 4:
p,q  |  0
-----|------
     | 0.5
     | 0.5
 0   | 0.5
     | 2.0
-----|------
     | 0.33
     | 0.43
 1   | 0.47
     | 2.07
-----|------
     | 2.07
     | 0.47
 2   | 0.43
     | 0.33
Local array DU with block size of 4:
p,q  |  0
-----|------
     | 2.0
     | 2.0
 0   | 2.0
     | 2.0
-----|------
     | 2.0
     | 2.0
 1   | 2.0
     | 2.0
-----|------
     | 0.93
     | 0.86
 2   | 0.67
     |  .
Local array IPIV with block size of 4:
p,q  |  0
-----|---
     | 0
     | 0
 0   | 0
     | 0
-----|---
     | 0
     | 0
 1   | 0
     | 0
-----|---
     | 0
     | 0
 2   | 0
     | 0
The value of info is 0 on all processes.
This example shows a factorization of the diagonally dominant general tridiagonal matrix A of order 12. Matrix A is stored in tridiagonal storage mode and distributed over a 3 × 1 process grid using block-cyclic distribution.
Matrix A and the input and/or output values for dl, d, du, desc_a, and info in this example are the same as shown for Example 1.
Notes:
ORDER = 'R'
NPROW = 3
NPCOL = 1
CALL BLACS_GET (0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)

              N    DL   D   DU  IA   DESC_A   AF  LAF   WORK   LWORK   INFO
              |    |    |   |    |     |      |    |     |       |      |
CALL PDDTTRF( 12 , DL , D , DU , 1 , DESC_A , AF , 44 , WORK ,   0   , INFO )
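For orientation, Gaussian elimination without pivoting on a tridiagonal matrix can be sketched serially in a few lines of C. This is not how PDDTTRF is implemented, and it does not reproduce the internal, process-count-dependent format of the factored output; it only illustrates the elimination step on one process. The function name tridiag_factor_nopivot is hypothetical. Storage follows the tridiagonal storage mode described above, with dl used in elements 2 through n and du in elements 1 through n-1 (0-based indices 1..n-1 and 0..n-2 in C).

```c
#include <assert.h>

/* Hypothetical serial sketch, not the Parallel ESSL algorithm.
 * Factors a tridiagonal matrix in place without pivoting:
 *   dl[1..n-1]  subdiagonal, overwritten by the multipliers
 *   d[0..n-1]   main diagonal, overwritten by the pivots
 *   du[0..n-2]  superdiagonal, left unchanged
 * Returns 0 on success, or k (1-based) if pivot k is exactly zero. */
int tridiag_factor_nopivot(int n, double *dl, double *d, const double *du)
{
    int i;

    assert(n >= 0);
    for (i = 1; i < n; ++i) {
        if (d[i - 1] == 0.0) {
            return i;                 /* cannot divide by a zero pivot */
        }
        dl[i] /= d[i - 1];            /* multiplier for row i          */
        d[i] -= dl[i] * du[i - 1];    /* eliminate the subdiagonal     */
    }
    if (n > 0 && d[n - 1] == 0.0) {
        return n;
    }
    return 0;
}
```

Applied to the order-12 matrix of these examples (d = 2.0 followed by eleven 3.0 values, every dl entry 1.0, every du entry 2.0), each multiplier comes out as 0.5 and each later pivot as 2.0.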