Parallel Engineering and Scientific Subroutine Library for AIX Version 2 Release 3: Guide and Reference

PDGTTRS and PDDTTRS--General Tridiagonal Matrix Solve

PDGTTRS solves the tridiagonal systems of linear equations, using Gaussian elimination with partial pivoting for the general tridiagonal matrix A stored in tridiagonal storage mode.

1. AX = B

PDDTTRS solves one of the following tridiagonal systems of linear equations, using Gaussian elimination for the diagonally dominant general tridiagonal matrix A stored in tridiagonal storage mode.

1. AX = B

2. A^TX = B

In these subroutines:

A represents the global square general tridiagonal submatrix A_{ia:ia+n-1,
ia:ia+n-1}.
B represents the global general submatrix B_{ib:ib+n-1,
1:nrhs} containing the right-hand sides in its columns.
X represents the global general submatrix B_{ib:ib+n-1,
1:nrhs} containing the output solution vectors in its columns.

These subroutines use the results of the factorization of matrix A, produced by a preceding call to PDGTTRF or PDDTTRF, respectively. The output from the factorization subroutines, PDGTTRF and PDDTTRF, should be used only as input to these solve subroutines, respectively.

If n = 0 or nrhs = 0, no computation is performed and the subroutine returns after doing some parameter checking. See reference [51].

Table 81. Data Types

dl, d, du, *du2*, B, `af`, `work`	*ipiv*	Subroutine
Long-precision real	Integer	PDGTTRS and PDDTTRS

Syntax

Fortran	CALL PDGTTRS `(transa`, `n`, `nrhs`, `dl`, `d`, `du`, `du2`, `ia`, `desc_a`, `ipiv`, `b`, `ib`, `desc_b`, `af`, `laf`, `work`, `lwork`, `info)` CALL PDDTTRS `(transa`, `n`, `nrhs`, `dl`, `d`, `du`, `ia`, `desc_a`, `b`, `ib`, `desc_b`, `af`, `laf`, `work`, `lwork`, `info)`
C and C++	pdgttrs `(transa`, `n`, `nrhs`, `dl`, `d`, `du`, `du2`, `ia`, `desc_a`, `ipiv`, `b`, `ib`, `desc_b`, `af`, `laf`, `work`, `lwork`, `info);` pddttrs `(transa`, `n`, `nrhs`, `dl`, `d`, `du`, `ia`, `desc_a`, `b`, `ib`, `desc_b`, `af`, `laf`, `work`, `lwork`, `info);`

On Entry

transa

indicates submatrix A is used in the computation, resulting in solution 1.

Scope: global

Specified as: a single character, where:

For PDGTTRS, it must be 'N'.
For PDDTTRS, it must be 'N', 'T', or 'C'.

n

is the order of the general tridiagonal submatrix A and the number of rows in the general submatrix B, which contains the multiple right-hand sides.

Scope: global

Specified as: a fullword integer, where:

If (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502, 0 <= n <= (MB_A)(p)-mod(ia-1,MB_A).
If (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501, 0 <= n <= (NB_A)(p)-mod(ia-1,NB_A).

where p is the number of processes in a process grid.

nrhs

is the number of right-hand sides; that is, the number of columns in submatrix B used in the computation.

Scope: global

Specified as: a fullword integer; nrhs >= 0.

dl

is the local part of the global vector dl, containing part of the factorization produced from a preceding call to PDGTTRF or PDDTTRF. This identifies the first element of the local array DL. These subroutines compute the location of the first element of the local subarray used, based on ia, desc_a, and p; therefore, the leading LOCp(ia+n-1) part of the local array DL contains the local pieces of the leading ia+n-1 part of the global vector.

Scope: local

Specified as: a one-dimensional array of (at least) length LOCp(ia+n-1), containing numbers of the data type indicated in Table 81. Details about block-cyclic data distribution of global matrix A are stored in desc_a.

d

is the local part of the global vector d, containing part of the factorization produced from a preceding call to PDGTTRF or PDDTTRF. This identifies the first element of the local array D. These subroutines compute the location of the first element of the local subarray used, based on ia, desc_a, and p; therefore, the leading LOCp(ia+n-1) part of the local array D contains the local pieces of the leading ia+n-1 part of the global vector.

Scope: local

du

is the local part of the global vector du, containing part of the factorization produced from a preceding call to PDGTTRF or PDDTTRF. This identifies the first element of the local array DU. These subroutines compute the location of the first element of the local subarray used, based on ia, desc_a, and p; therefore, the leading LOCp(ia+n-1) part of the local array DU contains the local pieces of the leading ia+n-1 part of the global vector.

Scope: local

du2

is the local part of the global vector du2, containing part of the factorization produced from a preceding call to PDGTTRF. This identifies the first element of the local array DU2. These subroutines compute the location of the first element of the local subarray used, based on ia, desc_a, and p; therefore, the leading LOCp(ia+n-1) part of the local array DU2 contains the local pieces of the leading ia+n-1 part of the global vector.

Scope: local

ia

is the row or column index of the global matrix A, identifying the first row or column of the submatrix A.

Scope: global

Specified as: a fullword integer, where:

If (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502, 1 <= ia <= M_A and ia+n-1 <= M_A
If (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501, 1 <= ia <= N_A and ia+n-1 <= N_A

desc_a

is the array descriptor for global matrix A. Because vectors are one-dimensional data structures, you may use a type-502, type-501, or type-1 array descriptor regardless of whether the process grid is p × 1 or 1 × p. For a type-502 array descriptor, the process grid is used as if it is a p × 1 process grid. For a type-501 array descriptor, the process grid is used as if it is a 1 × p process grid. For a type-1 array descriptor, the process grid is used as if it is either a p × 1 process grid or a 1 × p process grid. The following tables describe three types of array descriptors. For rules on using array descriptors, see Notes and Coding Rules.

Table 82. Type-502 Array Descriptor

`desc_a`	Name	Description	Limits	Scope
1	DTYPE_A	Descriptor Type	DTYPE_A=502 for `p` × 1 or 1 × `p` where `p` is the number of processes in a process grid.	Global
2	CTXT_A	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	M_A	Number of rows in the global matrix	If `n` = 0: M_A >= 0 Otherwise: M_A >= 1	Global
4	MB_A	Row block size	MB_A >= 1 and 0 <= `n` <= (MB_A)(`p`)-mod(`ia`-1,MB_A)	Global
5	RSRC_A	The process row over which the first row of the global matrix is distributed	0 <= RSRC_A < `p`	Global
6	--	Not used by these subroutines.	--	--
7	--	Reserved	--	--

Specified as: an array of (at least) length 7, containing fullword integers.

Table 83. Type-1 Array Descriptor (p × 1 Process Grid)

`desc_a`	Name	Description	Limits	Scope
1	DTYPE_A	Descriptor Type	DTYPE_A = 1 for `p` × 1 where `p` is the number of processes in a process grid.	Global
2	CTXT_A	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	M_A	Number of rows in the global matrix	If `n` = 0: M_A >= 0 Otherwise: M_A >= 1	Global
4	N_A	Number of columns in the global matrix	N_A = 1
5	MB_A	Row block size	MB_A >= 1 and 0 <= `n` <= (MB_A)(`p`)-mod(`ia`-1,MB_A)	Global
6	NB_A	Column block size	NB_A >= 1	Global
7	RSRC_A	The process row over which the first row of the global matrix is distributed	0 <= RSRC_A < `p`	Global
8	CSRC_A	The process column over which the first column of the global matrix is distributed	CSRC_A = 0	Global
9	--	Not used by these subroutines.	--	--

Specified as: an array of (at least) length 9, containing fullword integers.

Table 84. Type-501 Array Descriptor

`desc_a`	Name	Description	Limits	Scope
1	DTYPE_A	Descriptor Type	DTYPE_A=501 for 1 × `p` or `p` × 1 where `p` is the number of processes in a process grid.	Global
2	CTXT_A	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	N_A	Number of columns in the global matrix	If `n` = 0: N_A >= 0 Otherwise: N_A >= 1	Global
4	NB_A	Column block size	NB_A >= 1 and 0 <= `n` <= (NB_A)(`p`)-mod(`ia`-1,NB_A)	Global
5	CSRC_A	The process column over which the first column of the global matrix is distributed	0 <= CSRC_A < `p`	Global
6	--	Not used by these subroutines.	--	--
7	--	Reserved	--	--

Specified as: an array of (at least) length 7, containing fullword integers.

Table 85. Type-1 Array Descriptor (1 × p Process Grid)

`desc_a`	Name	Description	Limits	Scope
1	DTYPE_A	Descriptor type	DTYPE_A = 1 for 1 × `p` where `p` is the number of processes in a process grid.	Global
2	CTXT_A	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	M_A	Number of rows in the global matrix	M_A = 1	Global
4	N_A	Number of columns in the global matrix	If `n` = 0: N_A >= 0 Otherwise: N_A >= 1	Global
5	MB_A	Row block size	MB_A >= 1	Global
6	NB_A	Column block size	NB_A >= 1 and 0 <= `n` <= (NB_A)(`p`)-mod(`ia`-1,NB_A)	Global
7	RSRC_A	The process row over which the first row of the global matrix is distributed	RSRC_A = 0	Global
8	CSRC_A	The process column over which the first column of the global matrix is distributed	0 <= CSRC_A < `p`	Global
9	--	Not used by these subroutines.	--	--

Specified as: an array of (at least) length 9, containing fullword integers.

ipiv

is the local part of the global vector ipiv, containing the pivot indices produced on a preceding call to PDGTTRF. This identifies the first element of the local array IPIV. This subroutine computes the location of the first element of the local subarray used, based on ia, desc_a, and p; therefore, the leading LOCp(ia+n-1) part of the local array IPIV must contain the local pieces of the leading ia+n-1 part of the global vector.

Scope: local

Specified as: an array of (at least) LOCp(ia+n-1), containing fullword integers. There is no array descriptor for ipiv. The details about the block-cyclic data distribution of global matrix A are stored in desc_a.

b

is the local part of the global general matrix B, containing the multiple right-hand sides of the system. This identifies the first element of the local array B. This subroutine computes the location of the first element of the local subarray used, based on ib, desc_b, and p; therefore, the leading LOCp(ib+n-1) by nrhs part of the local array B must contain the local pieces of the leading ib+n-1 by nrhs part of the global matrix.

Scope: local

Specified as: an LLD_B by (at least) nrhs array, containing numbers of the data type indicated in Table 81. Details about the block-cyclic data distribution of global matrix B are stored in desc_b.

ib

is the row index of the global matrix B, identifying the first row of the submatrix B.

Scope: global

Specified as: a fullword integer; 1 <= ib <= M_B and ib+n-1 <= M_B.

desc_b

is the array descriptor for global matrix B, which may be type 502 or type 1, as described in the following tables. For type-502 array descriptor, the process grid is used as if it is a p × 1 process grid. For rules on using array descriptors, see Notes and Coding Rules.

`desc_b`	Name	Description	Limits	Scope
1	DTYPE_B	Descriptor type	DTYPE_B = 502 for `p` × 1 or 1 × `p` where `p` is the number of processes in a process grid.	Global
2	CTXT_B	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	M_B	Number of rows in the global matrix	If `n` = 0: M_B >= 0 Otherwise: M_B >= 1	Global
4	MB_B	Row block size	MB_B >= 1 and 0 <= `n` <= (MB_B)`p`-mod(`ib`-1,MB_B)	Global
5	RSRC_B	The process row over which the first row of the global matrix is distributed	0 <= RSRC_B < `p`	Global
6	LLD_B	Leading dimension	LLD_B >= max(1, LOCp(M_B))	Local
7	--	Reserved	--	--

Specified as: an array of (at least) length 7, containing fullword integers.

`desc_b`	Name	Description	Limits	Scope
1	DTYPE_B	Descriptor type	DTYPE_B = 1 for `p` × 1 where `p` is the number of processes in a process grid.	Global
2	CTXT_B	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	M_B	Number of rows in the global matrix	If `n` = 0: M_B >= 0 Otherwise: M_B >= 1	Global
4	N_B	Number of columns in the global matrix	N_B >= `nrhs`	Global
5	MB_B	Row block size	MB_B >= 1 and 0 <= `n` <= (MB_B)`p`-mod(`ib`-1,MB_B)	Global
6	NB_B	Column block size	NB_B >= 1	Global
7	RSRC_B	The process row over which the first row of the global matrix is distributed	0 <= RSRC_B < `p`	Global
8	CSRC_B	The process column over which the first column of the global matrix is distributed	CSRC_B = 0	Global
9	LLD_B	Leading dimension	LLD_B >= max(1, LOCp(M_B))	Local

Specified as: an array of (at least) length 9, containing fullword integers.

af

is a work area used by these subroutines and contains part of the factorization produced on a preceding call to PDGTTRF or PDDTTRF. Its size is specified by laf.

Scope: local

Specified as: a one-dimensional array of (at least) length laf, containing numbers of the data type indicated in Table 81.

laf

is the number of elements in array AF.

Scope: local

Specified as: a fullword integer, where:

If (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502:

For PDGTTRS, laf >= 12P+3(MB_A)
For PDDTTRS, laf >= 12P+2(MB_A).

where, in the above formulas, P is the actual number of processes containing data.

If (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501, you would substitute NB_A in place of MB_A in the formulas above.

Note:: In ScaLAPACK 1.5, PDDTTRS requires laf = 12P+3(NB_A). This value is greater than or equal to the value required by Parallel ESSL.

work

has the following meaning:

If lwork = 0, work is ignored.

If lwork <> 0, work is the work area used by this subroutine, where:

If lwork <> -1, the size of work is (at least) of length lwork.
If lwork = -1, the size of work is (at least) of length 1.

Scope: local

Specified as: an area of storage containing numbers of data type indicated in Table 81.

lwork

is the number of elements in array WORK.

Scope:

If lwork >= 0, lwork is local
If lwork = -1, lwork is global

Specified as: a fullword integer; where:

If lwork = 0, PDGTTRS and PDDTTRS dynamically allocate the work area used by the subroutine. The work area is deallocated before control is returned to the calling program. This option is an extension to the ScaLAPACK standard.
If lwork = -1, PDGTTRS and PDDTTRS perform a work area query and return the optimum size of work in work₁. No computation is performed and the subroutine returns after error checking is complete.
Otherwise, lwork must have the following value:
- For PDGTTRS, lwork >= 12P+5(nrhs).
- For PDDTTRS, lwork >= 10P+4(nrhs)
where, in the above formulas, P is the actual number of processes containing data.

info

See On Return.

On Return

b

b is the updated local part of the global matrix B, containing the solution vectors.

Scope: local

Returned as: an LLD_B by (at least) nrhs array, containing numbers of the data type indicated in Table 81.

work

is the work area used by this subroutine if lwork <> 0, where:

If lwork <> 0 and lwork <> -1, the size of work is (at least) of length lwork.

If lwork = -1, the size of work is (at least) of length 1.

Scope: local

Returned as: an area of storage, containing numbers of data type indicated in Table 81, where:

If lwork = -1, the work₁ is set to the optimum lwork value needed.
If lwork >= 1, the work₁ is set to the minimum lwork value needed.

Except for work₁, the contents of work are overwritten on return.

info

indicates that a successful computation or work area query occurred.

Scope: global

Returned as: a fullword integer; info = 0.

Notes and Coding Rules

In your C program, argument info must be passed by reference.
The subroutine accepts lowercase letters for the transa argument.
The output from the factorization subroutines should be used only as input to the solve subroutines PDGTTRS and PDDTTRS, respectively.
The factored matrix A is stored in an internal format that depends on the number of processes.
The format of the output from PDDTTRF has changed. Therefore, the factorization and solve must be performed using Parallel ESSL Version 2 Release 1.2, or later.
The scalar data specified for input argument n must be the same for both PDGTTRF/PDDTTRF and PDGTTRS/PDDTTRS.
The global vectors for dl, d, du, du2, ipiv, and af input to PDGTTRS/PDDTTRS must be the same as the corresponding output arguments for PDGTTRF/PDDTTRF; and thus, the scalar data specified for ia, desc_a, and laf must also be the same.
In all cases, follow these rules:
- ia = ib
- CTXT_A = CTXT_B
- If (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502, MB_A = MB_B.
- If (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501, NB_A = MB_B.
- If DTYPE_A=1, then:
  - For a p × 1 process grid (where p>1), N_A=1, NB_A>=1, and CSRC_A=0.
  - For a 1 × p process grid (where p>1), M_A=1, MB_A>=1, and RSRC_A=0.
  - For a 1 × 1 process grid:
    - If N_A=1, NB_A>=1 and CSRC_A=0.
    - If M_A=1, MB_A>=1 and RSRC_A=0.
- If DTYPE_B=1, N_B>=nrhs, NB_B>=1, and CSRC_B=0.
- Following are the consistent combinations of array descriptor types and process grids, where p is the number of processes in the process grid:
  
  DTYPE_A DTYPE_B Process Grid
  501 502 p × 1 or 1 × p
  502 502 p × 1 or 1 × p
  501 1 p × 1
  502 1 p × 1
  1 502 p × 1 or 1 × p
  1 1 p × 1
To determine the values of LOCp(n) used in the argument descriptions, see Determining the Number of Rows and Columns in Your Local Arrays for descriptor type-1 or Determining the Number of Rows or Columns in Your Local Arrays for descriptor type-501 and type-502.
dl, d, du, du2, ipiv, af and work must have no common elements; otherwise, results are unpredictable.
The global general tridiagonal matrix A must be stored in tridiagonal storage mode and distributed over a one-dimensional process grid, using block-cyclic data distribution. See the section on block-cyclically distributing a tridiagonal matrix in Matrices.
For more information on using block-cyclic data distribution, see Specifying Block-Cyclically-Distributed Matrices for the Banded Linear Algebraic Equations.
Matrix B must be distributed over a one-dimensional process grid, using block-cyclic data distribution. For more information using block-cyclic data distribution, see Specifying Block-Cyclically-Distributed Matrices for the Banded Linear Algebraic Equations. Also, see the section on distributing the right-hand side matrix in Matrices.
If lwork = -1 on any process, it must equal -1 on all processes. That is, if a subset of the processes specifies -1 for the work area size, they must all specify -1.
Although global matrices A and B may be block-cyclically distributed on a 1 × p or p × 1 process grid, the values of n, ia, ib, MB_A (if (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502), NB_A (if (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501), must be chosen so that each process has at most one full or partial block of each of the global submatrices A and B.
For global tridiagonal matrix A, use of the type-1 array descriptor is an extension to ScaLAPACK 1.5. If your application needs to run with both Parallel ESSL and ScaLAPACK 1.5, it is suggested that you use either a type-501 or a type-502 array descriptor for the matrix A.

DTYPE_A	DTYPE_B	Process Grid
501	502	`p` × 1 or 1 × `p`
502	502	`p` × 1 or 1 × `p`
501	1	`p` × 1
502	1	`p` × 1
1	502	`p` × 1 or 1 × `p`
1	1	`p` × 1

Error Conditions

Computational Errors

None

Note:: If the factorization performed by PDGTTRF or PDDTTRF failed because matrix A is singular or reducible, or is not diagonally dominant, respectively, the results returned by this subroutine are unpredictable. For details, see the info output argument for PDGTTRF or PDDTTRF.

Resource Errors

Unable to allocate workspace

Input-Argument and Miscellaneous Errors

Stage 1:

DTYPE_A is invalid.
DTYPE_B is invalid.

Stage 2:

CTXT_A is invalid.

Stage 3:

This subroutine was called from outside the process grid.

Stage 4:

Note:

In the following error conditions:

If M_A = 1 and DTYPE_A = 1, a 1 × 1 process grid is treated as a 1 × p process grid.
If N_A = 1 and DTYPE_A = 1, a 1 × 1 process grid is treated as a p × 1 process grid.

The process grid is not 1 × p or p × 1.
CTXT_A <> CTXT_B
transa <>
- 'N' for PDGTTRS
- 'N', 'T', or 'C' for PDDTTRS
n < 0
ia < 1
DTYPE_A = 1 and M_A <> 1 and N_A <> 1

If (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501:
N_A < 0 and (n = 0); N_A < 1 otherwise
NB_A < 1
n > (NB_A)(p)-mod(ia-1,NB_A)
ia > N_A and (n > 0)
ia+n-1 > N_A and (n > 0)
CSRC_A < 0 or CSRC_A >= p
NB_A <> MB_B
CSRC_A <> RSRC_B

If the process grid is 1 × p and DTYPE_A = 1:
M_A <> 1
MB_A < 1
RSRC_A <> 0

If (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502:
M_A < 0 and (n = 0); M_A < 1 otherwise
MB_A < 1
n > (MB_A)(p)-mod(ia-1,MB_A)
ia > MB_A and (n > 0)
ia+n-1 > M_A and (n > 0)
RSRC_A < 0 or RSRC_A >= p
MB_A <> MB_B
RSRC_A <> RSRC_B

If the process grid is p × 1 and DTYPE_A = 1:
N_A <> 1
NB_A < 1
CSRC_A <> 0

In all cases:
ia <> ib
DTYPE_B = 1 and the process grid is 1 × p and p > 1
nrhs < 0
ib < 1
M_B < 0 and (n = 0); M_B < 1 otherwise
MB_B < 1
ib > M_B and (n > 0)
ib+n-1 > M_B and (n > 0)
RSRC_B < 0 or RSRC_B >= p
LLD_B < max(1,LOCp(M_B))

If DTYPE_B = 1:
N_B < 0 and (nrhs = 0); N_B < 1 otherwise
N_B < nrhs
NB_B < 1
CSRC_B <> 0

In all cases:
laf < (minimum value) (For the minimum value, see the laf argument description.)
lwork <> 0, lwork <> -1, and lwork < (minimum value) (For the minimum value, see the lwork argument description.)

Stage 5:

Each of the following global input arguments are checked to determine whether its value is the same on all processes in the process grid:

n differs.
nrhs differs.
transa differs.
ia differs.
ib differs.
DTYPE_A differs.

If DTYPE_A = 1 on all processes:
M_A differs.
N_A differs.
MB_A differs.
NB_A differs.
RSRC_A differs.
CSRC_A differs.

If DTYPE_A = 501 on all processes:
N_A differs.
NB_A differs.
CSRC_A differs.

If DTYPE_A = 502 on all processes:
M_A differs.
MB_A differs.
RSRC_A differs.

In all cases:
DTYPE_B differs.

If DTYPE_B = 1 on all processes:
M_B differs.
N_B differs.
MB_B differs.
NB_B differs.
RSRC_B differs.
CSRC_B differs.

If DTYPE_B = 502 on all processes:
M_B differs.
MB_B differs.
RSRC_B differs.

Also:
lwork = -1 on a subset of processes.

Example 1

This example shows how to solve the system AX=B, where matrix A is the same general tridiagonal matrix factored in Example 1 for PDGTTRF.

Notes:

The vectors dl, d, and du, output from PDGTTRF, are stored in an internal format that depends on the number of processes. These vectors are passed, unchanged, to the solve subroutine PDGTTRS.
The contents of these du2 and af vectors, output from PDGTTRF, are not shown. These vectors are passed, unchanged, to the solve subroutine PDGTTRS.
Because lwork = 0, PDGTTRS dynamically allocates the work area used by this subroutine.

Call Statements and Input

 ORDER = 'R'
 NPROW = 3
 NPCOL = 1
 CALL BLACS_GET (0, 0, ICONTXT)
 CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
 CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
             TRANSA N  NRHS DL   D   DU   DU2  IA   DESC_A   IPIV   B  IB
               |    |   |   |    |    |    |    |      |      |     |   |
 CALL PDGTTRS( N , 12 , 3 , DL , D , DU , DU2 , 1 , DESC_A , IPIV , B , 1 ,
 
               DESC_B   AF   LAF  WORK  LWORK INFO
                   |    |     |     |     |    |
               DESC_B , AF , 48 , WORK , 0  , INFO )

	Desc_A
DTYPE_	502
CTXT_	`icontxt`^(CGBTOO2)
M_	12
MB_	4
RSRC_	0
Not used	--
Reserved	--
Notes: `icontxt` is the output of the BLACS_GRIDINIT call.

	Desc_B
DTYPE_	502
CTXT_	`icontxt`^(CGBTOO3)
M_	12
MB_	4
RSRC_	0
LLD_B	4
Reserved	--
Notes: `icontxt` is the output of the BLACS_GRIDINIT call.

Global vector dl with block size of 4:

B,D     0
     *      *
     |  .   |
     | 0.5  |
 0   | 0.5  |
     | 0.5  |
     | ---- |
     | 1.0  |
     | 0.33 |
 1   | 0.43 |
     | 0.47 |
     | ---- |
     | 1.0  |
     | 1.0  |
 2   | 1.0  |
     | 1.0  |
     *      *

Global vector d with block size of 4:

B,D     0
     *      *
     | 0.5  |
     | 0.5  |
 0   | 0.5  |
     | 2.0  |
     | ---- |
     | 0.33 |
     | 0.43 |
 1   | 0.47 |
     | 2.07 |
     | ---- |
     | 2.07 |
     | 0.47 |
 2   | 0.43 |
     | 0.33 |
     *      *

Global vector du with block size of 4:

B,D     0
     *      *
     | 2.0  |
     | 2.0  |
 0   | 2.0  |
     | 2.0  |
     | ---- |
     | 2.0  |
     | 2.0  |
 1   | 2.0  |
     | 2.0  |
     | ---- |
     | 0.93 |
     | 0.86 |
 2   | 0.67 |
     |  .   |
     *      *

Global vector ipiv with block size of 4:

B,D    0
     *   *
     | 0 |
     | 0 |
 0   | 0 |
     | 0 |
     | - |
     | 0 |
     | 0 |
 1   | 0 |
     | 0 |
     | - |
     | 0 |
     | 0 |
 2   | 0 |
     | 0 |
     *   *

The following is the 3 × 1 process grid:

B,D  |   0   
-----|------- 
0    |   P₀₀
-----|------- 
1    |   P₁₀
-----|------- 
2    |   P₂₀

Local array DL with block size of 4:

p,q  |  0
-----|------
     |  .
     | 0.5
 0   | 0.5
     | 0.5
-----|------
     | 1.0
     | 0.33
 1   | 0.43
     | 0.47
-----|------
     | 1.0
     | 1.0
 2   | 1.0
     | 1.0

Local array D with block size of 4:

p,q  |  0
-----|------
     | 0.5
     | 0.5
 0   | 0.5
     | 2.0
-----|------
     | 0.33
     | 0.43
 1   | 0.47
     | 2.07
-----|------
     | 2.07
     | 0.47
 2   | 0.43
     | 0.33

Local array DU with block size of 4:

p,q  |  0
-----|------
     | 2.0
     | 2.0
 0   | 2.0
     | 2.0
-----|------
     | 2.0
     | 2.0
 1   | 2.0
     | 2.0
-----|------
     | 0.93
     | 0.86
 2   | 0.67
     |  .

Local array IPIV with block size of 4:

p,q  | 0
-----|---
     | 0
     | 0
 0   | 0
     | 0
-----|---
     | 0
     | 0
 1   | 0
     | 0
-----|---
     | 0
     | 0
 2   | 0
     | 0

Global matrix B with block size of 4:

B,D          0
     *                *
     | 46.0  6.0  4.0 |
     | 65.0 13.0  6.0 |
 0   | 59.0 19.0  6.0 |
     | 53.0 25.0  6.0 |
     | -------------- |
     | 47.0 31.0  6.0 |
     | 41.0 37.0  6.0 |
 1   | 35.0 43.0  6.0 |
     | 29.0 49.0  6.0 |
     | -------------- |
     | 23.0 55.0  6.0 |
     | 17.0 61.0  6.0 |
 2   | 11.0 67.0  6.0 |
     |  5.0 47.0  4.0 |
     *                *

The following is the 3 × 1 process grid:

B,D  |   0   
-----|------- 
0    |   P₀₀
-----|------- 
1    |   P₁₀
-----|------- 
2    |   P₂₀

Local matrix B with block size of 4:

p,q  |       0
-----|----------------
     | 46.0  6.0  4.0
     | 65.0 13.0  6.0
 0   | 59.0 19.0  6.0
     | 53.0 25.0  6.0
-----|----------------
     | 47.0 31.0  6.0
     | 41.0 37.0  6.0
 1   | 35.0 43.0  6.0
     | 29.0 49.0  6.0
-----|----------------
     | 23.0 55.0  6.0
     | 17.0 61.0  6.0
 2   | 11.0 67.0  6.0
     |  5.0 47.0  4.0

Output:

Global matrix B with block size of 4:

B,D           0
     *                 *
     | 12.0  1.0  1.0  |
     | 11.0  2.0  1.0  |
 0   | 10.0  3.0  1.0  |
     |  9.0  4.0  1.0  |
     | --------------- |
     |  8.0  5.0  1.0  |
     |  7.0  6.0  1.0  |
 1   |  6.0  7.0  1.0  |
     |  5.0  8.0  1.0  |
     | --------------- |
     |  4.0   9.0  1.0 |
     |  3.0  10.0  1.0 |
 2   |  2.0  11.0  1.0 |
     |  1.0  12.0  1.0 |
     *                 *

The following is the 3 × 1 process grid:

B,D  |   0   
-----|------- 
0    |   P₀₀
-----|------- 
1    |   P₁₀
-----|------- 
2    |   P₂₀

Local matrix B with block size of 4:

p,q  |        0
-----|-----------------
     | 12.0  1.0  1.0
     | 11.0  2.0  1.0
 0   | 10.0  3.0  1.0
     |  9.0  4.0  1.0
-----|-----------------
     |  8.0  5.0  1.0
     |  7.0  6.0  1.0
 1   |  6.0  7.0  1.0
     |  5.0  8.0  1.0
-----|-----------------
     |  4.0   9.0  1.0
     |  3.0  10.0  1.0
 2   |  2.0  11.0  1.0
     |  1.0  12.0  1.0

The value of info is 0 on all processes.

Example 2

This example shows how to solve the system AX=B, where matrix A is the same diagonally dominant general tridiagonal matrix factored in Example 2 for PDDTTRF. The input and/or output values for dl, d, du, desc_a, and info in this example are the same as shown for Example 1.

Notes:

The vectors dl, d, and du, output from PDDTTRF, are stored in an internal format that depends on the number of processes. These vectors are passed, unchanged, to the solve subroutine PDDTTRS.
The contents of vector af, output from PDDTTRF, are not shown. This vector is passed, unchanged, to the solve subroutine PDDTTRS.
Because lwork = 0, PDDTTRS dynamically allocates the work area used by this subroutine.

Call Statements and Input

 ORDER = 'R'
 NPROW = 3
 NPCOL = 1
 CALL BLACS_GET (0, 0, ICONTXT)
 CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
 CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
             TRANSA N NRHS  DL   D   DU  IA   DESC_A   B  IB   DESC_B
               |    |   |   |    |    |   |     |      |   |     |
 CALL PDDTTRS( N , 12 , 3 , DL , D , DU , 1 , DESC_A , B , 1 , DESC_B ,
 
               AF   LAF  WORK  LWORK INFO
                |    |     |     |     |
               AF , 44 , WORK ,  0 , INFO )

[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]