Parallel Engineering and Scientific Subroutine Library for AIX Version 2 Release 3: Guide and Reference

PDPTTRF--Positive Definite Symmetric Tridiagonal Matrix Factorization

This subroutine factors the positive definite symmetric tridiagonal matrix A, stored in parallel-symmetric-tridiagonal storage mode, where, in this description, A represents the global positive definite symmetric tridiagonal submatrix

A_{ia:ia+n-1,
ia:ia+n-1}.

To solve a tridiagonal system of linear equations with multiple right-hand sides, follow the call to PDPTTRF with one or more calls to PDPTTRS. The output from this factorization subroutine should be used only as input to the solve subroutine PDPTTRS.

If n = 0, no computation is performed and the subroutine returns after doing some parameter checking. See reference [51].

Table 91. Data Types

d, e, `af`, `work`	Subroutine
Long-precision real	PDPTTRF

Syntax

Fortran	CALL PDPTTRF `(n`, `d`, `e`, `ia`, `desc_a`, `af`, `laf`, `work`, `lwork`, `info)`
C and C++	pdpttrf `(n`, `d`, `e`, `ia`, `desc_a`, `af`, `laf`, `work`, `lwork`, `info);`

On Entry

n

is the order of the positive definite symmetric tridiagonal matrix A.

Scope: global

Specified as: a fullword integer, where:

If (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502, 0 <= n <= (MB_A)(p)-mod(ia-1,MB_A).
If (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501, 0 <= n <= (NB_A)(p)-mod(ia-1,NB_A).

where p is the number of processes in a process grid.

d

is the local part of the global vector d. This identifies the first element of the local array D. This subroutine computes the location of the first element of the local subarray used, based on ia, desc_a, and p; therefore, the leading LOCp(ia+n-1) part of the local array D contains the local pieces of the leading ia+n-1 part of the global vector.

The global vector d contains the main diagonal of the global positive definite symmetric tridiagonal submatrix A in elements ia through ia+n-1.

Scope: local

Specified as: a one-dimensional array of (at least) length LOCp(ia+n-1). containing numbers of the data type indicated in Table 91. Details about block-cyclic data distribution of global matrix A are stored in desc_a.

On output, D is overwritten; that is, the original input is not preserved.

e

is the local part of the global vector e. This identifies the first element of the local array E. This subroutine computes the location of the first element of the local subarray used, based on ia, desc_a, and p; therefore, the leading LOCp(ia+n-1) part of the local array E contains the local pieces of the leading ia+n-1 part of the global vector.

The global vector e contains the off-diagonal of the global positive definite symmetric tridiagonal submatrix A in elements ia through ia+n-2.

Scope: local

Specified as: a one-dimensional array of (at least) length LOCp(ia+n-1), containing numbers of the data type indicated in Table 91. Details about block-cyclic data distribution of global matrix A are stored in desc_a.

On output, E is overwritten; that is, the original input is not preserved.

ia

is the row or column index of the global matrix A, identifying the first row or column of the submatrix A.

Scope: global

Specified as: a fullword integer; where:

If (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502, 1 <= ia <= M_A and ia+n-1 <= M_A.
If (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501, 1 <= ia <= N_A and ia+n-1 <= N_A.

desc_a

is the array descriptor for global matrix A. Because vectors are one-dimensional data structures, you may use a type-502, type-501, or type-1 array descriptor regardless of whether the process grid is p × 1 or 1 × p. For a type-502 array descriptor, the process grid is used as if it is a p × 1 process grid. For a type-501 array descriptor, the process grid is used as if it is a 1 × p process grid. For a type-1 array descriptor, the process grid is used as if it is either a p × 1 process grid or a 1 × p process grid. The following tables describe three types of array descriptors.

Table 92. Type-502 Array Descriptor

`desc_a`	Name	Description	Limits	Scope
1	DTYPE_A	Descriptor Type	DTYPE_A=502 for `p` × 1 or 1 × `p` where `p` is the number of processes in a process grid.	Global
2	CTXT_A	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	M_A	Number of rows in the global matrix	If `n` = 0: M_A >= 0 Otherwise: M_A >= 1	Global
4	MB_A	Row block size	MB_A >= 1 and 0 <= `n` <= (MB_A)(`p`)-mod(`ia`-1,MB_A)	Global
5	RSRC_A	The process row over which the first row of the global matrix is distributed	0 <= RSRC_A < `p`	Global
6	--	Not used by this subroutine.	--	--
7	--	Reserved	--	--

Specified as: an array of (at least) length 7, containing fullword integers.

Table 93. Type-1 Array Descriptor (p × 1 Process Grid)

`desc_a`	Name	Description	Limits	Scope
1	DTYPE_A	Descriptor Type	DTYPE_A = 1 for `p` × 1 where `p` is the number of processes in a process grid.	Global
2	CTXT_A	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	M_A	Number of rows in the global matrix	If `n` = 0: M_A >= 0 Otherwise: M_A >= 1	Global
4	N_A	Number of columns in the global matrix	N_A = 1
5	MB_A	Row block size	MB_A >= 1 and 0 <= `n` <= (MB_A)(`p`)-mod(`ia`-1,MB_A)	Global
6	NB_A	Column block size	NB_A >= 1	Global
7	RSRC_A	The process row over which the first row of the global matrix is distributed	0 <= RSRC_A < `p`	Global
8	CSRC_A	The process column over which the first column of the global matrix is distributed	CSRC_A = 0	Global
9	--	Not used by this subroutine.	--	--

Specified as: an array of (at least) length 9, containing fullword integers.

Table 94. Type-501 Array Descriptor

`desc_a`	Name	Description	Limits	Scope
1	DTYPE_A	Descriptor Type	DTYPE_A=501 for 1 × `p` or `p` × 1 where `p` is the number of processes in a process grid.	Global
2	CTXT_A	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	N_A	Number of columns in the global matrix	If `n` = 0: N_A >= 0 Otherwise: N_A >= 1	Global
4	NB_A	Column block size	NB_A >= 1 and 0 <= `n` <= (NB_A)(`p`)-mod(`ia`-1,NB_A)	Global
5	CSRC_A	The process column over which the first column of the global matrix is distributed	0 <= CSRC_A < `p`	Global
6	--	Not used by this subroutine.	--	--
7	--	Reserved	--	--

Specified as: an array of (at least) length 7, containing fullword integers.

Table 95. Type-1 Array Descriptor (1 × p Process Grid)

`desc_a`	Name	Description	Limits	Scope
1	DTYPE_A	Descriptor type	DTYPE_A = 1 for 1 × `p` where `p` is the number of processes in a process grid.	Global
2	CTXT_A	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	M_A	Number of rows in the global matrix	M_A = 1	Global
4	N_A	Number of columns in the global matrix	If `n` = 0: N_A >= 0 Otherwise: N_A >= 1	Global
5	MB_A	Row block size	MB_A >= 1	Global
6	NB_A	Column block size	NB_A >= 1 and 0 <= `n` <= (NB_A)(`p`)-mod(`ia`-1,NB_A)	Global
7	RSRC_A	The process row over which the first row of the global matrix is distributed	RSRC_A = 0	Global
8	CSRC_A	The process column over which the first column of the global matrix is distributed	0 <= CSRC_A < `p`	Global
9	--	Not used by this subroutine.	--	--

Specified as: an array of (at least) length 9, containing fullword integers.

af

See On Return.

laf

is the number of elements in array AF.

Scope: local

Specified as: a fullword integer, where:

If (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502, laf >= 12P+3(MB_A).
If (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501, laf >= 12P+3(NB_A).

where, in the formulas above, P is the actual number of processes containing data.

work

has the following meaning:

If lwork = 0, work is ignored.

If lwork <> 0, work is the work area used by this subroutine, where:

If lwork <> -1, the size of work is (at least) of length lwork.
If lwork = -1, the size of work is (at least) of length 1.

Scope: local

Specified as: an area of storage containing numbers of data type indicated in Table 91.

lwork

is the number of elements in array WORK.

Scope:

If lwork >= 0, lwork is local
If lwork = -1, lwork is global

Specified as: a fullword integer; where:

If lwork = 0, PDPTTRF dynamically allocates the work area used by the subroutine. The work area is deallocated before control is returned to the calling program. This option is an extension to the ScaLAPACK standard.
If lwork = -1, PDPTTRF performs a work area query and return the optimum size of work in work₁. No computation is performed and the subroutine returns after error checking is complete.
Otherwise, lwork >= 8P, where P is the actual number of processes containing data.

info

See On Return.

On Return

d

d is the updated local part of the global vector d, containing part of the factorization.

Scope: local

Returned as: a one-dimensional array of (at least) length LOCp(ia+n-1), containing numbers of the data type indicated in Table 91.

On output, D is overwritten; that is, the original input is not preserved.

e

e is the updated local part of the global vector e, containing part of the factorization.

Scope: local

Returned as: a one-dimensional array of (at least) length LOCp(ia+n-1), containing numbers of the data type indicated in Table 91.

On output, E is overwritten; that is, the original input is not preserved.

af

is a work area used by this subroutine and contains part of the factorization. Its size is specified by laf.

Scope: local

Returned as: a one-dimensional array of (at least) length laf, containing numbers of the data type indicated in Table 91.

work

is the work area used by this subroutine if lwork <> 0, where:

If lwork <> 0 and lwork <> -1, the size of work is (at least) of length lwork.

If lwork = -1, the size of work is (at least) of length 1.

Scope: local

Returned as: an area of storage, containing numbers of data type indicated in Table 91, where:

If lwork >= 1, the work₁ is set to the minimum lwork value needed.
If lwork = -1, the work₁ is set to the optimum lwork value needed.

Except for work₁, the contents of work are overwritten on return.

info

has the following meaning:

If info = 0, global submatrix A is positive definite, and the factorization completed successfully or the work area query completed successfully.

If 1 <= info <= p, the portion of global submatrix A stored on process info-1 and factored locally, is not positive definite. A pivot element whose value is less than or equal to a small positive number was detected.

If info > p, the portion of global submatrix A stored on process info-p-1 representing interactions with other processes, is not positive definite. A pivot element whose value is less than or equal to a small positive number was detected.

If info > 0, the factorization is completed; however, if you call PDPTTRS with these factors, the results of the computation are unpredictable.

Scope: global

Returned as: a fullword integer; info >= 0.

Notes and Coding Rules

In your C program, argument info must be passed by reference.
The output from these factorization subroutines should be used only as input to the solve subroutine PDPTTRS.
The factored matrix A is stored in an internal format that depends on the number of processes.
The scalar data specified for input argument n must be the same for both PDPTTRF and PDPTTRS.
The global vectors for d, e, and af input to PDPTTRS must be the same as the corresponding output arguments for PDPTTRF; and thus, the scalar data specified for ia, desc_a, and laf must also be the same.
In all cases, follow these rules:
- If DTYPE_A=1, then:
  - For a p × 1 process grid (where p>1), N_A=1, NB_A>=1, and CSRC_A=0.
  - For a 1 × p process grid (where p>1), M_A=1, MB_A>=1, and RSRC_A=0.
  - For a 1 × 1 process grid:
    - If N_A=1, NB_A>=1 and CSRC_A=0.
    - If M_A=1, MB_A>=1 and RSRC_A=0.
- Following are the consistent combinations of array descriptor types and process grids, where p is the number of processes in the process grid:
  
  DTYPE_A Process Grid
  501 p × 1 or 1 × p
  502 p × 1 or 1 × p
  1 p × 1 or 1 × p
To determine the values of LOCp(n) used in the argument descriptions, see Determining the Number of Rows and Columns in Your Local Arrays for descriptor type-1 or Determining the Number of Rows or Columns in Your Local Arrays for descriptor type-501 and type-502.
d, e, af, and work must have no common elements; otherwise, results are unpredictable.
The global symmetric tridiagonal matrix A must be positive definite. This subroutine uses the info argument to provide information about A, like ScaLAPACK. However, this subroutine also issues an error message, which differs from ScaLAPACK.
The global positive definite symmetric tridiagonal matrix A must be stored in parallel-symmetric-tridiagonal storage mode and distributed over a one-dimensional process grid, using block-cyclic data distribution. See the section on block-cyclically distributing a tridiagonal matrix in Matrices.
For more information on using block-cyclic data distribution, see Specifying Block-Cyclically-Distributed Matrices for the Banded Linear Algebraic Equations.
If lwork = -1 on any process, it must equal -1 on all processes. That is, if a subset of the processes specifies -1 for the work area size, they must all specify -1.
Although global matrix A may be block-cyclically distributed on a 1 × p or p × 1 process grid, the values of n, ia, MB_A (if (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502), NB_A (if (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501), must be chosen so that each process has at most one full or partial block of global submatrix A.
For global tridiagonal matrix A, use of the type-1 array descriptor is an extension to ScaLAPACK 1.5. If your application needs to run with both Parallel ESSL and ScaLAPACK 1.5, it is suggested that you use either a type-501 or a type-502 array descriptor for the matrix A.

DTYPE_A	Process Grid
501	`p` × 1 or 1 × `p`
502	`p` × 1 or 1 × `p`
1	`p` × 1 or 1 × `p`

Error Conditions

Computational Errors

Matrix A is not positive definite. For details, see the description of the info argument.

Resource Errors

Unable to allocate workspace

Input-Argument and Miscellaneous Errors

Stage 1:

DTYPE_A is invalid.

Stage 2:

CTXT_A is invalid.

Stage 3:

This subroutine was called from outside the process grid.

Stage 4:

Note:

In the following error conditions:

If M_A = 1 and DTYPE_A = 1, a 1 × 1 process grid is treated as a 1 × p process grid.
If N_A = 1 and DTYPE_A = 1, a 1 × 1 process grid is treated as a p × 1 process grid.

The process grid is not 1 × p or p × 1.
n < 0
ia < 1
DTYPE_A = 1 and M_A <>1 and N_A <> 1

If (the process grid is 1 × p and DTYPE_A = 1) or DTYPE_A = 501:
N_A < 0 and (n = 0); N_A < 1 otherwise
NB_A < 1
n > (NB_A)(p)-mod(ia-1,NB_A)
ia > N_A and (n > 0)
ia+n-1 > N_A and (n > 0)
CSRC_A < 0 or CSRC_A >= p

If the process grid is 1 × p and DTYPE_A = 1:
M_A <> 1
MB_A < 1
RSRC_A <> 0

If (the process grid is p × 1 and DTYPE_A = 1) or DTYPE_A = 502:
M_A < 0 and (n = 0); M_A < 1 otherwise
MB_A < 1
n > (MB_A)(p)-mod(ia-1,MB_A)
ia > M_A and (n > 0)
ia+n-1 > M_A and (n > 0)
RSRC_A < 0 or RSRC_A >= p

If the process grid is p × 1 and DTYPE_A = 1:
N_A <> 1
NB_A < 1
CSRC_A <> 0

In all cases:
laf < (minimum value) (For the minimum value, see the laf argument description.)
lwork <> 0, lwork <> -1, and lwork < (minimum value) (For the minimum value, see the lwork argument description.)

Stage 5:

Each of the following global input arguments are checked to determine whether its value is the same on all processes in the process grid:

n differs.
ia differs.
DTYPE_A differs.

If DTYPE_A = 1 on all processes:
M_A differs.
N_A differs.
MB_A differs.
NB_A differs.
RSRC_A differs.
CSRC_A differs.

If DTYPE_A = 501 on all processes:
N_A differs.
NB_A differs.
CSRC_A differs.

If DTYPE_A = 502 on all processes:
M_A differs.
MB_A differs.
RSRC_A differs.

Also:
lwork = -1 on a subset of processes.

Example

This example shows a factorization of the positive definite symmetric tridiagonal matrix A of order 12.

      *                                                            *
      | 4.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0 |
      *                                                            *

Matrix A is stored in parallel-symmetric-tridiagonal storage mode and is distributed over a 3 × 1 process grid using block-cyclic distribution.

Notes:

The vectors d and e, output from PDPTTRF, are stored in an internal format that depends on the number of processes. These vectors are passed, unchanged, to the solve subroutine PDPTTRS.
The contents of the af vector, output from PDPTTRF, is not shown. This vector is passed, unchanged, to the solve subroutine PDPTTRS.
Because lwork = 0, this subroutine dynamically allocates the work area used by this subroutine.

Call Statements and Input

ORDER = 'R'
NPROW = 3
NPCOL = 1
CALL BLACS_GET (0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
              N    D   E   IA   DESC_A   AF   LAF  WORK  LWORK INFO
              |    |   |    |      |     |     |     |     |    |
CALL PDPTTRF( 12 , D , E  , 1 , DESC_A , AF , 48 , WORK ,  0 , INFO )

	Desc_A
DTYPE_	502
CTXT_	`icontxt`^(CGBTOO4)
M_	12
MB_	4
RSRC_	0
Not used	--
Reserved	--
Notes: `icontxt` is the output of the BLACS_GRIDINIT call.

Global vector d with block size of 4:

B,D     0
     *     *
     | 4.0 |
     | 5.0 |
 0   | 5.0 |
     | 5.0 |
     | --- |
     | 5.0 |
     | 5.0 |
 1   | 5.0 |
     | 5.0 |
     | --- |
     | 5.0 |
     | 5.0 |
 2   | 5.0 |
     | 5.0 |
     *     *

Global vector e with block size of 4:

B,D     0
     *     *
     | 2.0 |
     | 2.0 |
 0   | 2.0 |
     | 2.0 |
     | --- |
     | 2.0 |
     | 2.0 |
 1   | 2.0 |
     | 2.0 |
     | --- |
     | 2.0 |
     | 2.0 |
 2   | 2.0 |
     |  .  |
     *     *

The following is the 3 × 1 process grid:

B,D  |   0   
-----|------- 
0    |   P₀₀
-----|------- 
1    |   P₁₀
-----|------- 
2    |   P₂₀

Local array D with block size of 4:

p,q  |  0
-----|-----
     | 4.0
     | 5.0
 0   | 5.0
     | 5.0
-----|-----
     | 5.0
     | 5.0
 1   | 5.0
     | 5.0
-----|-----
     | 5.0
     | 5.0
 2   | 5.0
     | 5.0

Local array E with block size of 4:

p,q  |  0
-----|-----
     | 2.0
     | 2.0
 0   | 2.0
     | 2.0
-----|-----
     | 2.0
     | 2.0
 1   | 2.0
     | 2.0
-----|-----
     | 2.0
     | 2.0
 2   | 2.0
     |  .

Output:

Global vector d with block size of 4:

B,D     0
     *      *
     |  .25 |
     |  .25 |
 0   |  .25 |
     | 4.0  |
     | ---- |
     |  .2  |
     |  .24 |
 1   |  .25 |
     | 4.01 |
     | ---- |
     | 4.01 |
     |  .25 |
 2   |  .24 |
     |  .2  |
     *      *

Global vector e with block size of 4:

B,D     0
     *      *
     | 2.0  |
     | 2.0  |
 0   | 2.0  |
     | 2.0  |
     | ---- |
     | 2.0  |
     | 2.0  |
 1   | 2.0  |
     | 2.0  |
     | ---- |
     |  .49 |
     |  .48 |
 2   |  .4  |
     |  .   |
     *      *

The following is the 3 × 1 process grid:

B,D  |   0   
-----|------- 
0    |   P₀₀
-----|------- 
1    |   P₁₀
-----|------- 
2    |   P₂₀

Local array D with block size of 4:

p,q  |  0
-----|------
     |  .25
     |  .25
 0   |  .25
     | 4.0
-----|------
     |  .2
     |  .24
 1   |  .25
     | 4.01
-----|------
     | 4.01
     |  .25
 2   |  .24
     |  .2

Local array E with block size of 4:

p,q  |  0
-----|------
     | 2.0
     | 2.0
 0   | 2.0
     | 2.0
-----|------
     | 2.0
     | 2.0
 1   | 2.0
     | 2.0
-----|------
     |  .49
     |  .48
 2   |  .4
     |  .

The value of info is 0 on all processes.

[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]