Parallel Engineering and Scientific Subroutine Library for AIX Version 2 Release 3: Guide and Reference

PDGELS and PZGELS--General Matrix Least Squares Solution

PDGELS solves overdetermined or underdetermined real linear systems involving a real general matrix A or its transpose, using a QR or LQ factorization. It is assumed that A has full rank.

PZGELS solves overdetermined or underdetermined complex linear systems involving a complex general matrix A or its conjugate transpose, using a QR or LQ factorization. It is assumed that A has full rank.

The following options are provided:

If transa = 'N' and m >= n: find the least squares solution of an overdetermined system; that is, solve the least squares problem: minimize ||B - AX||
If transa = 'N' and m < n: find the minimum norm solution of an underdetermined system; that is, the problem is: AX = B
For PDGELS:
- If transa = 'T' and m >= n: find the minimum norm solution of an underdetermined system; that is, the problem is A^TX = B
- If transa = 'T' and m < n: find the least squares solution of an overdetermined system; that is, solve the least squares problem: minimize ||B - A^TX||
For PZGELS:
- If transa = 'C' and m >= n: find the minimum norm solution of an underdetermined system; that is, the problem is A^HX = B
- If transa = 'C' and m < n: find the least squares solution of an overdetermined system; that is, solve the least squares problem: minimize ||B - A^HX||

In the formulas above:

A represents the global general submatrix A_{ia:ia+m-1, ja:ja+n-1}

If transa = 'N':

B represents the global general submatrix B_{ib:ib+m-1, jb:jb+nrhs-1} containing the right-hand sides in its columns.
X represents the global general submatrix B_{ib:ib+n-1, jb:jb+nrhs-1} containing the solution vectors in its columns.

If transa <> 'N':

B represents the global general submatrix B_{ib:ib+n-1, jb:jb+nrhs-1} containing the right-hand sides in its columns.
X represents the global general submatrix B_{ib:ib+m-1, jb:jb+nrhs-1} containing the solution vectors in its columns.

Note: No data should be moved to form A^T or A^H; that is, the matrix A should always be stored in its untransposed form.

If (m = 0 and n = 0) or nrhs = 0, then the subroutine returns after doing some parameter checking.

See references [13] and [37].

Table 64. Data Types

A, B, `work`	Subroutine
Long-precision real	PDGELS
Long-precision complex	PZGELS

Syntax

Fortran	CALL PDGELS\|PZGELS (`transa`, `m`, `n`, `nrhs`, `a`, `ia`, `ja`, `desc_a`, `b`, `ib`, `jb`, `desc_b`, `work`, `lwork`, `info`)
C and C++	pdgels\|pzgels (`transa`, `m`, `n`, `nrhs`, `a`, `ia`, `ja`, `desc_a`, `b`, `ib`, `jb`, `desc_b`, `work`, `lwork`, `info`);

On Entry

transa

indicates the form of matrix A used in the system of equations, where:

If transa = 'N', matrix A is used.

If transa = 'T', matrix A^T is used.

If transa = 'C', matrix A^H is used.

Scope: global

Specified as: a single character; transa = 'N', 'T', or 'C'.

m

is the number of rows in submatrix A used in the computation.

Scope: global

Specified as: a fullword integer; m >= 0.

n

is the number of columns in submatrix A used in the computation.

Scope: global

Specified as: a fullword integer; n >= 0.

nrhs

is the number of right-hand sides; that is, the number of columns in the submatrices used in the computation.

Scope: global

Specified as: a fullword integer; nrhs >= 0.

a

is the local part of the global general matrix A. This identifies the first element of the local array A. This subroutine computes the location of the first element of the local subarray used, based on ia, ja, desc_a, p, q, myrow, and mycol; therefore, the leading LOCp(ia+m-1) by LOCq(ja+n-1) part of the local array A must contain the local pieces of the leading ia+m-1 by ja+n-1 part of the global matrix.

Note: No data should be moved to form A^T or A^H; that is, the matrix A should always be stored in its untransposed form.

Scope: local

Specified as: an LLD_A by (at least) LOCq(N_A) array, containing numbers of the data type indicated in Table 64. Details about the block-cyclic data distribution of global matrix A are stored in desc_a.

ia

is the row index of the global matrix A, identifying the first row of the submatrix A.

Scope: global

Specified as: a fullword integer; 1 <= ia <= M_A and ia+m-1 <= M_A.

ja

is the column index of the global matrix A, identifying the first column of the submatrix A.

Scope: global

Specified as: a fullword integer; 1 <= ja <= N_A and ja+n-1 <= N_A.

desc_a

is the array descriptor for global matrix A, described in the following table:

`desc_a`	Name	Description	Limits	Scope
1	DTYPE_A	Descriptor type	DTYPE_A=1	Global
2	CTXT_A	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	M_A	Number of rows in the global matrix	If `m` = 0 or `n` = 0: M_A >= 0 Otherwise: M_A >= 1	Global
4	N_A	Number of columns in the global matrix	If `m` = 0 or `n` = 0: N_A >= 0 Otherwise: N_A >= 1	Global
5	MB_A	Row block size	MB_A >= 1	Global
6	NB_A	Column block size	NB_A >= 1	Global
7	RSRC_A	The process row of the `p` × `q` grid over which the first row of the global matrix is distributed	0 <= RSRC_A < `p`	Global
8	CSRC_A	The process column of the `p` × `q` grid over which the first column of the global matrix is distributed	0 <= CSRC_A < `q`	Global
9	LLD_A	The leading dimension of the local array	LLD_A >= max(1,LOCp(M_A))	Local

Specified as: an array of (at least) length 9, containing fullword integers.
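The LLD_A limit in the table depends on LOCp(M_A), which the NUMROC utility computes. As an illustration only, the following Python sketch transcribes the usual ScaLAPACK NUMROC counting rule and uses it to fill a nine-element descriptor in the order shown in the table. The helper name make_desc and the context value 0 are hypothetical; a real program would call the library's NUMROC and pass the context returned by the BLACS.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Rows or columns of a block-cyclically distributed dimension of size n
    (block size nb) that are stored on process iproc."""
    mydist = (nprocs + iproc - isrcproc) % nprocs  # distance from the source process
    nblocks = n // nb                               # number of whole blocks
    count = (nblocks // nprocs) * nb                # whole rounds shared by everyone
    extrablks = nblocks % nprocs                    # whole blocks left after the rounds
    if mydist < extrablks:
        count += nb                                 # this process gets one more whole block
    elif mydist == extrablks:
        count += n % nb                             # this process gets the trailing partial block
    return count


def make_desc(ctxt, m, n, mb, nb, rsrc, csrc, myrow, nprow):
    """Hypothetical helper: build a type-1 descriptor with the smallest valid LLD."""
    lld = max(1, numroc(m, mb, myrow, rsrc, nprow))
    return [1, ctxt, m, n, mb, nb, rsrc, csrc, lld]


# A 3 x 4 global matrix with 1 x 1 blocks on a 2 x 2 grid (context value 0 is a placeholder).
desc_row0 = make_desc(0, 3, 4, 1, 1, 0, 0, myrow=0, nprow=2)
desc_row1 = make_desc(0, 3, 4, 1, 1, 0, 0, myrow=1, nprow=2)
print(desc_row0[8], desc_row1[8])
```

Process row 0 owns global rows 1 and 3 and process row 1 owns global row 2, so the two descriptors differ only in their last element.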

b

is the local part of the global general matrix B, containing the right-hand sides of the system. This identifies the first element of the local array B. This subroutine computes the location of the first element of the local subarray used, based on ib, jb, desc_b, p, q, myrow, and mycol.

If transa = 'N', the leading LOCp(ib+m-1) by LOCq(jb+nrhs-1) part of the local array B must contain the local pieces of the leading ib+m-1 by jb+nrhs-1 part of the global matrix; otherwise, the leading LOCp(ib+n-1) by LOCq(jb+nrhs-1) part of the local array B must contain the local pieces of the leading ib+n-1 by jb+nrhs-1 part of the global matrix.

Scope: local

Specified as: an LLD_B by (at least) LOCq(N_B) array, containing numbers of the data type indicated in Table 64. Details about the square block-cyclic data distribution of global matrix B are stored in desc_b.

ib

is the row index of the global matrix B, identifying the first row of the submatrix B.

Scope: global

Specified as: a fullword integer; 1 <= ib <= M_B and ib + max(m, n)-1 <= M_B.

jb

is the column index of the global matrix B, identifying the first column of the submatrix B.

Scope: global

Specified as: a fullword integer; 1 <= jb <= N_B and jb+nrhs-1 <= N_B.

desc_b

is the array descriptor for global matrix B, described in the following table:

`desc_b`	Name	Description	Limits	Scope
1	DTYPE_B	Descriptor type	DTYPE_B=1	Global
2	CTXT_B	BLACS context	Valid value, as returned by BLACS_GRIDINIT or BLACS_GRIDMAP	Global
3	M_B	Number of rows in the global matrix	If (`m` = 0 and `n` = 0) or (`nrhs` = 0): M_B >= 0 Otherwise: M_B >= 1	Global
4	N_B	Number of columns in the global matrix	If (`m` = 0 and `n` = 0) or (`nrhs` = 0): N_B >= 0 Otherwise: N_B >= 1	Global
5	MB_B	Row block size	MB_B >= 1	Global
6	NB_B	Column block size	NB_B >= 1	Global
7	RSRC_B	The process row of the `p` × `q` grid over which the first row of the global matrix is distributed	0 <= RSRC_B < `p`	Global
8	CSRC_B	The process column of the `p` × `q` grid over which the first column of the global matrix is distributed	0 <= CSRC_B < `q`	Global
9	LLD_B	The leading dimension of the local array	LLD_B >= max(1,LOCp(M_B))	Local

Specified as: an array of (at least) length 9, containing fullword integers.

work

has the following meaning:

If lwork = 0, work is ignored.

If lwork <> 0, work is the work area used by this subroutine, where:

If lwork <> -1, its size is (at least) of length lwork.
If lwork = -1, its size is (at least) of length 1.

Scope: local

Specified as: an area of storage containing numbers of data type indicated in Table 64.

lwork

is the number of elements in array WORK.

Scope:

If lwork >= 0, lwork is local
If lwork = -1, lwork is global

Specified as: a fullword integer; where:

If lwork = 0, PDGELS and PZGELS dynamically allocate the work area used by this subroutine. The work area is deallocated before control is returned to the calling program. This option is an extension to the ScaLAPACK standard.
If lwork = -1, PDGELS and PZGELS perform a work area query and return the minimum size of work in work₁. No computation is performed and the subroutine returns after error checking is complete.
Otherwise, it must have the following value:
lwork >= ltau + max(lwf, lws)
where:

If m >= n, then:

ltau = NUMROC(ja + min(m, n)-1, NB_A, mycol, CSRC_A, npcol)
lwf = NB_A (mpa0 + nqa0 + NB_A)
lws = max((NB_A (NB_A-1)) / 2, (nrhsqb0 + mpb0) NB_A) + (NB_A)(NB_A)

If m < n, then:

ltau = NUMROC(ia + min(m, n)-1, MB_A, myrow, RSRC_A, nprow)
lwf = MB_A (mpa0 + nqa0 + MB_A)
lws = max((MB_A (MB_A-1)) / 2, (npb0 + max(nqa0 + NUMROC( NUMROC(n + iroffb, MB_A, 0, 0, nprow), MB_A, 0, 0, lcmp), nrhsqb0)) MB_A) + (MB_A)(MB_A)

where:

lcm = ilcm(nprow, npcol)
lcmp = lcm/nprow
iroffa = mod(ia-1, MB_A)
icoffa = mod(ja-1, NB_A)
iarow = mod(RSRC_A + (ia-1)/MB_A, nprow)
iacol = mod(CSRC_A + (ja-1)/NB_A, npcol)
mpa0 = NUMROC(m+iroffa, MB_A, myrow, iarow, nprow)
nqa0 = NUMROC(n+icoffa, NB_A, mycol, iacol, npcol)
iroffb = mod(ib-1, MB_B)
icoffb = mod(jb-1, NB_B)
ibrow = mod(RSRC_B + (ib-1)/MB_B, nprow)
ibcol = mod(CSRC_B + (jb-1)/NB_B, npcol)
mpb0 = NUMROC(m+iroffb, MB_B, myrow, ibrow, nprow)
npb0 = NUMROC(n+iroffb, MB_B, myrow, ibrow, nprow)
nrhsqb0 = NUMROC(nrhs+icoffb, NB_B, mycol, ibcol, npcol)
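The lower bound on lwork can be evaluated before the call. The following Python sketch is a direct transcription of the formulas above, under the assumption that every division is an integer division and that ilcm is the least common multiple. The function names and the descriptor-as-list convention are illustrative only; in practice the lwork = -1 query is the authoritative source for the minimum size.

```python
from math import gcd


def numroc(n, nb, iproc, isrcproc, nprocs):
    """Usual ScaLAPACK NUMROC counting rule (illustrative transcription)."""
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb
    count = (nblocks // nprocs) * nb
    extrablks = nblocks % nprocs
    if mydist < extrablks:
        count += nb
    elif mydist == extrablks:
        count += n % nb
    return count


def min_lwork(m, n, nrhs, ia, ja, desc_a, ib, jb, desc_b,
              myrow, mycol, nprow, npcol):
    """Evaluate ltau + max(lwf, lws) for one process (hypothetical helper)."""
    mb_a, nb_a, rsrc_a, csrc_a = desc_a[4:8]
    mb_b, nb_b, rsrc_b, csrc_b = desc_b[4:8]
    lcm = nprow * npcol // gcd(nprow, npcol)      # ilcm(nprow, npcol)
    lcmp = lcm // nprow
    iroffa = (ia - 1) % mb_a
    icoffa = (ja - 1) % nb_a
    iarow = (rsrc_a + (ia - 1) // mb_a) % nprow
    iacol = (csrc_a + (ja - 1) // nb_a) % npcol
    mpa0 = numroc(m + iroffa, mb_a, myrow, iarow, nprow)
    nqa0 = numroc(n + icoffa, nb_a, mycol, iacol, npcol)
    iroffb = (ib - 1) % mb_b
    icoffb = (jb - 1) % nb_b
    ibrow = (rsrc_b + (ib - 1) // mb_b) % nprow
    ibcol = (csrc_b + (jb - 1) // nb_b) % npcol
    mpb0 = numroc(m + iroffb, mb_b, myrow, ibrow, nprow)
    npb0 = numroc(n + iroffb, mb_b, myrow, ibrow, nprow)
    nrhsqb0 = numroc(nrhs + icoffb, nb_b, mycol, ibcol, npcol)
    if m >= n:
        ltau = numroc(ja + min(m, n) - 1, nb_a, mycol, csrc_a, npcol)
        lwf = nb_a * (mpa0 + nqa0 + nb_a)
        lws = max(nb_a * (nb_a - 1) // 2, (nrhsqb0 + mpb0) * nb_a) + nb_a * nb_a
    else:
        ltau = numroc(ia + min(m, n) - 1, mb_a, myrow, rsrc_a, nprow)
        lwf = mb_a * (mpa0 + nqa0 + mb_a)
        inner = numroc(numroc(n + iroffb, mb_a, 0, 0, nprow), mb_a, 0, 0, lcmp)
        lws = (max(mb_a * (mb_a - 1) // 2,
                   (npb0 + max(nqa0 + inner, nrhsqb0)) * mb_a) + mb_a * mb_a)
    return ltau + max(lwf, lws)


# m = 4, n = 3, nrhs = 5 on a 2 x 2 grid; A uses 1 x 1 blocks, B uses 1 x 2 blocks.
desc_a = [1, 0, 4, 3, 1, 1, 0, 0, 2]
desc_b = [1, 0, 4, 5, 1, 2, 0, 0, 2]
bound_p00 = min_lwork(4, 3, 5, 1, 1, desc_a, 1, 1, desc_b, 0, 0, 2, 2)
bound_p01 = min_lwork(4, 3, 5, 1, 1, desc_a, 1, 1, desc_b, 0, 1, 2, 2)
print(bound_p00, bound_p01)
```

The bound differs from process to process because NUMROC depends on myrow and mycol, which is consistent with lwork having local scope when lwork >= 0.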

info

See On Return.

On Return

a

is the updated local part of the global general matrix A. Matrix A is overwritten; the original input is not preserved.

Scope: local

Returned as: an LLD_A by (at least) LOCq(N_A) array, containing numbers of the data type indicated in Table 64. Details about the block-cyclic data distribution of global matrix A are stored in desc_a.

b

is the updated local part of the global general matrix B, overwritten by the solution vectors stored columnwise.

If transa = 'N' and m >= n, rows ib:ib+n-1 contain the least squares solution vectors. The residual sum of squares for each column is given by the sum of squares of elements ib+n:ib+m-1 in that column.
If transa = 'N' and m < n, rows ib:ib+n-1 contain the minimum norm solution vectors.
If transa <> 'N' and m >= n, rows ib:ib+m-1 contain the minimum norm solution vectors.
If transa <> 'N' and m < n, rows ib:ib+m-1 contain the least squares solution vectors. The residual sum of squares for each column is given by the sum of squares of elements ib+m:ib+n-1 in that column.
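As a small illustration of the residual rule for transa = 'N' and m >= n, the following Python sketch sums the squares of the rows below the solution in each column. It assumes the submatrix B has already been gathered into one global array with ib = jb = 1 (stored here as a list of rows), which a distributed program would not normally do. The function name is hypothetical, and the sample values are the Example 1 output from this section. abs() is used so that the helper also accepts complex entries; for real data it changes nothing.

```python
def residual_sums_of_squares(b_out, m, n, nrhs):
    """Per-column sum of squares of rows n..m-1 (0-based) of the returned B."""
    return [sum(abs(b_out[i][j]) ** 2 for i in range(n, m)) for j in range(nrhs)]


# Returned B for m = 4, n = 3, nrhs = 5: rows 0..2 hold X, row 3 holds the residual part.
b_out = [
    [-0.40, 1.00, 0.80, 0.20, 1.00],
    [0.00, 1.00, 1.50, 0.00, 1.50],
    [1.00, 1.00, 4.00, 1.00, 3.00],
    [-1.00, 0.00, 2.00, -2.00, 0.00],
]
rss = residual_sums_of_squares(b_out, 4, 3, 5)
print(rss)
```

Columns with a zero entry in the last row correspond to right-hand sides that the system fits exactly.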

Scope: local

Returned as: an LLD_B by (at least) LOCq(N_B) array, containing numbers of the data type indicated in Table 64. Details about the block-cyclic data distribution of global matrix B are stored in desc_b.

work

is the work area used by this subroutine if lwork <> 0.

Scope: local

Returned as: an area of storage, where:

If lwork >= 1 or lwork = -1, then work₁ is set to the minimum lwork value and contains numbers of the data type indicated in Table 64. Except for work₁, the contents of work are overwritten on return.

info

indicates that a successful computation occurred.

Scope: global

Returned as: a fullword integer; info = 0.

Notes and Coding Rules

This subroutine accepts lowercase letters for the transa argument.
In your C program, argument info must be passed by reference.
Matrices A, B, and work must have no common elements; otherwise, results are unpredictable.
The NUMROC utility subroutine can be used to determine the values of LOCp(M_) and LOCq(N_) used in the argument descriptions above. For details, see Determining the Number of Rows and Columns in Your Local Arrays and NUMROC--Compute the Number of Rows or Columns of a Block-Cyclically Distributed Matrix Contained in a Process.
For suggested block sizes, see Coding Tips for Optimizing Parallel Performance.
The following values must be equal: CTXT_A = CTXT_B
If m >= n:
- The following block sizes must be equal: MB_A = MB_B
- The row block offset of A must be equal to the row block offset of B; that is, mod(ia-1, MB_A) = mod(ib-1, MB_B)
- In the process grid, the process row containing the first row of the submatrix A must also contain the first row of the submatrix B; that is, iarow = ibrow, where:
  
  iarow = mod(RSRC_A+(ia-1)/MB_A, p)
  ibrow = mod(RSRC_B+(ib-1)/MB_B, p)
If m < n:
- The following block sizes must be equal: NB_A = MB_B
- The column block offset of A must be equal to the row block offset of B; that is, mod(ja-1, NB_A) = mod(ib-1, MB_B)
If m < n and m <> 0, n <> 0, and nrhs <> 0:
- In the process grid, the process row containing the first row of the submatrix A must also contain the first row of the submatrix B; that is, iarow = ibrow, where:
  
  iarow = mod(RSRC_A+(ia-1)/MB_A, p)
  ibrow = mod(RSRC_B+(ib-1)/MB_B, p)
If m <> 0, n <> 0, and nrhs <> 0 and if A_{ia:ia+min(m, n)-1, ja:ja+min(m, n)-1} is not contained in a single block, that is, either of the following is true:

min(m, n) + mod(ia-1, MB_A) > MB_A
min(m, n) + mod(ja-1, NB_A) > NB_A
then:
- The global matrix A must be distributed using a square block-cyclic distribution; that is, MB_A = NB_A.
- The submatrix A must be aligned on a block boundary, that is,
  
  ia-1 must be a multiple of MB_A
  ja-1 must be a multiple of NB_A
- The submatrix B must be aligned on a block boundary, that is,
  
  ib-1 must be a multiple of MB_B
If lwork = -1 on any process, it must equal -1 on all processes. That is, if a subset of the processes specifies -1 for the work area size, they must all specify -1.
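The alignment rules in the list above can be checked before the call. The following Python sketch encodes them exactly as they are written in this section, with descriptors held as nine-element lists; the function name and the message strings are hypothetical, and the sketch does not cover the range checks on ia, ja, ib, and jb.

```python
def alignment_violations(m, n, nrhs, ia, ja, desc_a, ib, desc_b, p):
    """Return the alignment rules that the arguments break (empty list if none).

    p is the number of process rows in the grid.
    """
    ctxt_a, mb_a, nb_a, rsrc_a = desc_a[1], desc_a[4], desc_a[5], desc_a[6]
    ctxt_b, mb_b, rsrc_b = desc_b[1], desc_b[4], desc_b[6]
    iarow = (rsrc_a + (ia - 1) // mb_a) % p
    ibrow = (rsrc_b + (ib - 1) // mb_b) % p
    nonzero = m != 0 and n != 0 and nrhs != 0
    bad = []
    if ctxt_a != ctxt_b:
        bad.append("CTXT_A <> CTXT_B")
    if m >= n:
        if mb_a != mb_b:
            bad.append("MB_A <> MB_B")
        if (ia - 1) % mb_a != (ib - 1) % mb_b:
            bad.append("row offset of A <> row offset of B")
        if iarow != ibrow:
            bad.append("iarow <> ibrow")
    else:
        if nb_a != mb_b:
            bad.append("NB_A <> MB_B")
        if (ja - 1) % nb_a != (ib - 1) % mb_b:
            bad.append("column offset of A <> row offset of B")
        if nonzero and iarow != ibrow:
            bad.append("iarow <> ibrow")
    # The leading min(m, n) square part of the submatrix fits in one block
    # only if neither offset pushes it past a block edge.
    single_block = (min(m, n) + (ia - 1) % mb_a <= mb_a
                    and min(m, n) + (ja - 1) % nb_a <= nb_a)
    if nonzero and not single_block:
        if mb_a != nb_a:
            bad.append("MB_A <> NB_A")
        if (ia - 1) % mb_a != 0:
            bad.append("ia-1 is not a multiple of MB_A")
        if (ja - 1) % nb_a != 0:
            bad.append("ja-1 is not a multiple of NB_A")
        if (ib - 1) % mb_b != 0:
            bad.append("ib-1 is not a multiple of MB_B")
    return bad


desc_a = [1, 0, 4, 3, 1, 1, 0, 0, 2]
desc_b = [1, 0, 4, 5, 1, 2, 0, 0, 2]
ok = alignment_violations(4, 3, 5, 1, 1, desc_a, 1, desc_b, 2)

desc_b_bad = [1, 0, 4, 5, 2, 2, 0, 0, 2]   # MB_B changed from 1 to 2
broken = alignment_violations(4, 3, 5, 1, 1, desc_a, 1, desc_b_bad, 2)
print(ok, broken)
```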

Function

PDGELS solves overdetermined or underdetermined real linear systems involving a real general rectangular matrix A, or its transpose, using a QR or LQ factorization. It is assumed that A has full rank.

The following options are provided:

If transa = 'N' and m >= n: find the least squares solution of an overdetermined system; that is, solve the least squares problem: minimize ||B - AX||
If transa = 'N' and m < n: find the minimum norm solution of an underdetermined system; that is, the problem is: AX = B
For PDGELS:
- If transa = 'T' and m >= n: find the minimum norm solution of an underdetermined system; that is, the problem is A^TX = B
- If transa = 'T' and m < n: find the least squares solution of an overdetermined system; that is, solve the least squares problem: minimize ||B - A^TX||
For PZGELS:
- If transa = 'C' and m >= n: find the minimum norm solution of an underdetermined system; that is, the problem is A^HX = B
- If transa = 'C' and m < n: find the least squares solution of an overdetermined system; that is, solve the least squares problem: minimize ||B - A^HX||

In the formulas above:

A represents the global general submatrix A_{ia:ia+m-1, ja:ja+n-1}

If transa = 'N':

B represents the global general submatrix B_{ib:ib+m-1, jb:jb+nrhs-1} containing the right-hand sides in its columns.
X represents the global general submatrix B_{ib:ib+n-1, jb:jb+nrhs-1} containing the solution vectors in its columns.

If transa <> 'N':

B represents the global general submatrix B_{ib:ib+n-1, jb:jb+nrhs-1} containing the right-hand sides in its columns.
X represents the global general submatrix B_{ib:ib+m-1, jb:jb+nrhs-1} containing the solution vectors in its columns.

Note: No data should be moved to form A^T or A^H; that is, the matrix A should always be stored in its untransposed form.

If (m = 0 and n = 0) or nrhs = 0, then the subroutine returns after doing some parameter checking.

See references [13] and [37].

Error Conditions

Computational Errors

None

Resource Errors

lwork = 0 and unable to allocate work space

Input-Argument and Miscellaneous Errors

Stage 1:

DTYPE_A is invalid.
DTYPE_B is invalid.

Stage 2:

CTXT_A is invalid.

Stage 3:

This subroutine has been called from outside the process grid.

Stage 4:

transa <>
- 'N' or 'T' for PDGELS
- 'N' or 'C' for PZGELS
m < 0
n < 0
nrhs < 0
M_A < 0 and (m = 0 or n = 0); M_A < 1 otherwise
N_A < 0 and (m = 0 or n = 0); N_A < 1 otherwise
MB_A < 1
NB_A < 1
RSRC_A < 0 or RSRC_A >= p
CSRC_A < 0 or CSRC_A >= q
ia < 1
ja < 1
M_B < 0 and ((m = 0 and n = 0) or nrhs = 0); M_B < 1 otherwise
N_B < 0 and ((m = 0 and n = 0) or nrhs = 0); N_B < 1 otherwise
MB_B < 1
NB_B < 1
RSRC_B < 0 or RSRC_B >= p
CSRC_B < 0 or CSRC_B >= q
ib < 1
jb < 1
CTXT_A <> CTXT_B

Stage 5:

If m <> 0, n <> 0, and nrhs <> 0 and if A_{ia:ia+min(m, n)-1, ja:ja+min(m, n)-1} is not contained in a single block, that is, either of the following is true:

min(m, n) + mod(ia-1, MB_A) > MB_A
min(m, n) + mod(ja-1, NB_A) > NB_A

then:

MB_A <> NB_A
MB_B <> NB_A

If m <> 0 and n <> 0:

ia > M_A
ja > N_A
ia+m-1 > M_A
ja+n-1 > N_A

If (m <> 0 or n <> 0) and nrhs <> 0:

ib > M_B
jb > N_B
ib+m-1 > M_B and m >= n; ib+n-1 > M_B and m < n
jb+nrhs-1 > N_B

If A_{ia:ia+min(m, n)-1, ja:ja+min(m, n)-1} is not contained in a single block and (m <> 0, n <> 0, and nrhs <> 0):

mod(ia-1, MB_A) <> 0
mod(ja-1, NB_A) <> 0
mod(ib-1, MB_B) <> 0

If m >= n:

MB_A <> MB_B
mod(ia-1, MB_A) <> mod(ib-1, MB_B)

If m < n:

NB_A <> MB_B
mod(ja-1, NB_A) <> mod(ib-1, MB_B)

Stage 6:

LLD_A < max(1, LOCp(M_A))
LLD_B < max(1, LOCp(M_B))

If m >= n:

In the process grid, the process row containing the first row of the submatrix A does not contain the first row of the submatrix B; that is, iarow <> ibrow, where:

iarow = mod(RSRC_A+(ia-1)/MB_A, p)
ibrow = mod(RSRC_B+(ib-1)/MB_B, p)

If m < n and (m <> 0, n <> 0, and nrhs <> 0):

In the process grid, the process row containing the first row of the submatrix A does not contain the first row of the submatrix B; that is, iarow <> ibrow, where:

iarow = mod(RSRC_A+(ia-1)/MB_A, p)
ibrow = mod(RSRC_B+(ib-1)/MB_B, p)

In all cases:

lwork <> 0, lwork <> -1, and lwork < (minimum value) (For minimum value, see lwork parameter description.)

Stage 7:

Each of the following global input arguments is checked to determine whether its value differs from the value specified on process P₀₀:

transa differs.
m differs.
n differs.
nrhs differs.
ia differs.
ja differs.
DTYPE_A differs.
M_A differs.
N_A differs.
MB_A differs.
NB_A differs.
RSRC_A differs.
CSRC_A differs.
ib differs.
jb differs.
DTYPE_B differs.
M_B differs.
N_B differs.
MB_B differs.
NB_B differs.
RSRC_B differs.
CSRC_B differs.

Also:
lwork = -1 on a subset of processes.

Example 1

This example illustrates the least squares solution of an overdetermined real general system of size 4 × 3 with 5 right-hand sides, using a 2 × 2 process grid.

Note: Because lwork = 0, PDGELS dynamically allocates the work area used by this subroutine.

Call Statements and Input

ORDER = 'R'
NPROW = 2
NPCOL = 2
CALL BLACS_GET (0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
             TRANSA  M   N  NRHS  A  IA  JA   DESC_A   B  IB  JB   DESC_B   WORK  LWORK  INFO
               |     |   |   |    |   |   |     |      |   |   |     |       |      |     |
CALL PDGELS ( 'N' ,  4 , 3 , 5  , A , 1 , 1 , DESC_A , B , 1 , 1 , DESC_B , WORK ,  0  , INFO )

	DESC_A	DESC_B
DTYPE_	1	1
CTXT_	`icontxt` (see note 1)	`icontxt` (see note 1)
M_	4	4
N_	3	5
MB_	1	1
NB_	1	2
RSRC_	0	0
CSRC_	0	0
LLD_	See note 2	See note 2

Notes:

1. `icontxt` is the output of the BLACS_GRIDINIT call.
2. Each process should set the LLD_ as follows:
   LLD_A = MAX(1,NUMROC(M_A, MB_A, MYROW, RSRC_A, NPROW))
   LLD_B = MAX(1,NUMROC(M_B, MB_B, MYROW, RSRC_B, NPROW))
   In this example, LLD_A and LLD_B = 2 on all processes.

Global general matrix A of size 4 by 3, with block sizes 1 × 1:

B,D      0        1        2 
     *                         *
 0   |  1.00 |  -2.00 |  -1.00 |
     | ------|--------|--------|
 1   |  2.00 |    .00 |   1.00 |
     | ------|--------|--------|
 2   |  2.00 |  -4.00 |   2.00 |
     | ------|--------|--------|
 3   |  4.00 |    .00 |    .00 |
     *                         *

Global general matrix B of size 4 by 5, with block sizes 1 × 2:

B,D          0              1           2 
     *                                      *
 0   | -1.00  -2.00 | -7.00   0.00 |  -5.00 |
     | -------------|--------------|--------|
 1   |  1.00   3.00 |  4.00   3.00 |   5.00 |
     | -------------|--------------|--------|
 2   |  1.00   0.00 |  4.00   2.00 |   2.00 |
     | -------------|--------------|--------|
 3   | -2.00   4.00 |  4.00   0.00 |   4.00 |
     *                                      *

The following is the 2 × 2 process grid:

B,D  |  0 2  |  1
-----|-------|-----
0    |   P₀₀   |  P₀₁
2    |       |
-----|-------|-----
1    |   P₁₀   |  P₁₁
3    |       |

Local arrays for A:

p,q  |       0      |    1
-----|--------------|--------
 0   |  1.00  -1.00 |  -2.00 
     |  2.00   2.00 |  -4.00 
-----|--------------|--------
 1   |  2.00   1.00 |   0.00 
     |  4.00   0.00 |   0.00

Local arrays for B:

p,q  |           0         |        1
-----|---------------------|---------------
 0   | -1.00  -2.00  -5.00 |  -7.00   0.00 
     |  1.00   0.00   2.00 |   4.00   2.00 
-----|---------------------|---------------
 1   |  1.00   3.00   5.00 |   4.00   3.00 
     | -2.00   4.00   4.00 |   4.00   0.00
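The local arrays above follow from the block-cyclic ownership rule: with 0-based indices, global row i belongs to process row mod(RSRC + i/MB, nprow), and likewise for columns. The following Python sketch applies that rule to the global matrices of this example. The function name local_part is hypothetical, and the nested lists are row-wise for readability, whereas the Fortran local arrays are stored column-major.

```python
def local_part(global_matrix, mb, nb, myrow, mycol, nprow, npcol, rsrc=0, csrc=0):
    """Rows and columns of a global matrix owned by process (myrow, mycol)."""
    nrows = len(global_matrix)
    ncols = len(global_matrix[0])
    rows = [i for i in range(nrows) if (rsrc + i // mb) % nprow == myrow]
    cols = [j for j in range(ncols) if (csrc + j // nb) % npcol == mycol]
    return [[global_matrix[i][j] for j in cols] for i in rows]


A = [[1, -2, -1],
     [2, 0, 1],
     [2, -4, 2],
     [4, 0, 0]]
B = [[-1, -2, -7, 0, -5],
     [1, 3, 4, 3, 5],
     [1, 0, 4, 2, 2],
     [-2, 4, 4, 0, 4]]

a_p00 = local_part(A, 1, 1, 0, 0, 2, 2)   # A uses 1 x 1 blocks
b_p00 = local_part(B, 1, 2, 0, 0, 2, 2)   # B uses 1 x 2 blocks
b_p11 = local_part(B, 1, 2, 1, 1, 2, 2)
print(a_p00, b_p00, b_p11)
```

On P₀₀ the local part of B picks up global columns 1, 2, and 5, because the third column block wraps around to process column 0.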

Output:

Global general matrix B of size 4 by 5, with block sizes 1 × 2:

B,D          0              1           2 
     *                                      *
 0   | -0.40   1.00 |  0.80   0.20 |   1.00 |
     | -------------|--------------|--------|
 1   |  0.00   1.00 |  1.50   0.00 |   1.50 |
     | -------------|--------------|--------|
 2   |  1.00   1.00 |  4.00   1.00 |   3.00 |
     | -------------|--------------|--------|
 3   | -1.00   0.00 |  2.00  -2.00 |   0.00 |
     *                                      *

The following is the 2 × 2 process grid:

B,D  |  0 2  |  1
-----|-------|-----
0    |   P₀₀   |  P₀₁
2    |       |
-----|-------|-----
1    |   P₁₀   |  P₁₁
3    |       |

Local arrays for B:

p,q  |           0         |        1
-----|---------------------|---------------
 0   | -0.40   1.00   1.00 |   0.80   0.20 
     |  1.00   1.00   3.00 |   4.00   1.00 
-----|---------------------|---------------
 1   |  0.00   1.00   1.50 |   1.50   0.00 
     | -1.00   0.00   0.00 |   2.00  -2.00

The value of info is 0 on all processes.
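The printed output can be cross-checked without the library. A least squares solution X satisfies the normal equations A^T(B - AX) = 0, and the residual sum of squares of each column should match the square of the entry left in row 4 of the returned B. The following Python sketch performs both checks in exact rational arithmetic on the values printed above; it is a consistency check on the example data, not a reimplementation of PDGELS.

```python
from fractions import Fraction

A = [[1, -2, -1], [2, 0, 1], [2, -4, 2], [4, 0, 0]]
B = [[-1, -2, -7, 0, -5], [1, 3, 4, 3, 5], [1, 0, 4, 2, 2], [-2, 4, 4, 0, 4]]
OUT = [["-0.40", "1.00", "0.80", "0.20", "1.00"],
       ["0.00", "1.00", "1.50", "0.00", "1.50"],
       ["1.00", "1.00", "4.00", "1.00", "3.00"],
       ["-1.00", "0.00", "2.00", "-2.00", "0.00"]]
m, n, nrhs = 4, 3, 5

out = [[Fraction(v) for v in row] for row in OUT]
X = out[:n]                                   # rows 1..n of the returned B

# Residual R = B - A X, computed entry by entry.
R = [[B[i][j] - sum(A[i][k] * X[k][j] for k in range(n)) for j in range(nrhs)]
     for i in range(m)]

# Normal equations: every entry of A^T R must vanish for a least squares solution.
normal = [[sum(A[i][k] * R[i][j] for i in range(m)) for j in range(nrhs)]
          for k in range(n)]

rss_direct = [sum(R[i][j] ** 2 for i in range(m)) for j in range(nrhs)]
rss_from_b = [sum(out[i][j] ** 2 for i in range(n, m)) for j in range(nrhs)]
print(normal, rss_direct, rss_from_b)
```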

Example 2

This example illustrates the minimum norm solution of an underdetermined complex general system of size 3 × 4 with 3 right-hand sides, using a 2 × 2 process grid.

Note: Because lwork = 0, PZGELS dynamically allocates the work area used by this subroutine.

Call Statements and Input

ORDER = 'R'
NPROW = 2
NPCOL = 2
CALL BLACS_GET (0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
 
             TRANSA  M   N  NRHS  A  IA  JA   DESC_A   B  IB  JB   DESC_B   WORK  LWORK  INFO
               |     |   |   |    |   |   |     |      |   |   |     |       |      |     |
CALL PZGELS ( 'N' ,  3 , 4 , 3  , A , 1 , 1 , DESC_A , B , 1 , 1 , DESC_B , WORK ,  0  , INFO )

	DESC_A	DESC_B
DTYPE_	1	1
CTXT_	`icontxt` (see note 1)	`icontxt` (see note 1)
M_	3	4
N_	4	3
MB_	1	1
NB_	1	1
RSRC_	0	0
CSRC_	0	0
LLD_	See note 2	See note 2

Notes:

1. `icontxt` is the output of the BLACS_GRIDINIT call.
2. Each process should set the LLD_ as follows:
   LLD_A = MAX(1,NUMROC(M_A, MB_A, MYROW, RSRC_A, NPROW)) = 2 on P₀₀ and P₀₁ and 1 on P₁₀ and P₁₁
   LLD_B = MAX(1,NUMROC(M_B, MB_B, MYROW, RSRC_B, NPROW)) = 2 on all processes

Global general matrix A of size 3 by 4, with block sizes 1 × 1:

B,D          0               1               2               3 
     *                                                               *
 0   | ( 1.00, 0.00) | (-2.00, 1.00) | (-3.00,-1.00) | ( 4.00,-3.00) |
     | --------------|---------------|---------------|---------------|
 1   | ( 1.00,-1.00) | ( 2.00, 2.00) | (-3.00, 0.00) | (-4.00,-2.00) |
     | --------------|---------------|---------------|---------------|
 2   | ( 1.00,-2.00) | (-2.00, 3.00) | (-3.00, 1.00) | ( 4.00,-1.00) |
     *                                                               *

Global general matrix B of size 4 by 3, with block sizes 1 × 1:

B,D          0              1              2 
     *                                             *
 0   | ( 1.00, .00) | ( .00, 1.00) | ( 1.00, 1.00) |
     | -------------|--------------|---------------|
 1   | (-1.00,1.00) | (1.00,-1.00) | (  .00,  .00) |
     | -------------|--------------|---------------|
 2   | ( 2.00,1.00) | (1.00, 2.00) | (-1.00,-1.00) |
     | -------------|--------------|---------------|
 3   | (  .  , .  ) | ( .  ,  .  ) | (   . ,  .  ) |
     *                                             *

The following is the 2 × 2 process grid:

B,D  |  0 2  |  1 3
-----|-------|-----
0    |   P₀₀   |  P₀₁
2    |       |
-----|-------|-----
1    |   P₁₀   |  P₁₁
3    |       |

Local arrays for A:

p,q  |              0              |                1
-----|-----------------------------|-----------------------------
 0   |  (1.00, 0.00) (-3.00,-1.00) |  (-2.00, 1.00) ( 4.00,-3.00)
     |  (1.00,-2.00) (-3.00, 1.00) |  (-2.00, 3.00) ( 4.00,-1.00)
-----|-----------------------------|-----------------------------
 1   |  (1.00,-1.00) (-3.00, 0.00) |  ( 2.00, 2.00) (-4.00,-2.00)

Local arrays for B:

p,q  |              0              |        1
-----|-----------------------------|---------------
 0   | ( 1.00,  .00) ( 1.00, 1.00) |  (  .00, 1.00)
     | ( 2.00, 1.00) (-1.00,-1.00) |  ( 1.00, 2.00)
-----|-----------------------------|---------------
 1   | (-1.00, 1.00) (  .00,  .00) |  ( 1.00,-1.00) 
     | (  .  ,  .  ) (  .  ,  .  ) |  (  .  ,  .  )

Output:

Global general matrix B of size 4 by 3, with block sizes 1 × 1:

B,D           0               1               2 
     *                                               *
 0   | ( -.16,  .15) | ( -.08,  .18) | (  .16, -.31) |
     | --------------|---------------|---------------|
 1   | (  .11,  .02) | (  .21, -.50) | ( -.38,  .65) |
     | --------------|---------------|---------------|
 2   | ( -.13, -.32) | (  .16,  .12) | ( -.27, -.28) |
     | --------------|---------------|---------------|
 3   | (  .37, -.05) | (  .04,  .06) | ( -.19,  .32) |
     *                                               *

The following is the 2 × 2 process grid:

B,D  |  0 2  |  1
-----|-------|-----
0    |   P₀₀   |  P₀₁
2    |       |
-----|-------|-----
1    |   P₁₀   |  P₁₁
3    |       |

Local arrays for B:

p,q  |              0              |        1
-----|-----------------------------|--------------
 0   | ( -.16,  .15) (  .16, -.31) | ( -.08,  .18)
     | ( -.13, -.32) ( -.27, -.28) | (  .16,  .12)
-----|-----------------------------|--------------
 1   | (  .11,  .02) ( -.38,  .65) | (  .21, -.50)
     | (  .37, -.05) ( -.19,  .32) | (  .04,  .06)

The value of info is 0 on all processes.
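Because the system in this example is underdetermined and A has full rank, the returned X should reproduce the right-hand sides, AX = B, up to the two-decimal rounding of the printed output. The following Python sketch checks that with built-in complex arithmetic on the global input A, the global input B, and the global output printed above. It confirms only that X solves the system; it does not verify that X has minimum norm, and the 0.05 tolerance is an assumption chosen to absorb the rounding.

```python
A = [
    [1 + 0j, -2 + 1j, -3 - 1j, 4 - 3j],
    [1 - 1j, 2 + 2j, -3 + 0j, -4 - 2j],
    [1 - 2j, -2 + 3j, -3 + 1j, 4 - 1j],
]
B = [
    [1 + 0j, 0 + 1j, 1 + 1j],
    [-1 + 1j, 1 - 1j, 0 + 0j],
    [2 + 1j, 1 + 2j, -1 - 1j],
]
# Rows 1..4 of the returned B hold the solution vectors, one per column.
X = [
    [-0.16 + 0.15j, -0.08 + 0.18j, 0.16 - 0.31j],
    [0.11 + 0.02j, 0.21 - 0.50j, -0.38 + 0.65j],
    [-0.13 - 0.32j, 0.16 + 0.12j, -0.27 - 0.28j],
    [0.37 - 0.05j, 0.04 + 0.06j, -0.19 + 0.32j],
]
m, n, nrhs = 3, 4, 3

# Largest entrywise deviation |(A X - B)_ij| over the whole 3 x 3 product.
max_err = max(
    abs(sum(A[i][k] * X[k][j] for k in range(n)) - B[i][j])
    for i in range(m)
    for j in range(nrhs)
)
print(max_err < 0.05)
```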
