Parallel Engineering and Scientific Subroutine Library for AIX Version 2 Release 3: Guide and Reference

PSCFT2 and PDCFT2--Complex Fourier Transforms in Two Dimensions

These subroutines compute the mixed-radix two-dimensional discrete Fourier transform of complex data:

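$$ y_{k_1,k_2} = scale \sum_{j_1=0}^{n_1-1} \sum_{j_2=0}^{n_2-1} x_{j_1,j_2} \, W_{n_1}^{(Isign)\, j_1 k_1} \, W_{n_2}^{(Isign)\, j_2 k_2} $$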

for:

k1 = 0, 1, ..., n1-1

k2 = 0, 1, ..., n2-1

where:

$$ W_n = e^{-2\pi i / n} $$

and where:

x_j1,j2 are elements of array X.

y_k1,k2 are elements of array Y.

Isign is + or - (determined by argument isign).

scale is a scalar value.

With scale = 1.0 and isign positive, you obtain the discrete Fourier transform. With scale = 1.0/((n1)(n2)) and isign negative, you obtain the inverse Fourier transform.

See references [1] and [3].

Table 109. Data Types

`X`, `Y`	`scale`	Subroutine
Short-precision complex	Short-precision real	PSCFT2
Long-precision complex	Long-precision real	PDCFT2

On Entry

x

is the local array X, containing the two-dimensional data to be transformed. The data is block-column distributed over a 1 × q process grid, where q is the number of processes. (The value of ldx is set in the IP array.)

Scope: local

Specified as: an array of (at least) length ldx × LOCq(n2), containing numbers of the data type indicated in Table 109. This array must be aligned on a doubleword boundary.

y

See On Return.

n1

is the length of the first dimension of the two-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n1 <= 37748736 and must be one of the values listed in Figure 9.

n2

is the length of the second dimension of the two-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n2 <= 37748736 and must be one of the values listed in Figure 9.

isign

controls the direction of the transform, determining the sign, Isign, of the exponent of W_n, where:

If isign = positive value, Isign = + (transforming time to frequency).

If isign = negative value, Isign = - (transforming frequency to time).

Scope: global

Specified as: a fullword integer; where isign > 0 or isign < 0.

scale

is the scaling constant scale.

Scope: global

Specified as: a number of the data type indicated in Table 109, where scale > 0.0 or scale < 0.0.

icontxt

is the BLACS context parameter.

Scope: global

Specified as: the fullword integer that was returned by a prior call to BLACS_GRIDINIT or BLACS_GRIDMAP.

ip

is an array of parameters, IP(i), where:

IP(1) indicates whether the default values for ip are used or you set the values for ip.
If IP(1) = 0, then the following default values are used:
- y is returned in transposed form; that is, global y has dimensions n2 × n1
- ldx, the leading dimension of the array specified for X, equals n1
- ldy, the leading dimension of the array specified for Y, equals n2
The remaining parameters of the array IP are ignored.
If IP(1) <> 0, then you set the remaining values of ip to indicate whether y is stored in normal or transposed form, and indicate values for ldx and ldy.
IP(2) indicates whether y is to be stored in normal or transposed form.
If IP(2) = 0, then y is to be stored in transposed form on output.
If IP(2) = 1, then y is to be stored in normal form on output.
IP(3-19) are reserved.
IP(20) indicates the value of the leading dimension, ldx, of the array specified for X, where:
If IP(20) = 0, then ldx = n1.
If IP(20) <> 0, then ldx is this value of IP(20).
IP(21) indicates the value of the leading dimension, ldy, of the array specified for Y, where:
If IP(21) = 0 and y is to be stored in normal form, then ldy = n1.
If IP(21) = 0 and y is to be stored in transposed form, then ldy = n2.
If IP(21) <> 0, then ldy is this value of IP(21).
IP(22-40) are reserved.

Scope: global

Specified as: a one-dimensional array of (at least) length 40, containing fullword integers, where:

IP(1) is any integer

IP(2) = 0 or 1

IP(20) >= n1 or IP(20) = 0

IP(21) >= n1 (for normal form) or IP(21) = 0

IP(21) >= n2 (for transposed form) or IP(21) = 0
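
The rules for IP(1), IP(2), IP(20), and IP(21) can be summarized as a small decision procedure. The sketch below is illustrative only: the type ft2_layout and the function resolve_ip are hypothetical names, not part of the library. It uses 0-based C indexing, so ip[0] corresponds to IP(1) and ip[19] to IP(20).

```c
#include <assert.h>
#include <stdbool.h>

/* Effective settings derived from the IP array (hypothetical helper type). */
typedef struct {
    bool transposed;  /* true: Y is returned with global dimensions n2 x n1 */
    int  ldx;         /* leading dimension of the local array X */
    int  ldy;         /* leading dimension of the local array Y */
} ft2_layout;

/* ip points at 40 fullword integers; ip[0] is IP(1) in Fortran numbering. */
ft2_layout resolve_ip(const int *ip, int n1, int n2)
{
    ft2_layout out;

    assert(ip != 0 && n1 > 0 && n2 > 0);
    if (ip[0] == 0) {               /* IP(1) = 0: defaults, the rest is ignored */
        out.transposed = true;
        out.ldx = n1;
        out.ldy = n2;
        return out;
    }
    out.transposed = (ip[1] == 0);  /* IP(2): 0 = transposed, 1 = normal */
    out.ldx = (ip[19] != 0) ? ip[19] : n1;                          /* IP(20) */
    out.ldy = (ip[20] != 0) ? ip[20] : (out.transposed ? n2 : n1);  /* IP(21) */
    return out;
}
```

The sketch does not validate the values; it only shows which setting takes effect for a given IP array.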

On Return

y

is the local array Y that is block-column distributed and contains the results of the computation, where:

If IP(1) = 0, the local array Y is stored in transposed form and has dimensions n2 × LOCq(n1).

If IP(1) <> 0 and IP(2) = 0, the local array Y is stored in transposed form and has dimensions ldy × LOCq(n1).

If IP(1) <> 0 and IP(2) = 1, the local array Y is stored in normal form and has dimensions ldy × LOCq(n2).

Scope: local

Returned as: an ldy × LOCq(n2) array (for normal form) or an ldy × LOCq(n1) array (for transposed form), containing the numbers of the data type indicated in Table 109. This array must be aligned on a doubleword boundary.
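
To make the difference between normal and transposed form concrete, the sketch below locates the transform value y_k1,k2 in the global result and computes a column-major offset in a local array with leading dimension ldy. The names y_position, locate_y, and local_offset are hypothetical. The sketch assumes that transposed form means the value y_k1,k2 is held at row k2, column k1 of the n2 × n1 global matrix.

```c
#include <assert.h>
#include <stddef.h>

/* Position of transform value y(k1,k2) inside the global result matrix. */
typedef struct {
    int row;  /* row index in global Y */
    int col;  /* column index in global Y (the distributed dimension) */
} y_position;

y_position locate_y(int k1, int k2, int transposed)
{
    y_position p;

    assert(k1 >= 0 && k2 >= 0);
    if (transposed) {   /* global Y is n2 x n1 */
        p.row = k2;
        p.col = k1;
    } else {            /* global Y is n1 x n2 */
        p.row = k1;
        p.col = k2;
    }
    return p;
}

/* Column-major offset of (row, local_col) in a local array with leading
 * dimension ldy; rows between the matrix height and ldy are padding. */
size_t local_offset(int row, int local_col, int ldy)
{
    assert(row >= 0 && row < ldy && local_col >= 0);
    return (size_t)row + (size_t)local_col * (size_t)ldy;
}
```

Mapping the global column index to a process and a local column depends on the block-column distribution described in Two-Dimensional Sequence and is not covered here.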

Notes and Coding Rules

You may specify the same array for both X and Y. In this case, output overwrites input. If you specify different arrays X and Y, they must have no common elements; otherwise, results are unpredictable.
For the output array Y, these subroutines may use any extra space available when ldy is greater than its minimum value.
For more information on LOCq(_) and how sequences are block-column distributed, see Two-Dimensional Sequence.
In general, distributing your data evenly provides the best work load balance among the processes and allows the use of the most efficient collective communication. However, for your specific problem size and number of processes available, experimentation is necessary to achieve optimal performance.
An example of the use of this subroutine in a thermal diffusion application program is shown in Appendix B, Sample Programs. See subroutine fourier in Module Fourier.
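
As a rough illustration of block-column distribution, the sketch below computes how many columns each process holds. It assumes a block size of ceiling(n/q), which agrees with the examples in this section (8 columns over 3 processes gives 3, 3, and 2), but Two-Dimensional Sequence remains the authoritative description. The function names are hypothetical.

```c
#include <assert.h>

/* ASSUMPTION: the block size is ceiling(n / q). */
int block_size(int n, int q)
{
    assert(n > 0 && q > 0);
    return (n + q - 1) / q;
}

/* Number of columns held locally by process column mycol (0-based). */
int loc_q(int n, int q, int mycol)
{
    int nb = block_size(n, q);
    int first = mycol * nb;  /* first global column owned by mycol */

    assert(mycol >= 0 && mycol < q);
    if (first >= n)
        return 0;
    return (n - first < nb) ? n - first : nb;
}

/* Process column that owns global column j (0-based). */
int owner_of_column(int j, int n, int q)
{
    assert(j >= 0 && j < n);
    return j / block_size(n, q);
}
```

Under this assumption, a local array with block_size(n, q) columns is large enough on every process, even when the last process holds fewer columns.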

Error Conditions

Computational Errors

None

Resource Errors

Unable to allocate work space

Input-Argument and Miscellaneous Errors

Stage 1:

icontxt is invalid

Stage 2:

Process grid is not 1 × q
The subroutine was called from outside the process grid.

Stage 3:

n1 > 37748736
n2 > 37748736
The length of n1 or n2 is not an allowable transform length.
isign = 0
scale = 0.0
IP(1) <> 0 and IP(2) <> 0 or 1
IP(1) <> 0 and IP(20) <> 0 and IP(20) < n1 (that is, ldx < n1)
IP(1) <> 0 and IP(2) = 1 (for normal mode) and IP(21) <> 0 and IP(21) < n1 (that is, ldy < n1)
IP(1) <> 0 and IP(2) = 0 (for transpose mode) and IP(21) <> 0 and IP(21) < n2 (that is, ldy < n2)
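
A caller can screen most of the Stage 3 conditions before making the call. The following sketch mirrors the list above. The function name and its return values are hypothetical (the return value is the position of the failing condition in the Stage 3 list, not a PESSL error number), and the check against the allowable transform lengths in Figure 9 is omitted because that table is not reproduced here.

```c
#include <assert.h>

#define FT2_MAX_LENGTH 37748736

/* Returns 0 if no listed condition is violated; otherwise the 1-based
 * position of the first violated condition in the Stage 3 list.
 * ip[0] is IP(1), ip[19] is IP(20), ip[20] is IP(21). */
int check_ft2_args(int n1, int n2, int isign, double scale, const int *ip)
{
    assert(ip != 0);
    if (n1 > FT2_MAX_LENGTH) return 1;
    if (n2 > FT2_MAX_LENGTH) return 2;
    /* Condition 3 (allowable transform length, Figure 9) is not checked. */
    if (isign == 0) return 4;
    if (scale == 0.0) return 5;
    if (ip[0] != 0) {
        if (ip[1] != 0 && ip[1] != 1) return 6;
        if (ip[19] != 0 && ip[19] < n1) return 7;                 /* ldx < n1 */
        if (ip[1] == 1 && ip[20] != 0 && ip[20] < n1) return 8;   /* normal */
        if (ip[1] == 0 && ip[20] != 0 && ip[20] < n2) return 9;   /* transposed */
    }
    return 0;
}
```

When IP(1) = 0, the remaining IP entries are ignored, so the sketch skips the IP checks in that case.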

Example 1

This example shows how to compute a two-dimensional transform. IP(1) is set to 0, so the defaults apply: array Y is returned in transposed form, ldx = n1, and ldy = n2. The data is block-column distributed over a 1 × 2 process grid. The arrays are declared as follows:

  COMPLEX*16 X(0:7,0:2), Y(0:5,0:3)
  INTEGER*4  IP(40)
  REAL*8     SCALE

Call Statements and Input

ORDER = 'R'
NPROW = 1
NPCOL = 2
CALL BLACS_GET(0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
IP(1) = 0
 
             X   Y   N1  N2  ISIGN     SCALE       ICONTXT   IP
             |   |   |   |     |         |            |      |
CALL PDCFT2( X , Y , 8 , 6 ,  -1  , 1.0D0/48.0D0 , ICONTXT , IP)

Global matrix X of order 8 × 6:

B,D                    0                                     1
     *                                                                         *
     |  (48.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
 0   |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     *                                                                         *

The following is the 1 × 2 process grid:

B,D  |   0   | 1 
-----|-------|-----
0    |   P₀₀   |  P₀₁

Local arrays for X:

p,q  |                 0                  |                  1
-----|------------------------------------|------------------------------------
     |  (48.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
 0   |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)

Output

Global matrix Y of order 6 × 8 (transposed form):

B,D                         0                                                1
     *                                                                                               *
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
 0   |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     *                                                                                               *

The following is the 1 × 2 process grid:

B,D  |   0   | 1 
-----|-------|-----
0    |   P₀₀   |  P₀₁

Local arrays for Y:

p,q  |                      0                        |                       1
-----|-----------------------------------------------|-----------------------------------------------
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
 0   |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)

Example 2

This example shows how to compute a two-dimensional transform with an uneven block-column distribution over a 1 × 3 process grid. IP(1) is set to 0, so the defaults apply: array Y is returned in transposed form, ldx = n1, and ldy = n2. The arrays are declared as follows:

  COMPLEX*16 X(0:7,0:2), Y(0:7,0:2)
  INTEGER*4  IP(40)
  REAL*8     SCALE

Call Statements and Input

ORDER = 'R'
NPROW = 1
NPCOL = 3
CALL BLACS_GET(0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
IP(1) = 0
 
             X   Y   N1  N2  ISIGN     SCALE       ICONTXT   IP
             |   |   |   |     |         |            |      |
CALL PDCFT2( X , Y , 8 , 8 ,   1  , 1.0D0/16.0D0 , ICONTXT , IP)

Global matrix X of order 8 × 8:

B,D                    0                                     1                              2
   *                                                                                                     *
   |  (0.0,98.0) (67.0,27.0) (67.0,82.0) | (84.0,99.0) (26.0,41.0) (24.0,15.0) | (27.0,55.0)  (48.0,9.0) |
   | (13.0,49.0) (93.0,91.0)  (0.0,12.0) | (52.0,88.0)  (4.0,84.0) (98.0,57.0) | (43.0,89.0) (89.0,27.0) |
   | (75.0,26.0) (38.0,52.0)  (38.0,1.0) |  (9.0,23.0) (73.0,26.0) (72.0,80.0) | (76.0,62.0)  (90.0,0.0) |
   |  (45.0,9.0) (51.0,46.0)  (6.0,68.0) | (65.0,30.0) (32.0,41.0)  (75.0,3.0) | (47.0,84.0)  (6.0,41.0) |
 0 | (53.0,94.0) (83.0,94.0) (41.0,86.0) | (41.0,35.0) (63.0,53.0) (65.0,53.0) | (23.0,15.0)  (90.0,2.0) |
   |  (21.0,7.0)   (3.0,5.0) (68.0,62.0) | (70.0,51.0) (75.0,46.0)  (7.0,49.0) | (27.0,21.0) (50.0,70.0) |
   |  (4.0,50.0)  (5.0,76.0) (58.0,73.0) | (91.0,59.0) (99.0,28.0) (63.0,95.0) | (35.0,71.0) (51.0,93.0) |
   | (67.0,38.0) (52.0,77.0) (93.0,72.0) | (76.0,84.0) (36.0,17.0) (88.0,74.0) | (16.0,13.0) (31.0,23.0) |
   *                                                                                                     *

The following is the 1 × 3 process grid:

B,D  |   0   |   1   |   2   
-----|-------|-------|------- 
0    |   P₀₀   |   P₀₁   |   P₀₂

Local arrays for X:

p,q  |                 0                  |                  1                     |             2
-----|------------------------------------|----------------------------------------|--------------------------
     | (0.0,98.0) (67.0,27.0) (67.0,82.0) |  (84.0,99.0)  (26.0,41.0)  (24.0,15.0) | (27.0,55.0)   (48.0,9.0)
     |(13.0,49.0) (93.0,91.0)  (0.0,12.0) |  (52.0,88.0)   (4.0,84.0)  (98.0,57.0) | (43.0,89.0)  (89.0,27.0)
     |(75.0,26.0) (38.0,52.0)  (38.0,1.0) |   (9.0,23.0)  (73.0,26.0)  (72.0,80.0) | (76.0,62.0)   (90.0,0.0)
     | (45.0,9.0) (51.0,46.0)  (6.0,68.0) |  (65.0,30.0)  (32.0,41.0)   (75.0,3.0) | (47.0,84.0)   (6.0,41.0)
 0   |(53.0,94.0) (83.0,94.0) (41.0,86.0) |  (41.0,35.0)  (63.0,53.0)  (65.0,53.0) | (23.0,15.0)   (90.0,2.0)
     | (21.0,7.0)   (3.0,5.0) (68.0,62.0) |  (70.0,51.0)  (75.0,46.0)   (7.0,49.0) | (27.0,21.0)  (50.0,70.0)
     | (4.0,50.0)  (5.0,76.0) (58.0,73.0) |  (91.0,59.0)  (99.0,28.0)  (63.0,95.0) | (35.0,71.0)  (51.0,93.0)
     |(67.0,38.0) (52.0,77.0) (93.0,72.0) |  (76.0,84.0)  (36.0,17.0)  (88.0,74.0) | (16.0,13.0)  (31.0,23.0)

Output

Global matrix Y of order 8 × 8 (transposed form):

B,D                         0                                       1                                2
   *                                                                                                                *
   |(198.6,200.1)  (-10.6,9.8)     (0.8,7.2)  |   (5.8,-5.2)   (11.2,9.1) (-38.3,-18.7) |(-10.2,-1.9)   (14.0,12.6) |
   |  (-0.3,-6.8)  (19.3,-18.7)  (28.7,-3.6)  |   (-7.2,2.5)   (1.5,14.6) (-22.0,-20.7) |(29.8,-15.0)   (-10.7,0.8) |
   |  (11.3,-6.2)  (-24.0,-8.1)   (8.6,11.6)  |  (-29.9,6.5)  (13.7,13.5)  (-16.7,-4.4) |(-26.6,-0.8)    (-3.3,9.5) |
   |   (5.7,17.1)    (3.7,-7.0)  (-2.5,13.9)  |(-19.5,-15.9) (-18.4,20.1)   (11.6,-1.8) | (-0.3,-8.2)   (26.8,30.0) |
0  | (-29.8,-3.4)    (-0.5,7.4) (-17.1,27.5)  |  (18.5,32.6)    (9.4,9.6)    (7.6,-8.0) |(-13.1,13.9) (-26.6,-16.5) |
   |  (-10.2,1.6)   (-5.0,28.8)  (-5.0,25.0)  |   (5.0,12.1)  (-13.5,9.9)     (2.5,0.6) |  (0.0,-5.6)  (-11.8,-8.3) |
   | (-8.7,-13.6)   (10.0,11.1)    (0.6,9.4)  | (12.2,-21.2)  (-9.3,-0.9)  (14.5,-15.6) |  (2.4,11.1)   (-22.7,0.2) |
   | (-27.7,-3.1) (-21.8,-21.3)  (-22.6,6.0)  |   (0.2,11.6)   (-1.6,6.6)   (-7.2,-0.4) |  (0.5,25.6)   (20.3,23.8) |
   *                                                                                                                *

The following is the 1 × 3 process grid:

B,D  |   0   |   1   |   2   
-----|-------|-------|------- 
0    |   P₀₀   |   P₀₁   |   P₀₂

Local arrays for Y:

p,q|                      0                   |                     1                   |              2
---|------------------------------------------|-----------------------------------------|--------------------------
   |(198.6,200.1)   (-10.6,9.8)    (0.8,7.2)  |   (5.8,-5.2)   (11.2,9.1) (-38.3,-18.7) |(-10.2,-1.9)   (14.0,12.6)
   |  (-0.3,-6.8)  (19.3,-18.7)  (28.7,-3.6)  |   (-7.2,2.5)   (1.5,14.6) (-22.0,-20.7) |(29.8,-15.0)   (-10.7,0.8)
   |  (11.3,-6.2)  (-24.0,-8.1)   (8.6,11.6)  |  (-29.9,6.5)  (13.7,13.5)  (-16.7,-4.4) |(-26.6,-0.8)    (-3.3,9.5)
   |   (5.7,17.1)    (3.7,-7.0)  (-2.5,13.9)  |(-19.5,-15.9) (-18.4,20.1)   (11.6,-1.8) | (-0.3,-8.2)   (26.8,30.0)
0  | (-29.8,-3.4)    (-0.5,7.4) (-17.1,27.5)  |  (18.5,32.6)    (9.4,9.6)    (7.6,-8.0) |(-13.1,13.9) (-26.6,-16.5)
   |  (-10.2,1.6)   (-5.0,28.8)  (-5.0,25.0)  |   (5.0,12.1)  (-13.5,9.9)     (2.5,0.6) |  (0.0,-5.6)  (-11.8,-8.3)
   | (-8.7,-13.6)   (10.0,11.1)    (0.6,9.4)  | (12.2,-21.2)  (-9.3,-0.9)  (14.5,-15.6) |  (2.4,11.1)   (-22.7,0.2)
   | (-27.7,-3.1) (-21.8,-21.3)  (-22.6,6.0)  |   (0.2,11.6)   (-1.6,6.6)   (-7.2,-0.4) |  (0.5,25.6)   (20.3,23.8)


Syntax

Fortran	CALL PSCFT2 | PDCFT2 (`x`, `y`, `n1`, `n2`, `isign`, `scale`, `icontxt`, `ip`)
C and C++	pscft2 | pdcft2 (`x`, `y`, `n1`, `n2`, `isign`, `scale`, `icontxt`, `ip`);
