SGEMM and DGEMM can perform any one of the following combined matrix
computations, using scalars alpha and beta, matrices A and
B or their transposes, and matrix C:
C <-- alphaAB+betaC | C <-- alphaAB^T+betaC |
C <-- alphaA^TB+betaC | C <-- alphaA^TB^T+betaC |
CGEMM and ZGEMM can perform any one of the following combined matrix
computations, using scalars alpha and beta, matrices A and
B, their transposes or their conjugate transposes, and matrix
C:
C <-- alphaAB+betaC | C <-- alphaAB^T+betaC | C <-- alphaAB^H+betaC |
C <-- alphaA^TB+betaC | C <-- alphaA^TB^T+betaC | C <-- alphaA^TB^H+betaC |
C <-- alphaA^HB+betaC | C <-- alphaA^HB^T+betaC | C <-- alphaA^HB^H+betaC |
A, B, C, alpha, beta | Subroutine |
Short-precision real | SGEMM |
Long-precision real | DGEMM |
Short-precision complex | CGEMM |
Long-precision complex | ZGEMM |
Fortran | CALL SGEMM | DGEMM | CGEMM | ZGEMM (transa, transb, l, n, m, alpha, a, lda, b, ldb, beta, c, ldc) |
C and C++ | sgemm | dgemm | cgemm | zgemm (transa, transb, l, n, m, alpha, a, lda, b, ldb, beta, c, ldc); |
PL/I | CALL SGEMM | DGEMM | CGEMM | ZGEMM (transa, transb, l, n, m, alpha, a, lda, b, ldb, beta, c, ldc); |
transa indicates the form of matrix A to use in the computation, where:
If transa = 'N', A is used in the computation.
If transa = 'T', A^T is used in the computation.
If transa = 'C', A^H is used in the computation.
Specified as: a single character; transa = 'N', 'T', or 'C'.
transb indicates the form of matrix B to use in the computation, where:
If transb = 'N', B is used in the computation.
If transb = 'T', B^T is used in the computation.
If transb = 'C', B^H is used in the computation.
Specified as: a single character; transb = 'N', 'T', or 'C'.
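The effect of transa and transb can be pictured with a small helper that returns the form of a matrix selected by the flag. The following Python sketch is illustrative only: the name op and the list-of-rows representation are assumptions made here, not part of the subroutine interface.

```python
def op(matrix, trans):
    """Return the form of `matrix` selected by a trans flag.

    'N' -> the matrix itself, 'T' -> its transpose,
    'C' -> its conjugate transpose.
    `matrix` is a list of rows (hypothetical layout for illustration).
    """
    if trans == 'N':
        return [list(row) for row in matrix]
    if trans == 'T':
        return [list(col) for col in zip(*matrix)]
    if trans == 'C':
        # Python ints, floats, and complex numbers all provide conjugate().
        return [[v.conjugate() for v in col] for col in zip(*matrix)]
    raise ValueError("trans must be 'N', 'T', or 'C'")


A = [[1 + 2j, 3 - 1j],
     [0 + 4j, 5 + 0j]]

print(op(A, 'T'))   # rows and columns exchanged
print(op(A, 'C'))   # exchanged and conjugated
```

For real data, conjugation changes nothing, so in this sketch 'C' and 'T' select the same matrix.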
m has the following meaning:
If transa = 'N', it is the number of columns in matrix A.
If transa = 'T' or 'C', it is the number of rows in matrix A.
In addition:
If transb = 'N', it is the number of rows in matrix B.
If transb = 'T' or 'C', it is the number of columns in matrix B.
Specified as: a fullword integer; m >= 0.
a is the matrix A, where:
If transa = 'N', A is used in the computation, and A has l rows and m columns.
If transa = 'T', A^T is used in the computation, and A has m rows and l columns.
If transa = 'C', A^H is used in the computation, and A has m rows and l columns.
Specified as: a two-dimensional array, containing numbers of the data type indicated in Table 76, where:
If transa = 'N', its size must be lda by (at least) m.
If transa = 'T' or 'C', its size must be lda by (at least) l.
lda is the leading dimension of the array specified for a, where:
If transa = 'N', lda >= l.
If transa = 'T' or 'C', lda >= m.
b is the matrix B, where:
If transb = 'N', B is used in the computation, and B has m rows and n columns.
If transb = 'T', B^T is used in the computation, and B has n rows and m columns.
If transb = 'C', B^H is used in the computation, and B has n rows and m columns.
Specified as: a two-dimensional array, containing numbers of the data type indicated in Table 76, where:
If transb = 'N', its size must be ldb by (at least) n.
If transb = 'T' or 'C', its size must be ldb by (at least) m.
ldb is the leading dimension of the array specified for b, where:
If transb = 'N', ldb >= m.
If transb = 'T' or 'C', ldb >= n.
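The leading dimension is what lets a matrix occupy only part of a larger array. Assuming column-major (Fortran) storage, element (i, j) of a matrix, counted from zero, sits at offset i + j*ld in the array. The Python sketch below restates the lda and ldb rules above and shows the addressing for a 3 by 2 matrix held in an array with four rows. The helper names min_leading_dims and element are hypothetical, chosen here for illustration.

```python
def min_leading_dims(transa, transb, l, n, m):
    """Smallest lda and ldb permitted by the rules stated above."""
    lda = l if transa == 'N' else m   # 'T' or 'C': A is stored m by l
    ldb = m if transb == 'N' else n   # 'T' or 'C': B is stored n by m
    return lda, ldb


def element(array, ld, i, j):
    """Element (i, j), counted from 0, of a column-major array."""
    return array[i + j * ld]


# A 3-by-2 matrix stored in an array with 4 rows (lda = 4).
# None marks the unused fourth row of each column.
a = [1.0, 2.0, 1.0, None,      # column 0
     -3.0, 4.0, -1.0, None]    # column 1

lda_min, ldb_min = min_leading_dims('N', 'T', 3, 3, 2)
print(lda_min, ldb_min)        # minimum values; a call may pass larger ones
print(element(a, 4, 1, 1))     # row 1, column 1 of the matrix
```

Passing lda = 4 for this array satisfies the rule lda >= l with l = 3; the extra row is simply skipped by the addressing.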
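The combined matrix addition and multiplication is expressed as follows, where a_ik, b_kj, and c_ij are elements of matrices A, B, and C, respectively:
$$c_{ij} = \alpha \sum_{k=1}^{m} a_{ik} b_{kj} + \beta c_{ij} \quad \text{for } i = 1, \ldots, l \text{ and } j = 1, \ldots, n$$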
See references [32] and [38]. In the following three cases, no computation is performed:
If none of the above conditions applies, and beta is not equal to 1 and m is 0, then betaC is returned.
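As a cross-check on the definition, the whole computation fits in a few lines of unoptimized Python. This is a sketch for reasoning about results, not a description of how the subroutines compute them; the names op and gemm_ref and the list-of-rows layout are assumptions introduced here. It also shows the m = 0 case: the sum over k is empty, so only betaC remains.

```python
def op(x, trans):
    """Form of x selected by trans: 'N' as is, 'T' transpose, 'C' conjugate transpose."""
    if trans == 'N':
        return [list(row) for row in x]
    if trans == 'T':
        return [list(col) for col in zip(*x)]
    if trans == 'C':
        return [[v.conjugate() for v in col] for col in zip(*x)]
    raise ValueError("trans must be 'N', 'T', or 'C'")


def gemm_ref(transa, transb, l, n, m, alpha, a, b, beta, c):
    """Return alpha*op(A)*op(B) + beta*C as a new l-by-n list of rows."""
    opa, opb = op(a, transa), op(b, transb)
    return [[alpha * sum(opa[i][k] * opb[k][j] for k in range(m)) + beta * c[i][j]
             for j in range(n)]
            for i in range(l)]


# Small check: C <-- 1.0*A*B^T + 2.0*C with 3-by-2 matrices A and B.
A = [[1.0, -3.0], [2.0, 4.0], [1.0, -1.0]]
B = [[1.0, -3.0], [2.0, 4.0], [1.0, -1.0]]
C = [[0.5] * 3 for _ in range(3)]
result = gemm_ref('N', 'T', 3, 3, 2, 1.0, A, B, 2.0, C)
print(result)

# m = 0: the inner sum is empty, so the result is beta*C.
scaled = gemm_ref('N', 'N', 2, 2, 0, 1.0, [[], []], [], 3.0,
                  [[1.0, 2.0], [3.0, 4.0]])
print(scaled)
```

Unlike the subroutines, gemm_ref returns a new matrix instead of overwriting C, which keeps the inputs available for comparison.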
The equivalence rules, defined for matrix multiplication of A and B in Special Usage, also apply to the matrix multiplication part of the computation performed by this subroutine. Use the equivalence rules when you want to transpose or conjugate transpose the multiplication part of the computation. When coding the calling sequences for these cases, be careful to code your matrix arguments and dimension arguments in the order indicated by the rule. Also, make sure that your input and output array C has dimensions large enough to hold the resulting matrix. See Example 4.
Unable to allocate internal work area (CGEMM and ZGEMM only).
None
Example 1
This example shows the computation C<--alphaAB+betaC, where A, B, and C are contained in larger arrays A, B, and C, respectively.
           TRANSA TRANSB  L   N   M  ALPHA  A  LDA  B  LDB  BETA  C  LDC
             |      |     |   |   |    |    |   |   |   |    |    |   |
CALL SGEMM( 'N'  , 'N'  , 6 , 4 , 5 , 1.0 , A , 8 , B , 6 , 2.0 , C , 7 )

        *                              *
        |  1.0   2.0  -1.0  -1.0   4.0 |
        |  2.0   0.0   1.0   1.0  -1.0 |
        |  1.0  -1.0  -1.0   1.0   2.0 |
A    =  | -3.0   2.0   2.0   2.0   0.0 |
        |  4.0   0.0  -2.0   1.0  -1.0 |
        | -1.0  -1.0   1.0  -3.0   2.0 |
        |   .     .     .     .     .  |
        |   .     .     .     .     .  |
        *                              *

        *                        *
        |  1.0  -1.0   0.0   2.0 |
        |  2.0   2.0  -1.0  -2.0 |
B    =  |  1.0   0.0  -1.0   1.0 |
        | -3.0  -1.0   1.0  -1.0 |
        |  4.0   2.0  -1.0   1.0 |
        |   .     .     .     .  |
        *                        *

        *                        *
        |  0.5   0.5   0.5   0.5 |
        |  0.5   0.5   0.5   0.5 |
        |  0.5   0.5   0.5   0.5 |
C    =  |  0.5   0.5   0.5   0.5 |
        |  0.5   0.5   0.5   0.5 |
        |  0.5   0.5   0.5   0.5 |
        |   .     .     .     .  |
        *                        *

Output:

        *                        *
        | 24.0  13.0  -5.0   3.0 |
        | -3.0  -4.0   2.0   4.0 |
        |  4.0   1.0   2.0   5.0 |
C    =  | -2.0   6.0  -1.0  -9.0 |
        | -4.0  -6.0   5.0   5.0 |
        | 16.0   7.0  -4.0   7.0 |
        |   .     .     .     .  |
        *                        *
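To see how the leading dimensions in this call relate to the data, the example can be replayed on flat column-major arrays. The Python sketch below covers only the transa = 'N', transb = 'N' case. The helpers pack and gemm_nn are hypothetical, and the unused rows of each array are filled with None so that any access outside the matrices would raise an error.

```python
def pack(rows, ld):
    """Store a list of rows column-major in a flat list with leading dimension ld."""
    ncols = len(rows[0])
    flat = [None] * (ld * ncols)          # None marks storage outside the matrix
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            flat[i + j * ld] = value
    return flat


def gemm_nn(l, n, m, alpha, a, lda, b, ldb, beta, c, ldc):
    """C <-- alpha*A*B + beta*C on column-major arrays, updating c in place."""
    for j in range(n):
        for i in range(l):
            acc = sum(a[i + k * lda] * b[k + j * ldb] for k in range(m))
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc]


a = pack([[1.0, 2.0, -1.0, -1.0, 4.0],
          [2.0, 0.0, 1.0, 1.0, -1.0],
          [1.0, -1.0, -1.0, 1.0, 2.0],
          [-3.0, 2.0, 2.0, 2.0, 0.0],
          [4.0, 0.0, -2.0, 1.0, -1.0],
          [-1.0, -1.0, 1.0, -3.0, 2.0]], 8)      # lda = 8
b = pack([[1.0, -1.0, 0.0, 2.0],
          [2.0, 2.0, -1.0, -2.0],
          [1.0, 0.0, -1.0, 1.0],
          [-3.0, -1.0, 1.0, -1.0],
          [4.0, 2.0, -1.0, 1.0]], 6)             # ldb = 6
c = pack([[0.5] * 4 for _ in range(6)], 7)       # ldc = 7

gemm_nn(6, 4, 5, 1.0, a, 8, b, 6, 2.0, c, 7)

# Read the 6-by-4 result back out of the larger array.
result = [[c[i + j * 7] for j in range(4)] for i in range(6)]
for row in result:
    print(row)
```

The rows of result agree with the output matrix C shown above, and the padding entries of c are left untouched.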
Example 2
This example shows the computation C<--alphaAB^T+betaC, where A and C are contained in larger arrays A and C, respectively, and B is the same size as array B in which it is contained.
           TRANSA TRANSB  L   N   M  ALPHA  A  LDA  B  LDB  BETA  C  LDC
             |      |     |   |   |    |    |   |   |   |    |    |   |
CALL SGEMM( 'N'  , 'T'  , 3 , 3 , 2 , 1.0 , A , 4 , B , 3 , 2.0 , C , 5 )

        *            *
        |  1.0  -3.0 |
A    =  |  2.0   4.0 |
        |  1.0  -1.0 |
        |   .     .  |
        *            *

        *            *
        |  1.0  -3.0 |
B    =  |  2.0   4.0 |
        |  1.0  -1.0 |
        *            *

        *                  *
        |  0.5   0.5   0.5 |
        |  0.5   0.5   0.5 |
C    =  |  0.5   0.5   0.5 |
        |   .     .     .  |
        |   .     .     .  |
        *                  *

Output:

        *                  *
        | 11.0  -9.0   5.0 |
        | -9.0  21.0  -1.0 |
C    =  |  5.0  -1.0   3.0 |
        |   .     .     .  |
        |   .     .     .  |
        *                  *
Example 3
This example shows the computation C<--alphaAB+betaC using complex data, where A, B, and C are contained in larger arrays A, B, and C, respectively.
           TRANSA TRANSB  L   N   M   ALPHA   A  LDA  B  LDB  BETA   C  LDC
             |      |     |   |   |     |     |   |   |   |    |     |   |
CALL CGEMM( 'N'  , 'N'  , 6 , 2 , 3 , ALPHA , A , 8 , B , 4 , BETA , C , 8 )

ALPHA    =  (1.0, 0.0)
BETA     =  (2.0, 0.0)

        *                                    *
        | (1.0, 5.0)  (9.0, 2.0)  (1.0, 9.0) |
        | (2.0, 4.0)  (8.0, 3.0)  (1.0, 8.0) |
        | (3.0, 3.0)  (7.0, 5.0)  (1.0, 7.0) |
A    =  | (4.0, 2.0)  (4.0, 7.0)  (1.0, 5.0) |
        | (5.0, 1.0)  (5.0, 1.0)  (1.0, 6.0) |
        | (6.0, 6.0)  (3.0, 6.0)  (1.0, 4.0) |
        |      .           .           .     |
        |      .           .           .     |
        *                                    *

        *                        *
        | (1.0, 8.0)  (2.0, 7.0) |
B    =  | (4.0, 4.0)  (6.0, 8.0) |
        | (6.0, 2.0)  (4.0, 5.0) |
        |      .           .     |
        *                        *

        *                        *
        | (0.5, 0.0)  (0.5, 0.0) |
        | (0.5, 0.0)  (0.5, 0.0) |
        | (0.5, 0.0)  (0.5, 0.0) |
C    =  | (0.5, 0.0)  (0.5, 0.0) |
        | (0.5, 0.0)  (0.5, 0.0) |
        | (0.5, 0.0)  (0.5, 0.0) |
        |      .           .     |
        |      .           .     |
        *                        *

Output:

        *                                *
        | (-22.0, 113.0)  (-35.0, 142.0) |
        | (-19.0, 114.0)  (-35.0, 141.0) |
        | (-20.0, 119.0)  (-43.0, 146.0) |
C    =  | (-27.0, 110.0)  (-58.0, 131.0) |
        | (8.0, 103.0)    (0.0, 112.0)   |
        | (-55.0, 116.0)  (-75.0, 135.0) |
        |        .               .       |
        |        .               .       |
        *                                *
Example 4
This example shows how to obtain the conjugate transpose of AB^H.
This shows the conjugate transpose of the computation performed in Example 8 for CGEMUL, which uses the following calling sequence:
CALL CGEMUL( A , 4 , 'N' , B , 3 , 'C' , C , 4 , 3 , 2 , 3 )
You instead code the calling sequence for C<--betaC+alphaBA^H, where beta = 0, alpha = 1, and the array C has the correct dimensions to receive the transposed matrix. Because beta is zero, betaC = 0. For a description of all the matrix identities, see Special Usage.
           TRANSA TRANSB  L   N   M   ALPHA   A  LDA  B  LDB  BETA   C  LDC
             |      |     |   |   |     |     |   |   |   |    |     |   |
CALL CGEMM( 'N'  , 'C'  , 3 , 3 , 2 , ALPHA , B , 3 , A , 3 , BETA , C , 4 )

ALPHA    =  (1.0, 0.0)
BETA     =  (0.0, 0.0)

        *                         *
        | (1.0, 3.0)  (-3.0, 2.0) |
B    =  | (2.0, 5.0)  (4.0, 6.0)  |
        | (1.0, 1.0)  (-1.0, 9.0) |
        *                         *

        *                         *
        | (1.0, 2.0)  (-3.0, 2.0) |
A    =  | (2.0, 6.0)  (4.0, 5.0)  |
        | (1.0, 2.0)  (-1.0, 8.0) |
        |      .           .      |
        *                         *

Output:

        *                                            *
        | (20.0, 1.0)    (18.0, 23.0)  (26.0, 23.0)  |
C    =  | (12.0, -25.0)  (80.0, 2.0)   (56.0, -37.0) |
        | (24.0, -26.0)  (49.0, 37.0)  (76.0, -2.0)  |
        |       .             .              .       |
        *                                            *
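The identity behind this example, (AB^H)^H = BA^H, can be checked numerically with the data shown. The Python sketch below uses plain lists of complex numbers; matmul and conj_transpose are helper names assumed here, not library routines.

```python
def conj_transpose(x):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[v.conjugate() for v in col] for col in zip(*x)]


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


A = [[1 + 2j, -3 + 2j],
     [2 + 6j, 4 + 5j],
     [1 + 2j, -1 + 8j]]
B = [[1 + 3j, -3 + 2j],
     [2 + 5j, 4 + 6j],
     [1 + 1j, -1 + 9j]]

ab_h = matmul(A, conj_transpose(B))   # A*B^H, the product requested from CGEMUL
ba_h = matmul(B, conj_transpose(A))   # B*A^H, the product requested from CGEMM

# The conjugate transpose of the first product equals the second product.
print(conj_transpose(ab_h) == ba_h)
for row in ba_h:
    print(row)
```

Swapping the two matrix arguments and keeping transb = 'C' therefore yields the conjugate transpose directly, without a separate transposition step.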
Example 5
This example shows the computation C<--alphaA^TB^H+betaC using complex data, where A, B, and C are the same size as the arrays A, B, and C, in which they are contained. Because beta is zero, betaC = 0. (Based on the dimensions of the matrices, A is actually a column vector, and C is actually a row vector.)
           TRANSA TRANSB  L   N   M   ALPHA   A  LDA  B  LDB  BETA   C  LDC
             |      |     |   |   |     |     |   |   |   |    |     |   |
CALL CGEMM( 'T'  , 'C'  , 1 , 3 , 3 , ALPHA , A , 3 , B , 3 , BETA , C , 1 )

ALPHA    =  (1.0, 1.0)
BETA     =  (0.0, 0.0)

        *            *
        | (1.0, 2.0) |
A    =  | (2.0, 5.0) |
        | (1.0, 6.0) |
        *            *

        *                                      *
        | (1.0, 6.0)  (-3.0, 4.0)  (2.0, 6.0)  |
B    =  | (2.0, 3.0)  (4.0, 6.0)   (0.0, 3.0)  |
        | (1.0, 3.0)  (-1.0, 6.0)  (-1.0, 9.0) |
        *                                      *

Output:

        *                                           *
C    =  | (86.0, 44.0)  (58.0, 70.0)  (121.0, 55.0) |
        *                                           *
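With a complex alpha, the scaling is itself a complex multiplication, which is easy to overlook when checking results by hand. The short Python check below writes this example out directly rather than through any library call; the variable names are chosen here for illustration, and the betaC term is omitted because beta is zero.

```python
alpha = complex(1.0, 1.0)

A = [[1 + 2j],
     [2 + 5j],
     [1 + 6j]]                        # 3 by 1, so A^T is 1 by 3
B = [[1 + 6j, -3 + 4j, 2 + 6j],
     [2 + 3j, 4 + 6j, 0 + 3j],
     [1 + 3j, -1 + 6j, -1 + 9j]]

a_t = [list(col) for col in zip(*A)]                        # A^T
b_h = [[v.conjugate() for v in col] for col in zip(*B)]     # B^H

l, n, m = 1, 3, 3
C = [[alpha * sum(a_t[i][k] * b_h[k][j] for k in range(m))
      for j in range(n)]
     for i in range(l)]
print(C)
```

The single row of C matches the output shown above; for instance, the first element is (1 + i)(65 - 21i) = 86 + 44i.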