The miscellaneous intrinsics for Streaming SIMD Extensions 2 (SSE2) are listed in the following table followed by their descriptions.
The prototypes for SSE2 intrinsics are in the emmintrin.h header file.
Intrinsic | Operation | Corresponding Instruction |
---|---|---|
_mm_packs_epi16 | Packed Saturation | PACKSSWB |
_mm_packs_epi32 | Packed Saturation | PACKSSDW |
_mm_packus_epi16 | Packed Saturation | PACKUSWB |
_mm_extract_epi16 | Extraction | PEXTRW |
_mm_insert_epi16 | Insertion | PINSRW |
_mm_movemask_epi8 | Mask Creation | PMOVMSKB |
_mm_shuffle_epi32 | Shuffle | PSHUFD |
_mm_shufflehi_epi16 | Shuffle | PSHUFHW |
_mm_shufflelo_epi16 | Shuffle | PSHUFLW |
_mm_unpackhi_epi8 | Interleave | PUNPCKHBW |
_mm_unpackhi_epi16 | Interleave | PUNPCKHWD |
_mm_unpackhi_epi32 | Interleave | PUNPCKHDQ |
_mm_unpackhi_epi64 | Interleave | PUNPCKHQDQ |
_mm_unpacklo_epi8 | Interleave | PUNPCKLBW |
_mm_unpacklo_epi16 | Interleave | PUNPCKLWD |
_mm_unpacklo_epi32 | Interleave | PUNPCKLDQ |
_mm_unpacklo_epi64 | Interleave | PUNPCKLQDQ |
_mm_movepi64_pi64 | Move | MOVDQ2Q |
_mm_movpi64_epi64 | Move | MOVQ2DQ |
_mm_move_epi64 | Move | MOVQ |
_mm_unpackhi_pd | Interleave | UNPCKHPD |
_mm_unpacklo_pd | Interleave | UNPCKLPD |
_mm_movemask_pd | Create mask | MOVMSKPD |
_mm_shuffle_pd | Select values | SHUFPD |
__m128i _mm_packs_epi16(__m128i a, __m128i b)
Packs the 16 signed 16-bit integers from a and b into 8-bit integers and saturates.
R0 | ... | R7 | R8 | ... | R15 |
---|---|---|---|---|---|
Signed Saturate(a0) | ... | Signed Saturate(a7) | Signed Saturate(b0) | ... | Signed Saturate(b7) |
__m128i _mm_packs_epi32(__m128i a, __m128i b)
Packs the 8 signed 32-bit integers from a and b into signed 16-bit integers and saturates.
R0 | ... | R3 | R4 | ... | R7 |
---|---|---|---|---|---|
Signed Saturate(a0) | ... | Signed Saturate(a3) | Signed Saturate(b0) | ... | Signed Saturate(b3) |
__m128i _mm_packus_epi16(__m128i a, __m128i b)
Packs the 16 signed 16-bit integers from a and b into 8-bit unsigned integers and saturates.
R0 | ... | R7 | R8 | ... | R15 |
---|---|---|---|---|---|
Unsigned Saturate(a0) | ... | Unsigned Saturate(a7) | Unsigned Saturate(b0) | ... | Unsigned Saturate(b15) |
int _mm_extract_epi16(__m128i a, int imm)
Extracts the selected signed or unsigned 16-bit integer from a and zero extends. The selector imm must be an immediate.
R0 |
---|
(imm == 0) ? a0: ( (imm == 1) ? a1: ... (imm==7) ? a7) |
__m128i _mm_insert_epi16(__m128i a, int b, int imm)
Inserts the least significant 16 bits of b into the selected 16-bit integer of a. The selector imm must be an immediate.
R0 | R1 | ... | R7 |
---|---|---|---|
(imm == 0) ? b : a0; | (imm == 1) ? b : a1; | ... | (imm == 7) ? b : a7; |
int _mm_movemask_epi8(__m128i a)
Creates a 16-bit mask from the most significant bits of the 16 signed or unsigned 8-bit integers in a and zero extends the upper bits.
R0 |
---|
a15[7] << 15 | a14[7] << 14 | ... a1[7] << 1 | a0[7] |
__m128i _mm_shuffle_epi32(__m128i a, int imm)
Shuffles the 4 signed or unsigned 32-bit integers in a as specified by imm. The shuffle value, imm, must be an immediate. See Macro Function for Shuffle for a description of shuffle semantics.
__m128i _mm_shufflehi_epi16(__m128i a, int imm)
Shuffles the upper 4 signed or unsigned 16-bit integers in a as specified by imm. The shuffle value, imm, must be an immediate. See Macro Function for Shuffle for a description of shuffle semantics.
__m128i _mm_shufflelo_epi16(__m128i a, int imm)
Shuffles the lower 4 signed or unsigned 16-bit integers in a as specified by imm. The shuffle value, imm, must be an immediate. See Macro Function for Shuffle for a description of shuffle semantics.
__m128i _mm_unpackhi_epi8(__m128i a, __m128i b)
Interleaves the upper 8 signed or unsigned 8-bit integers in a with the upper 8 signed or unsigned 8-bit integers in b.
R0 | R1 | R2 | R3 | ... | R14 | R15 |
---|---|---|---|---|---|---|
a8 | b8 | a9 | b9 | ... | a15 | b15 |
__m128i _mm_unpackhi_epi16(__m128i a, __m128i b)
Interleaves the upper 4 signed or unsigned 16-bit integers in a with the upper 4 signed or unsigned 16-bit integers in b.
R0 | R1 | R2 | R3 | R4 | R5 | R6 | R7 |
---|---|---|---|---|---|---|---|
a4 | b4 | a5 | b5 | a6 | b6 | a7 | b7 |
__m128i _mm_unpackhi_epi32(__m128i a, __m128i b)
Interleaves the upper 2 signed or unsigned 32-bit integers in a with the upper 2 signed or unsigned 32-bit integers in b.
R0 | R1 | R2 | R3 |
---|---|---|---|
a2 | b2 | a3 | b3 |
__m128i _mm_unpackhi_epi64(__m128i a, __m128i b)
Interleaves the upper signed or unsigned 64-bit integer in a with the upper signed or unsigned 64-bit integer in b.
R0 | R1 |
---|---|
a1 | b1 |
__m128i _mm_unpacklo_epi8(__m128i a, __m128i b)
Interleaves the lower 8 signed or unsigned 8-bit integers in a with the lower 8 signed or unsigned 8-bit integers in b.
R0 | R1 | R2 | R3 | ... | R14 | R15 |
---|---|---|---|---|---|---|
a0 | b0 | a1 | b1 | ... | a7 | b7 |
__m128i _mm_unpacklo_epi16(__m128i a, __m128i b)
Interleaves the lower 4 signed or unsigned 16-bit integers in a with the lower 4 signed or unsigned 16-bit integers in b.
R0 | R1 | R2 | R3 | R4 | R5 | R6 | R7 |
---|---|---|---|---|---|---|---|
a0 | b0 | a1 | b1 | a2 | b2 | a3 | b3 |
__m128i _mm_unpacklo_epi32(__m128i a, __m128i b)
Interleaves the lower 2 signed or unsigned 32-bit integers in a with the lower 2 signed or unsigned 32-bit integers in b.
R0 | R1 | R2 | R3 |
---|---|---|---|
a0 | b0 | a1 | b1 |
__m128i _mm_unpacklo_epi64(__m128i a, __m128i b)
Interleaves the lower signed or unsigned 64-bit integer in a with the lower signed or unsigned 64-bit integer in b.
R0 | R1 |
---|---|
a0 | b0 |
__m64 _mm_movepi64_pi64(__m64 a)
Returns the lower 64 bits of a as an __m64 type.
R0 |
---|
a0 |
__128i _mm_movpi64_pi64(__m128i a)
Moves the 64 bits of a to the lower 64 bits of the result, zeroing the upper bits.
R0 | R1 |
---|---|
a0 | 0X0 |
__128i _mm_move_epi64(__128i a)
Moves the lower 64 bits of a to the lower 64 bits of the result, zeroing the upper bits.
R0 | R1 |
---|---|
a0 | 0X0 |
__m128d _mm_unpackhi_pd(__m128d a, __m128d b)
Interleaves the upper DP FP values of a and b.
R0 | R1 |
---|---|
a1 | b1 |
__m128d _mm_unpacklo_pd(__m128d a, __m128d b)
Interleaves the lower DP FP values of a and b.
R0 | R1 |
---|---|
a0 | b0 |
int _mm_movemask_pd(__m128d a)
Creates a two-bit mask from the sign bits of the two DP FP values of a.
R |
---|
sign(a1) << 1 | sign(a0) |
__m128d _mm_shuffle_pd(__m128d a, __m128d b, int i)
Selects two specific DP FP values from a and b, based on the mask i. The mask must be an immediate. See Macro Function for Shuffle for a description of the shuffle semantics.