Miscellaneous Intrinsics Using Streaming SIMD Extensions

The prototypes for Streaming SIMD Extensions (SSE) intrinsics are in the xmmintrin.h header file.

The result of each intrinsic operation is placed in a register. The tables below, in the detailed explanation of each intrinsic, show what is placed in that register. R denotes a scalar result; R0, R1, R2 and R3 denote the four single-precision elements of a 128-bit result, with R0 the lowest.

The following table summarizes the intrinsics. Each one is described in detail after the table.

Intrinsic Name     Operation                                       Corresponding SSE Instruction
_mm_shuffle_ps     Shuffle                                         SHUFPS
_mm_unpackhi_ps    Unpack High                                     UNPCKHPS
_mm_unpacklo_ps    Unpack Low                                      UNPCKLPS
_mm_move_ss        Set low word, pass through upper three values   MOVSS
_mm_movehl_ps      Move High to Low                                MOVHLPS
_mm_movelh_ps      Move Low to High                                MOVLHPS
_mm_movemask_ps    Create four-bit mask                            MOVMSKPS

 

__m128 _mm_shuffle_ps(__m128 a, __m128 b, unsigned int imm8)

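Selects four specific single-precision floating-point (SP FP) values from a and b, based on the mask imm8. The two lower values of the result are selected from a and the two upper values from b. The mask must be an immediate. See Macro Function for Shuffle Using Streaming SIMD Extensions for a description of the shuffle semantics.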
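As an illustration of the mask semantics, the following minimal sketch reverses the order of the four values in a vector by passing the same operand as both a and b. The helper name reverse4 is hypothetical and not part of the header; _MM_SHUFFLE, _mm_loadu_ps and _mm_storeu_ps are declared in xmmintrin.h.

```c
#include <xmmintrin.h>

/* Hypothetical helper: stores the four floats of src to dst in reverse order. */
void reverse4(const float src[4], float dst[4])
{
    __m128 v = _mm_loadu_ps(src);
    /* _MM_SHUFFLE(z, y, x, w) packs four 2-bit indices into the mask.
       w and x pick R0 and R1 from the first operand,
       y and z pick R2 and R3 from the second operand. */
    __m128 r = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));
    _mm_storeu_ps(dst, r);
}
```

Because the mask is an immediate, a different ordering needs a different call site rather than a run-time argument.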

 

__m128 _mm_unpackhi_ps(__m128 a, __m128 b)

Selects and interleaves the upper two SP FP values from a and b.

R0 R1 R2 R3
a2 b2 a3 b3
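The element layout in the table above can be checked with a short sketch such as the following. The helper name interleave_high is hypothetical; unaligned loads and stores are used so that plain float arrays can be passed in.

```c
#include <xmmintrin.h>

/* Hypothetical helper: out = { a[2], b[2], a[3], b[3] }. */
void interleave_high(const float a[4], const float b[4], float out[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_unpackhi_ps(va, vb));
}
```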

 

__m128 _mm_unpacklo_ps(__m128 a, __m128 b)

Selects and interleaves the lower two SP FP values from a and b.

R0 R1 R2 R3
a0 b0 a1 b1
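One possible use of the two unpack intrinsics together is turning separate x and y arrays into interleaved x,y pairs. The sketch below assumes four values per array; the helper name interleave_xy is hypothetical.

```c
#include <xmmintrin.h>

/* Hypothetical helper: writes x0,y0,x1,y1,x2,y2,x3,y3 to xy. */
void interleave_xy(const float x[4], const float y[4], float xy[8])
{
    __m128 vx = _mm_loadu_ps(x);
    __m128 vy = _mm_loadu_ps(y);
    _mm_storeu_ps(xy,     _mm_unpacklo_ps(vx, vy)); /* x0 y0 x1 y1 */
    _mm_storeu_ps(xy + 4, _mm_unpackhi_ps(vx, vy)); /* x2 y2 x3 y3 */
}
```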

 

__m128 _mm_move_ss(__m128 a, __m128 b)

Sets the low word of the result to the lowest SP FP value of b. The upper three SP FP values are passed through from a.

R0 R1 R2 R3
b0 a1 a2 a3
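A minimal sketch of replacing only the lowest element of a vector follows. The helper name replace_low is hypothetical; _mm_set_ss, also from xmmintrin.h, builds a vector whose lowest element is the given scalar and whose upper three elements are zero.

```c
#include <xmmintrin.h>

/* Hypothetical helper: out = { low, a[1], a[2], a[3] }. */
void replace_low(const float a[4], float low, float out[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_set_ss(low);              /* { low, 0, 0, 0 } */
    _mm_storeu_ps(out, _mm_move_ss(va, vb));  /* only the low element of vb is used */
}
```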

 

__m128 _mm_movehl_ps(__m128 a, __m128 b)

Moves the upper 2 SP FP values of b to the lower 2 SP FP values of the result. The upper 2 SP FP values of a are passed through to the result.

R0 R1 R2 R3
b2 b3 a2 a3
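One use of this intrinsic is folding the upper half of a vector onto the lower half, for example when summing all four elements. The following is a sketch of one possible reduction, not the only one; the helper name horizontal_sum is hypothetical.

```c
#include <xmmintrin.h>

/* Hypothetical helper: returns v[0] + v[1] + v[2] + v[3]. */
float horizontal_sum(const float v[4])
{
    __m128 x  = _mm_loadu_ps(v);        /* v0    v1    v2 v3 */
    __m128 hi = _mm_movehl_ps(x, x);    /* v2    v3    v2 v3 */
    __m128 s  = _mm_add_ps(x, hi);      /* v0+v2 v1+v3 .  .  */
    /* Broadcast element 1 of s so it can be added to element 0. */
    __m128 s1 = _mm_shuffle_ps(s, s, _MM_SHUFFLE(1, 1, 1, 1));
    return _mm_cvtss_f32(_mm_add_ss(s, s1)); /* (v0+v2) + (v1+v3) */
}
```

Note that the additions are grouped as (v0+v2) + (v1+v3), so the rounding can differ slightly from a left-to-right scalar sum.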

 

__m128 _mm_movelh_ps(__m128 a, __m128 b)

Moves the lower 2 SP FP values of b to the upper 2 SP FP values of the result. The lower 2 SP FP values of a are passed through to the result.

R0 R1 R2 R3
a0 a1 b0 b1
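The layout in the table above can be checked with a short sketch such as the following, which joins the lower halves of two vectors. The helper name join_low_halves is hypothetical.

```c
#include <xmmintrin.h>

/* Hypothetical helper: out = { a[0], a[1], b[0], b[1] }. */
void join_low_halves(const float a[4], const float b[4], float out[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_movelh_ps(va, vb));
}
```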

 

int _mm_movemask_ps(__m128 a)

Creates a 4-bit mask from the most significant bits of the four SP FP values.
 

R
sign(a3)<<3 | sign(a2)<<2 | sign(a1)<<1 | sign(a0)
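The mask is often used to turn the result of a packed comparison into an integer that scalar code can branch on. The sketch below shows both the raw sign mask and that comparison pattern. The helper names sign_mask and lanes_below are hypothetical; _mm_cmplt_ps and _mm_set1_ps are declared in xmmintrin.h.

```c
#include <xmmintrin.h>

/* Hypothetical helper: bit i of the result is the sign bit of v[i]. */
int sign_mask(const float v[4])
{
    return _mm_movemask_ps(_mm_loadu_ps(v));
}

/* Hypothetical helper: bit i of the result is set when v[i] < limit. */
int lanes_below(const float v[4], float limit)
{
    /* The comparison sets every bit of an element that satisfies it,
       so the most significant bit of that element is set as well. */
    __m128 cmp = _mm_cmplt_ps(_mm_loadu_ps(v), _mm_set1_ps(limit));
    return _mm_movemask_ps(cmp);
}
```

Because only the sign bit is examined, a value of -0.0f also sets its bit in sign_mask.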