Table 1 Integer half-word/quad-byte SIMD video instructions

From: CUDAMPF: a multi-tiered parallel framework for accelerating protein sequence search in HMMER on CUDA-enabled GPU

Intrinsic PTX assembly         | Semantics                    | Operands and optional operations
vadd2, vsub2, vadd4, vsub4     | Addition/Subtraction         | .u32, .s32, .sat, .add
vmax2, vmin2, vmax4, vmin4     | Maximum/Minimum              | .u32, .s32, .sat, .add
vset2, vset4                   | Comparison                   | .u32, .s32, .cmp, .add
vavrg2, vavrg4                 | Average                      | .u32, .s32, .sat, .add
vabsdiff2, vabsdiff4           | Absolute value of difference | .u32, .s32, .sat, .add
  1. u32 and s32 denote unsigned and signed 32-bit values, respectively; sat clamps each result to the range allowed by its bit width (half-word or byte); add accumulates the per-lane results into an additional operand; cmp is one of six comparison operators: eq, ne, lt, le, gt, ge
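
To make the per-lane semantics in the table concrete, the following is a minimal host-side reference model in C++ for a few of the quad-byte variants with unsigned lanes. It is a sketch for illustration only, not the GPU instructions themselves and not code from CUDAMPF: the function names (`lane`, `ref_vadd4_u_sat`, `ref_vmax4_u`, `ref_vabsdiff4_u_add`, `ref_vset4_u_lt`) are hypothetical, and the model assumes the default lane selection in which byte i of one operand is paired with byte i of the other.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: extract byte lane i (0 = least significant) of a packed word.
static inline uint32_t lane(uint32_t w, int i) {
    assert(i >= 0 && i < 4);
    return (w >> (8 * i)) & 0xFFu;
}

// Models vadd4 on unsigned lanes with .sat: each byte sum is clamped to [0, 255].
uint32_t ref_vadd4_u_sat(uint32_t a, uint32_t b) {
    uint32_t d = 0;
    for (int i = 0; i < 4; ++i) {
        uint32_t s = std::min<uint32_t>(lane(a, i) + lane(b, i), 255u);
        d |= s << (8 * i);
    }
    return d;
}

// Models vmax4 on unsigned lanes: per-byte maximum.
uint32_t ref_vmax4_u(uint32_t a, uint32_t b) {
    uint32_t d = 0;
    for (int i = 0; i < 4; ++i) {
        d |= std::max(lane(a, i), lane(b, i)) << (8 * i);
    }
    return d;
}

// Models vabsdiff4 on unsigned lanes with .add: the four absolute
// differences are summed and accumulated onto the extra operand c.
uint32_t ref_vabsdiff4_u_add(uint32_t a, uint32_t b, uint32_t c) {
    uint32_t d = c;
    for (int i = 0; i < 4; ++i) {
        uint32_t x = lane(a, i), y = lane(b, i);
        d += (x > y) ? (x - y) : (y - x);
    }
    return d;
}

// Models vset4 on unsigned lanes with cmp = lt: a lane becomes 1 if a_i < b_i, else 0.
uint32_t ref_vset4_u_lt(uint32_t a, uint32_t b) {
    uint32_t d = 0;
    for (int i = 0; i < 4; ++i) {
        d |= (lane(a, i) < lane(b, i) ? 1u : 0u) << (8 * i);
    }
    return d;
}
```

For instance, `ref_vadd4_u_sat(0xF0102030, 0x20F01005)` gives `0xFFFF3035`: the two lower bytes add normally, while the two upper byte sums exceed 255 and saturate to `0xFF`. On the device, a single SIMD video instruction performs all four lane operations at once, which is the reason such instructions are attractive for packed 8-bit or 16-bit scores.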