From: DNA sequences alignment in multi-GPUs: acceleration and energy payoff
Category | Computational task performed on the GPU | Energy consumption (picojoules)
---|---|---
Computation | Add operator using integer operands (ALU) | 0.4
Computation | Mul operator using fp64 operands (FPU) | 25
Computation | Fused multiply-add on fp64 operands (FPU) | 40
Data movement | Transition (per bit, per millimeter traversed) | 0.2
Data movement | On-chip fp64 communication over [1, 10, 20] mm | [3, 64, 250]
Data movement | Efficient off-chip link | 500
Memory access | Local access to a register file | 2
Memory access | 256-bit access to on-chip 8 KB SRAM cache | 50
Memory access | DRAM read/write (for an entire cache line) | 16000
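
To make the gap between these costs concrete, the following minimal sketch adds up the energy of a small workload from the figures in the table. The dictionary keys, the helper names `energy_pj` and `ratio`, and the operation counts in `workload` are all hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
# Energy per operation in picojoules, copied from the table above.
# The key names are illustrative labels, not identifiers from the paper.
ENERGY_PJ = {
    "int_add": 0.4,
    "fp64_mul": 25,
    "fp64_fma": 40,
    "offchip_link": 500,
    "register_access": 2,
    "sram_256bit_access": 50,
    "dram_cache_line": 16000,
}


def energy_pj(op_counts):
    """Return the total energy in pJ for a mapping of operation name -> count."""
    return sum(ENERGY_PJ[op] * count for op, count in op_counts.items())


def ratio(op_a, op_b):
    """Return how many times more energy op_a costs than op_b."""
    return ENERGY_PJ[op_a] / ENERGY_PJ[op_b]


# Hypothetical workload: the counts are invented for illustration only.
workload = {"fp64_fma": 1000, "register_access": 3000, "dram_cache_line": 10}

total = energy_pj(workload)
dram_part = energy_pj({"dram_cache_line": workload["dram_cache_line"]})

print(f"total energy: {total} pJ")
print(f"DRAM share of total: {dram_part / total:.0%}")
print(f"DRAM cache line vs. fp64 FMA: {ratio('dram_cache_line', 'fp64_fma'):.0f}x")
```

With these made-up counts, ten DRAM cache-line accesses cost more energy than a thousand fused multiply-adds together with their register accesses, which is the kind of imbalance the table points to.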