| Time | Implementation | Compiler | Benchmark date | SUPERCOP version |
| --- | --- | --- | --- | --- |
| 358470 | optimized_nonSSE | gcc -mcpu=marvell-pj4 -O3 | 20161216 | 20161026 |
| 370506 | optimized_nonSSE | gcc -funroll-loops -mcpu=marvell-pj4 -O3 | 20161216 | 20161026 |
| 387638 | optimized_nonSSE | gcc -funroll-loops -mcpu=marvell-pj4 -O2 | 20161216 | 20161026 |
| 483080 | ref | gcc -mcpu=marvell-pj4 -O3 | 20161216 | 20161026 |
| 493571 | ref | gcc -funroll-loops -mcpu=marvell-pj4 -O3 | 20161216 | 20161026 |
| 514752 | optimized_nonSSE | gcc -mcpu=marvell-pj4 -O2 | 20161216 | 20161026 |
| 533062 | ref | gcc -funroll-loops -mcpu=marvell-pj4 -O2 | 20161216 | 20161026 |
| 579207 | optimized_nonSSE | gcc -mcpu=marvell-pj4 -Os | 20161216 | 20161026 |
| 587043 | optimized_nonSSE | gcc -funroll-loops -mcpu=marvell-pj4 -Os | 20161216 | 20161026 |
| 668903 | ref | gcc -mcpu=marvell-pj4 -O2 | 20161216 | 20161026 |
| 808580 | ref | gcc -mcpu=marvell-pj4 -Os | 20161216 | 20161026 |
| 852393 | ref | gcc -funroll-loops -mcpu=marvell-pj4 -Os | 20161216 | 20161026 |