ppc-gpu and ppc-cpu are ppc for GPU v47
ppc-asm and ppc-ori are ppc in Assembly v7

used 0.1651 x5 DOM radius with AHA
options ASENS, TILT, MKOW, ANGW, LONG were disabled in the ppc for GPU

run on GeForce GTX 295 1.296 GHz (device 5) and i7-920 2.67 GHz

f2k muon:

ppc-gpu: 54414
Device time: 16089.9 (in-kernel: 15932.7...16078.8) [ms]
real    0m16.731s
user    0m2.480s
sys     0m0.748s

ppc-cpu: 54679
real    43m44.305s
user    43m44.044s
sys     0m0.388s

ppc-asm: 54201
real    32m1.581s
user    32m1.652s
sys     0m0.056s

ppc-ori: 54572
real    118m14.072s
user    118m13.095s
sys     0m0.960s

flasher:

ppc-gpu: 1260983
Device time: 39761.7 (in-kernel: 39511.9...39728.3) [ms]
real    0m40.012s
user    0m2.872s
sys     0m0.216s

ppc-cpu: 1260828
real    98m16.291s
user    98m15.488s
sys     0m1.136s

ppc-asm: 1244803
real    78m45.683s
user    78m45.183s
sys     0m0.924s

ppc-ori: 1258907
real    204m45.798s
user    204m40.235s
sys     0m6.288s


on Tesla C1060 1.296 GHz:
Running on 30 MPs x 448 threads
Kernel uses: l=0 r=35 s=3992 c=62400

f2k muon: 54549
Device time: 21925.3 (in-kernel: 21275.7...21913.2) [ms]
real    0m23.414s
user    0m2.656s
sys     0m1.106s

flasher: 1259636
Device time: 55561.8 (in-kernel: 55178.2...55526.2) [ms]
real    0m56.615s
user    0m2.058s
sys     0m0.897s


on Tesla C2050 1.147 GHz:
Running on 14 MPs x 768 threads
Kernel uses: l=0 r=41 s=3960 c=62400

f2k muon: 54251
Device time: 14199.1 (in-kernel: 14116.4...14200.5) [ms]
real    0m17.514s
user    0m2.879s
sys     0m2.971s

flasher: 1260565
Device time: 34892.5 (in-kernel: 34768.1...34895.8) [ms]
real    0m37.505s
user    0m2.037s
sys     0m2.440s


on GeForce GTX 480 1.401 GHz:
Running on 15 MPs x 768 threads
Kernel uses: l=0 r=40 s=3960 c=62400

f2k muon: 54067
Device time: 10825.0 (in-kernel: 10767.6...10828.2) [ms]
real    0m12.884s
user    0m2.058s
sys     0m1.694s

flasher: 1260004
Device time: 26533.1 (in-kernel: 26438.7...26539.8) [ms]
real    0m28.030s
user    0m2.072s
sys     0m1.427s


script:

time cat mmc.1.f2k | ./gpu/ppc-gpu 5 | grep HIT | wc
time cat mmc.1.f2k | ./gpu/ppc-cpu 5 | grep HIT | wc
time cat mmc.1.f2k | ./appc 0 | grep HIT | wc
time cat mmc.1.f2k | ./ppc 0 | grep HIT | wc

time WFLA=405 ./gpu/ppc-gpu 63 20 1.e9 5 | wc
time WFLA=405 ./gpu/ppc-cpu 63 20 1.e9 5 | wc
time ./appc 63 20 1.e9 0 | grep -v 'HIT 63 20 ' | wc
time ./ppc 63 20 1.e9 0 | wc

awk 'BEGIN {
  mc=43*60+44.305; ma=32*60+1.581; mo=118*60+14.072;
  fc=98*60+16.291; fa=78*60+45.683; fo=204*60+45.798;
  mt=23.414; mg=16.731; mf=17.514; mx=12.884;
  ft=56.615; fg=40.012; ff=37.505; fx=28.030;
  printf "Original:\t1/%3.2f\t1/%3.2f\n", fo/fc, mo/mc
  printf "CPU c++:\t%3.2f\t%3.2f\n", fc/fc, mc/mc
  printf "Assembly:\t%3.2f\t%3.2f\n", fc/fa, mc/ma
  printf "GTX 295:\t%3.0f\t%3.0f\n", fc/fg, mc/mg
  printf "GTX/Ori:\t%3.0f\t%3.0f\n", fo/fg, mo/mg
  printf "C1060: \t%3.0f\t%3.0f\n", fc/ft, mc/mt
  printf "C2050: \t%3.0f\t%3.0f\n", fc/ff, mc/mf
  printf "GTX 480:\t%3.0f\t%3.0f\n", fc/fx, mc/mx
}'

Original:   1/2.08  1/2.70
CPU c++:    1.00    1.00
Assembly:   1.25    1.37
GTX 295:    147     157
GTX/Ori:    307     424
C1060:      104     112
C2050:      157     150
GTX 480:    210     204
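
The speedup figures above are ratios of "real" wall-clock times, with ppc-cpu as the baseline; going by the variable names in the awk script, the first column is the flasher run and the second is the f2k muon run. The awk script converts each time to seconds by hand. As a cross-check, the same arithmetic can be sketched in Python by parsing the `time` strings directly. This is only an illustration: the helper names `to_seconds` and `speedups` and the `REAL` table layout are hypothetical and not part of ppc. The time values are copied from the runs listed above.

```python
import re


def to_seconds(t: str) -> float:
    """Convert a shell `time` value such as '43m44.305s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", t)
    if m is None:
        raise ValueError(f"unrecognized time format: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


# "real" times copied from the runs above, as (flasher, f2k muon).
# The dictionary layout itself is an assumption made for this sketch.
REAL = {
    "ppc-ori": ("204m45.798s", "118m14.072s"),
    "ppc-cpu": ("98m16.291s", "43m44.305s"),
    "ppc-asm": ("78m45.683s", "32m1.581s"),
    "GTX 295": ("0m40.012s", "0m16.731s"),
    "C1060": ("0m56.615s", "0m23.414s"),
    "C2050": ("0m37.505s", "0m17.514s"),
    "GTX 480": ("0m28.030s", "0m12.884s"),
}


def speedups(baseline: str = "ppc-cpu") -> dict:
    """Return {name: (flasher_speedup, muon_speedup)} relative to baseline."""
    base = [to_seconds(t) for t in REAL[baseline]]
    return {
        name: tuple(b / to_seconds(t) for b, t in zip(base, times))
        for name, times in REAL.items()
    }


if __name__ == "__main__":
    for name, (flasher, muon) in speedups().items():
        print(f"{name:8s}\t{flasher:7.2f}\t{muon:7.2f}")
```

Calling `speedups("ppc-ori")` instead gives ratios relative to the original Assembly-era ppc, which is how the GTX/Ori row of the table is defined.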