
OpenCL Benchmark - CPU vs GPU / DO NOT USE ANYMORE !


mitch_de

Thanks! I hope everyone gets smileys :) as the validation result!

PS: I can't test the benchmark on dual-GPU machines myself; all cards should be benched. I hope people with 2 GPUs (like the MacBook Pro) don't run into an error.

Link to comment
Share on other sites

CL_DEVICE_NAME: Intel® Core2 Quad CPU Q6600 @ 2.40GHz

CL_DEVICE_VENDOR: Intel

Now computing - please be patient....

time used: 15.900080

Number of elements computed: 2097152

CL_DEVICE_NAME: GeForce 8800 GT

CL_DEVICE_VENDOR: NVIDIA

Now computing - please be patient....

time used: 2.618529

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

:thumbsup_anim: Validate results test passed - GPU=CPU :P

Link to comment
Share on other sites

Seems something isn't working here:

 

...........................................................

...................OpenCL Bench V 0.1 by mitch.............

.......C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec.......

....... .......

........My test code (simple adds) is cpu friedly..........

.more gpu friedly+complexer code (raytracing/video encod.).

....may give much more speed advantage - at least on C2Ds..

...........................................................

CL_DEVICE_NAME: Intel® Core2 Duo CPU P8700 @ 2.53GHz

CL_DEVICE_VENDOR: Intel

Now computing - please be patient....

time used: 37.822647

Number of elements computed: 2097152

CL_DEVICE_NAME: GeForce 9400M

CL_DEVICE_VENDOR: NVIDIA

Now computing - please be patient....

time used: 12.428713

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

:thumbsup_anim: Validate results test - results compute on gpu <> compute cpu

 

Sherry Haibara

 

EDIT: Second run:

...........................................................

...................OpenCL Bench V 0.1 by mitch.............

.......C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec.......

....... .......

........My test code (simple adds) is cpu friedly..........

.more gpu friedly+complexer code (raytracing/video encod.).

....may give much more speed advantage - at least on C2Ds..

...........................................................

CL_DEVICE_NAME: Intel® Core2 Duo CPU P8700 @ 2.53GHz

CL_DEVICE_VENDOR: Intel

Now computing - please be patient....

time used: 37.613495

Number of elements computed: 2097152

CL_DEVICE_NAME: GeForce 9400M

CL_DEVICE_VENDOR: NVIDIA

Now computing - please be patient....

time used: 15.683911

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

:P Validate results test passed - GPU=CPU :)

 

 

By the way, am I supposed to run this with no applications open?

Link to comment
Share on other sites

...........................................................

...................OpenCL Bench V 0.1 by mitch.............

.......C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec.......

....... .......

........My test code (simple adds) is cpu friedly..........

.more gpu friedly+complexer code (raytracing/video encod.).

....may give much more speed advantage - at least on C2Ds..

...........................................................

CL_DEVICE_NAME: Pentium® Dual-Core CPU E5200 @ 2.50GHz (overclocked to 3.11 GHz)

CL_DEVICE_VENDOR: Intel

Now computing - please be patient....

time used: 28.961924

Number of elements computed: 2097152

CL_DEVICE_NAME: GeForce 8800 GT

CL_DEVICE_VENDOR: NVIDIA

Now computing - please be patient....

time used: 2.580805

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

:P Validate results test passed - GPU=CPU :)

logout

Link to comment
Share on other sites

CL_DEVICE_NAME: Intel® Core™2 Duo CPU E8500 @ 3.16GHz

CL_DEVICE_VENDOR: Intel

Now computing - please be patient....

time used: 28.509935

Number of elements computed: 2097152

CL_DEVICE_NAME: GeForce 8800 GT

CL_DEVICE_VENDOR: NVIDIA

Now computing - please be patient....

time used: 2.507916

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

:) Validate results test passed - GPU=CPU ;)

 

I suppose the benchmark stats are the values in red, but what does the validation result GPU=CPU really mean? mitch, can you explain?

Link to comment
Share on other sites

CL_DEVICE_NAME: Intel® Core2 Quad CPU @ 2.40GHz

CL_DEVICE_VENDOR: Intel

Now computing - please be patient....

time used: 15.142966

Number of elements computed: 2097152

CL_DEVICE_NAME: GeForce 8800 GTX

CL_DEVICE_VENDOR: NVIDIA

Now computing - please be patient....

time used: 1.761477

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

:rolleyes: Validate results test passed - GPU=CPU :wacko:

Link to comment
Share on other sites

I suppose the benchmark stats are the values in red, but what does the validation result GPU=CPU really mean? mitch, can you explain?

 

 

I guess that he does some benchmark computations in the gpu and in the cpu and then compares whether they gave the same result (as a number). It seems that in some cases, either because of lacking float precision or due to some flipped bit or whatnot, the results differ.
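To illustrate the float precision guess above, here is a minimal sketch of how the same additions can give different 32-bit float results when they are carried out in a different order. It is not taken from mitch's benchmark (the source is not shown in this thread); the helper names `to_float32` and `sum_float32` are made up for the example, and 32-bit rounding is emulated with Python's `struct` module.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_float32(values):
    """Add values left to right, rounding to float32 after every step."""
    total = 0.0
    for v in values:
        total = to_float32(total + to_float32(v))
    return total


# Above 2**24, float32 can no longer represent every integer.
BIG = float(2 ** 24)

in_order = sum_float32([1.0, BIG, -BIG])    # the 1.0 is lost when added to BIG
reordered = sum_float32([BIG, -BIG, 1.0])   # the 1.0 survives

print(in_order, reordered)
```

A device that adds the same numbers in a different order than the checking code could, in principle, end up with slightly different floats for this reason. Whether that is what happened in the failed run above is only a guess.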

 

Also, another question for mitch: does this implementation of OpenCL use the CPU alongside the GPU? I thought I read somewhere that OpenCL is a rather generic abstraction platform where CPU cores are treated as just another computational unit. (That would mean that the GPU scores are a bit too fast to be real.)

 

PS. Thanks for making the tool!!

Link to comment
Share on other sites

Number of OpenCL devices found: 2

OpenCL Device # 0 = GeForce 9600 GT

Device 0 is an: GPU with max. 1625 MHz and 64 units/cores

Now computing - please be patient....

time used: 0.753 seconds

 

OpenCL Device # 1 = Intel® Core i7 CPU 920 @ 2.67GHz

Device 1 is an: CPU with max. 3800 MHz and 8 units/cores

Now computing - please be patient....

time used: 3.137 seconds

 

EDIT: updated to v025

Link to comment
Share on other sites

Updated to V015. I hope this fixes the output for machines with more than one GPU.

Same speed (of course, a variation of 2-5% between runs is normal).

 

To question 1:

The GPU=CPU validation means:

the results computed by the GPU are compared with what the result should be.

For example, 1+1 should be 2, not 2.1 or 3 :)
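The check described above can be sketched roughly like this. It is only an illustration under assumptions, not the benchmark's actual code: `reference_add` and `validate` are hypothetical names, and the "device" results are plain lists typed in by hand instead of values read back from OpenCL.

```python
def reference_add(a, b):
    """Plain CPU loop that produces the expected result for element-wise adds."""
    return [x + y for x, y in zip(a, b)]


def validate(device_result, expected, tolerance=0.0):
    """Return the indices where the device result differs from the expected value."""
    if len(device_result) != len(expected):
        raise ValueError("result and reference have different lengths")
    return [
        i
        for i, (got, want) in enumerate(zip(device_result, expected))
        if abs(got - want) > tolerance
    ]


a = [1, 2, 3, 4]
b = [1, 1, 1, 1]
expected = reference_add(a, b)      # what "normal" CPU code says the answer is

good = [2, 3, 4, 5]                 # stand-in for a correct device result
bad = [2, 3, 5, 5]                  # stand-in for a result with one wrong element

print(validate(good, expected))     # no mismatches
print(validate(bad, expected))      # index of the wrong element
```

An empty list means the test passed (GPU=CPU); any index in the list points at an element where the device disagreed with the reference. The `tolerance` parameter is an addition for float data and defaults to an exact comparison.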

 

To question 2:

Both benchmarks are run through OpenCL, on the CPU and on the GPU.

I only validate the results with "normal" CPU code.

It seems that OpenCL (which runs on the CPU if there is no GPU) does a good job!

The i7 920 runs really fast!

Maybe a real 2009 Mac Pro with 2 Xeon "i7" CPUs will be faster on the CPU than on the GPU, at least with a GT 120 (the default GPU).

 

I hope we see some ATI cards here ;)

And of course some GeForce GTX 285s! :)

Link to comment
Share on other sites

For example, 1+1 should be 2, not 2.1 or 3 ;)

:hysterical:

 

Thank you for that clarification, I always thought that 1+1 was equal to 4 :D

 

edit:

 

The latest version works too:

 

CL_DEVICE_NAME: Intel(R) Core(TM)2 Duo CPU	 E8500  @ 3.16GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 3166 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 2
Now computing - please be patient....
time used: 28.503862
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 8800 GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1650 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 112
Now computing - please be patient....
time used: 2.525435
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)

Link to comment
Share on other sites

....CL_DEVICE_NAME: Intel(R) Core(TM)2 CPU		  6600  @ 2.40GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 3096 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 2
Now computing - please be patient....
time used: 29.940746
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 9800 GTX/9800 GTX+ .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1836 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 128
Now computing - please be patient....
time used: 2.056581
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)

Link to comment
Share on other sites

Hi mitch. Nice tool :P

...........................................................
.................. OpenCL Bench V 0.15 by mitch ...........
...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......
.......                                             .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
... may give much more speed advantage - at least on C2Ds .
...........................................................

....CL_DEVICE_NAME: Intel® Core(tm)2 Duo CPU     E7300  @ 2.66GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 2666 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 2
Now computing - please be patient....
time used: 39.562576
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 8800 GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1650 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 112
Now computing - please be patient....
time used: 2.386418
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
 Validate results test passed - GPU=CPU 

Link to comment
Share on other sites

...........................................................

.................. OpenCL Bench V 0.15 by mitch ...........

...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......

....... .......

........My test code (simple adds) is cpu friedly..........

.more gpu friedly+complexer code (raytracing/video encod.).

... may give much more speed advantage - at least on C2Ds .

...........................................................

 

....CL_DEVICE_NAME: Intel® Core2 CPU 6600 @ 2.40GHz .....

CL_DEVICE_VENDOR: Intel

CL_DEVICE_MAX_CLOCK_FREQUENCY: 2400 MHz

CL_DEVICE_MAX_COMPUTE_UNITS: 2

Now computing - please be patient....

time used: 38.881557

Number of elements computed: 2097152

 

....CL_DEVICE_NAME: GeForce 9800 GT .....

CL_DEVICE_VENDOR: NVIDIA

CL_DEVICE_MAX_CLOCK_FREQUENCY: 1715 MHz

CL_DEVICE_MAX_COMPUTE_UNITS: 112

Now computing - please be patient....

time used: 2.566827

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

:) Validate results test passed - GPU=CPU :)

Link to comment
Share on other sites

Weird, V015 doesn't work here, this is the only output I get:

 

dyld: unknown required load command 0x80000022

Trace/BPT trap

 

10.5.8 vanilla, Core 2 Duo E8500, 9800GTX+ with latest drivers from Nvidia, NVEnabler.kext.

 

/Edit

 

Doh!

 

Failed the Snow Leopard test!!

Link to comment
Share on other sites

Here is my "updated" score from SL.

 

...........................................................

.................. OpenCL Bench V 0.15 by mitch ...........

...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......

....... .......

........My test code (simple adds) is cpu friedly..........

.more gpu friedly+complexer code (raytracing/video encod.).

... may give much more speed advantage - at least on C2Ds .

...........................................................

 

....CL_DEVICE_NAME: Intel® Core i7 CPU 920 @ 2.67GHz .....

CL_DEVICE_VENDOR: Intel

CL_DEVICE_MAX_CLOCK_FREQUENCY: 4280 MHz

CL_DEVICE_MAX_COMPUTE_UNITS: 8

Now computing - please be patient....

time used: 3.834852

Number of elements computed: 2097152

 

....CL_DEVICE_NAME: GeForce GTX 285 .....

CL_DEVICE_VENDOR: NVIDIA

CL_DEVICE_MAX_CLOCK_FREQUENCY: 1584 MHz

CL_DEVICE_MAX_COMPUTE_UNITS: 240

Now computing - please be patient....

time used: 0.861248

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

;) Validate results test passed - GPU=CPU :D

 

This program seems to multi-thread very well according to SL's CPU Usage monitor.

Link to comment
Share on other sites

Weird, V015 doesn't work here, this is the only output I get:

 

dyld: unknown required load command 0x80000022

Trace/BPT trap

 

10.5.8 vanilla, Core 2 Duo E8500, 9800GTX+ with latest drivers from Nvidia, NVEnabler.kext.

 

This tool is for 10.6 only.

Link to comment
Share on other sites

This program seems to multi-thread very well according to SL's CPU Usage monitor.

Thanks for that detail!

I think the changes "deep inside" 10.6 will make much better use of multiple cores than 10.5, even without special source code changes. But I think the source needs to be recompiled with the newest Xcode against the 10.6 dev framework.

 

Also, even if the app itself is really small (

So the system bus speed and RAM speed may also be counted (in the CPU time!).

So DDR3 triple channel vs DDR3 dual channel (2 modules of the same size) vs DDR2, RAM latency timings, RAM MHz ... will all give different CPU times. GPU time should not be affected as much by that (RAM / system bus speed).

Link to comment
Share on other sites

...................OpenCL Bench V 0.1 by mitch.............

.......C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec.......

....... .......

........My test code (simple adds) is cpu friedly..........

.more gpu friedly+complexer code (raytracing/video encod.).

....may give much more speed advantage - at least on C2Ds..

...........................................................

CL_DEVICE_NAME: Intel® Core2 Duo CPU P7350 @ 2.00GHz

CL_DEVICE_VENDOR: Intel

Now computing - please be patient....

time used: 110.848793

Number of elements computed: 2097152

CL_DEVICE_NAME: GeForce 9600M GT

CL_DEVICE_VENDOR: NVIDIA

Now computing - please be patient....

time used: 19.561712

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

:thumbsup_anim: Validate results test passed - GPU=CPU :P

Link to comment
Share on other sites

Thanks.

Would you please run it again with V015? It also shows the GPU MHz and GPU units (cores).

(I have removed the old V010 download link. There are no changes to the timed code, only new output formatting plus the GPU MHz and units.)

I would also recommend running the tool twice and checking whether there are big differences. If there are, run it a third time and take the average of the times. Close all other apps before running it, especially if you have 2 GB of RAM or less.
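As a rough sketch of that advice, the snippet below averages repeated run times and flags a pair of runs as unstable when they differ by more than 5%, taking the "2-5% is normal" remark from earlier in the thread as the threshold. `summarize_runs` and `max_spread` are made-up names; the times are the two 9400M / P8700 runs posted above.

```python
from statistics import mean


def summarize_runs(times, max_spread=0.05):
    """Return (average, spread, stable) for a list of run times in seconds.

    spread is (slowest - fastest) / fastest; stable is True when the spread
    stays within max_spread (5% by default, an assumed threshold).
    """
    fastest, slowest = min(times), max(times)
    spread = (slowest - fastest) / fastest
    return mean(times), spread, spread <= max_spread


cpu_runs = [37.822647, 37.613495]   # Core2 Duo P8700, first and second run
gpu_runs = [12.428713, 15.683911]   # GeForce 9400M, first and second run

for name, runs in (("CPU", cpu_runs), ("GPU", gpu_runs)):
    avg, spread, stable = summarize_runs(runs)
    verdict = "ok" if stable else "run it again"
    print(f"{name}: average {avg:.3f} s, spread {spread:.1%}, {verdict}")
```

With these numbers the CPU runs are well within 5% of each other, while the two GPU runs differ by much more, which is the case where a third run is worth doing.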

 

For mobile users:

Check whether the times differ when you switch between power supply and battery, and also when you change the power settings between speed and battery saving (Energy preferences). At least the original MacBook / MacBook Pro will throttle the CPU/GPU in different situations (power supply = less speed, I think; the energy-saving settings may also change GPU/CPU throttling).

 

For desktop users:

If you use VoodooPower (SpeedStep), please mention that in your post. Geekbench and XBench results are also a bit lower and vary more between runs when using VoodooPower (SpeedStep).

Link to comment
Share on other sites

OK, here you are:

 

.................. OpenCL Bench V 0.15 by mitch ...........

...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......

....... .......

........My test code (simple adds) is cpu friedly..........

.more gpu friedly+complexer code (raytracing/video encod.).

... may give much more speed advantage - at least on C2Ds .

...........................................................

 

....CL_DEVICE_NAME: Intel® Core2 Duo CPU P7350 @ 2.00GHz .....

CL_DEVICE_VENDOR: Intel

CL_DEVICE_MAX_CLOCK_FREQUENCY: 2000 MHz

CL_DEVICE_MAX_COMPUTE_UNITS: 2

Now computing - please be patient....

time used: 116.803825

Number of elements computed: 2097152

 

....CL_DEVICE_NAME: GeForce 9600M GT .....

CL_DEVICE_VENDOR: NVIDIA

CL_DEVICE_MAX_CLOCK_FREQUENCY: 1250 MHz

CL_DEVICE_MAX_COMPUTE_UNITS: 32

Now computing - please be patient....

time used: 19.378469

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

:) Validate results test passed - GPU=CPU -_-

Link to comment
Share on other sites

....CL_DEVICE_NAME: Intel® Core2 Quad CPU Q6600 @ 2.40GHz .....

CL_DEVICE_VENDOR: Intel

CL_DEVICE_MAX_CLOCK_FREQUENCY: 3600 MHz

CL_DEVICE_MAX_COMPUTE_UNITS: 4

Now computing - please be patient....

time used: 15.900147

Number of elements computed: 2097152

 

....CL_DEVICE_NAME: GeForce 8800 GTS .....

CL_DEVICE_VENDOR: NVIDIA

CL_DEVICE_MAX_CLOCK_FREQUENCY: 1300 MHz

CL_DEVICE_MAX_COMPUTE_UNITS: 96

Now computing - please be patient....

time used: 2.111204

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

:( Validate results test passed - GPU=CPU :D

 

 

Yes that is a highly overclocked GTS - the fan is on 85% minimum :-)

Link to comment
Share on other sites

....CL_DEVICE_NAME: Intel® Core2 Duo CPU E8400 @ 3.00GHz .....

CL_DEVICE_VENDOR: Intel

CL_DEVICE_MAX_CLOCK_FREQUENCY: 3000 MHz

CL_DEVICE_MAX_COMPUTE_UNITS: 2

Now computing - please be patient....

time used: 36.671032

Number of elements computed: 2097152

 

....CL_DEVICE_NAME: GeForce GTX 260 .....

CL_DEVICE_VENDOR: NVIDIA

CL_DEVICE_MAX_CLOCK_FREQUENCY: 1242 MHz

CL_DEVICE_MAX_COMPUTE_UNITS: 216

Now computing - please be patient....

time used: 1.314976

Number of elements computed: 2097152

Now checking if results are valid - please be patient....

:) Validate results test passed - GPU=CPU :)

Link to comment
Share on other sites
