
http://unigine.com/products/valley/

 

I have not tested it yet, but it looks interesting for sure, like the new Heaven (4.0) benchmark.

 

 

To ALL: Please submit at LEAST one PRESET BASIC result! (AMD GPUs: if you have problems, use preset BASIC without AA.)

Otherwise your results can't be compared to others - and if someone wants to compare, they have to run the same settings. That takes time, and not all users (limited by their TFT) can use 1600+ resolutions!

You are also free to submit other, custom-setup results, as long as you have posted the BASIC preset.

AMD users (hackintosh users with non-official AMD 6xxx/7xxx) may disable 2x AA to avoid artifacts, as in Heaven 4.0.

 

Results:

AMD Radeon HD 7870 (no AA!): 81 fps, windowed

GTX 580, BASIC (no AA!): 80.3 fps, fullscreen

AMD 6870 12D68, BASIC (no AA!): 78.7 fps; 1920x1080 (no AA): 48.7 fps

AMD Sapphire Radeon HD 7970 (no AA!): 77.8 fps, fullscreen (windowed mode gave slightly lower fps)

AMD Radeon 6870, BASIC (no AA!): 73.1 fps

AMD Radeon 6850, BASIC (no AA!): 62.4 fps

Gigabyte GTX 650 Ti, BASIC: 55 fps

EVGA GTX 660 2GB, BASIC: 43.2 fps

Asus GTX 275, BASIC: 28.5 fps

Nvidia 9600 GT, BASIC: 17 fps

Nvidia GT 430, BASIC: 15.2 fps

 

Bildschirmfoto 2013-02-15 um 23.07.19.jpg

 

EDIT: Some GPUs tested under Windows (test by a German hardware magazine) - preset EXTREME HD

 

Bildschirmfoto 2013-02-19 um 01.09.47.jpg

EVGA GTX 660 2GB, 304.00.05f02 driver on 10.8.2

 

Basic Preset - 10.8.2

Valley Basic GTX 660.png

 

Ultra, 1280x1024 full screen, 2xAA

Valley Ultra GTX 660.png

 

This doesn't make much sense; the Ultra result should be slower...?!

 

I'm probably CPU bottlenecked.

 

Basic Preset OpenGL - Windows 7

Valley_Windows_Basic_GTX_660.png

Max FPS seems locked to my refresh rate even though vsync is disabled.

 

Ultra, 1280x1024 full screen, 2xAA, OpenGL - Windows 7

Valley_Windows_Ultra_GTX_660.png

 

lol, OS X scores higher than Windows with the same settings.

GPU temps read the same on both OSes - the temperature slowly climbs to 82 degrees Celsius and stays there.

I don't think your CPU is the bottleneck. If CPU load is

 

You tested the second run (Quality Ultra) in fullscreen (same res). Try windowed. Sometimes windowed vs. fullscreen gives different fps even at the same res!

Also, you have plenty of VRAM. The quality setting normally affects polygon count, textures and shaders.

If it mostly/only changes texture resolution, then only the VRAM usage goes up a lot - and if you have enough VRAM, there is no fps difference, unlike when it uses more polygons or heavier shaders.

OK, just another ATI 6870 (XFX Dual Fan), but running only in an x8 PCIe 2.0 slot. Basic, with AA disabled. Cheers!

post-734613-0-76443100-1361249800_thumb.jpg

 

OK, additional values from the oclBandwidthTest tool (thanks, mitch!)

 

 

Running on...

ATI Radeon Barts XT Prototype

Quick Mode

 

Host to Device Bandwidth, 1 Device(s), Paged memory, direct access

Transfer Size (Bytes) Bandwidth(MB/s)

33554432 2618.5

 

Device to Host Bandwidth, 1 Device(s), Paged memory, direct access

Transfer Size (Bytes) Bandwidth(MB/s)

33554432 2940.6

 

Device to Device Bandwidth, 1 Device(s)

Transfer Size (Bytes) Bandwidth(MB/s)

33554432 105993.3

 

[oclBandwidthTest] test results...

PASSED

Valley.tiff
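
For anyone wondering what those numbers mean: the tool just times copies between normal (paged) host memory and a GPU buffer and divides bytes by seconds. Below is a minimal host-to-device sketch in C in the same spirit - it is NOT the real oclBandwidthTest source, just an illustration with error checking left out. The 32 MiB transfer size matches the output above; the build line assumes the OS X OpenCL framework.

/* bandwidth.c - minimal host-to-device bandwidth sketch (illustration only, not oclBandwidthTest)
   Build on OS X: clang bandwidth.c -framework OpenCL -o bandwidth */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <OpenCL/opencl.h>          /* <CL/cl.h> on Linux/Windows */

#define SIZE (32 * 1024 * 1024)     /* 32 MiB, same transfer size as the output above */
#define REPS 20

int main(void) {
    cl_platform_id platform;  cl_device_id device;  cl_int err;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
    cl_command_queue q = clCreateCommandQueue(ctx, device, 0, &err);

    void *host = calloc(1, SIZE);                        /* pageable ("Paged") host memory */
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE, SIZE, NULL, &err);

    /* one warm-up copy, then time REPS blocking host -> device writes */
    clEnqueueWriteBuffer(q, buf, CL_TRUE, 0, SIZE, host, 0, NULL, NULL);
    struct timeval t0, t1;
    gettimeofday(&t0, NULL);
    for (int i = 0; i < REPS; i++)
        clEnqueueWriteBuffer(q, buf, CL_TRUE, 0, SIZE, host, 0, NULL, NULL);
    clFinish(q);
    gettimeofday(&t1, NULL);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("Host to Device Bandwidth: %.1f MB/s\n", SIZE * (double)REPS / (1024.0 * 1024.0) / secs);

    clReleaseMemObject(buf);  clReleaseCommandQueue(q);  clReleaseContext(ctx);  free(host);
    return 0;
}

Device-to-host is the same loop with clEnqueueReadBuffer, and device-to-device copies between two cl_mem buffers with clEnqueueCopyBuffer (that is the VRAM bandwidth figure).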


I don't think your CPU is the bottleneck. If CPU load is

 

Really? I don't know how to see CPU load while running the benchmark on OS X. I'll try on Windows.

 

I disagree, it must be a CPU bottleneck...otherwise how could a GTX 650 Ti possibly beat my score by ~500 points?!

Small CPU speed differences (same CPU type, but different GHz) should not have an effect on this bench.

For sure, C2D vs. i7 will give a significant (>10%) fps difference - at least with very fast GPUs.

 

The other user with the GTX 650 Ti (which is normally slower than the GTX 660 - only the GTX 660 Ti is faster than the GTX 660) has an i7-2600K CPU - GHz unknown.

 

Perhaps some differences in CPU-to-GPU PCIe bandwidth?

 

You could compare your oclBandwidthTest results (it also tests device to device = VRAM bandwidth).

(found yours :) )

Running on...
GeForce GTX 660
Quick Mode
Host to Device Bandwidth, 1 Device(s), Paged memory, direct access
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 4227.2
Device to Host Bandwidth, 1 Device(s), Paged memory, direct access
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 5542.3
Device to Device Bandwidth, 1 Device(s)
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 81757.6
[oclBandwidthTest] test results...

 

But I think there is no big difference to the GTX 650 Ti. Your device-to-host values (PCIe) and VRAM speed (~81 GB/s) look good.

In practice, the maximum possible (measured) is about 11 GB/s for device-to-host (if the chipset and GPU can handle PCIe 3.0).

But those device-to-host values should have no effect on gaming speed (unlike VRAM speed!) as long as they are at least above 1 GB/s - device-to-host speed is more of a bottleneck for GPU computing.

 

Also, AGPM differences (GPU power management settings under load) may cause differences.

 

Also, you both could compare results with more "pressure" on the GPU (less CPU dependence) by setting 8x AA and a higher quality level.

Then the GPU has much more work while the CPU has the same work = less CPU-related difference in the result.

I don't know if both cards could handle the other fixed presets (with higher res, too).

oclBandwidthTest.zip

To ALL: Please submit at LEAST one PRESET BASIC result! (AMD GPUs: if you have problems, use preset BASIC without AA.)

Otherwise your results can't be compared to others - and if someone wants to compare, they have to run the same settings. That takes time, and not all users (limited by their TFT) can use 1600+ resolutions!

You are also free to submit other, custom-setup results, as long as you have posted the BASIC preset.

  • 2 weeks later...

Basic with AA off. Fullscreen does give about 100 more points than windowed. Also, always at the same spot where the boulders are and it starts to rain, it stutters for half a second, dropping min FPS from 30 (which it shows at first startup) down to 8 FPS. Tried re-downloading, same thing :unsure:

Screen Shot 2013-02-27 at 6.22.24 PM.png

Oh, the second GPU is not a factor here - exact same results either way; it only boosts OpenCL.

Yep, fullscreen vs. windowed mode can sometimes (with otherwise identical settings) give different fps. It depends on the GPU and drivers.

Your system with the AMD 7970 gets some really low minimum fps (around 8 fps) / massive frame drops compared to other AMD (6xxx) and GTX 5xx/6xx GPUs with around 30 fps.

Max fps seems to be more equal on the fastest GPUs here. Even my old 9600 GT (running on a C2D CPU) with 30 max FPS (BASIC) has 9 fps min - so max/min = 3/1. Yours is 120/8 = 15/1. Others here are mostly 3/1 to 10/1.

Perhaps the OS X driver for the new AMD 7xxx isn't perfect yet for this bench, which squeezes out every card with high load.

It would be a min of 31.1, but the 8 FPS happens at the same spot each time; it freezes for half a second right when the first lightning strike of the benchmark happens, no matter what settings are used. IDK?

Yep, your massive frame drop at that bench position (when the first rain drops appear) can have many reasons:

- the bench (+ OS X GPU driver) can't handle the VRAM usage correctly

- the bench (+ OS X GPU driver) can't handle the shader program it uses correctly

....

My GPU also stalls for a very short time at that bench position, dropping from 17 fps briefly down to 9 fps.

 

Maybe the bench (and the specific OS X driver) has some problem using VRAM correctly, or with the rain drop shader program it uses.

It could be a VRAM usage problem, because AA (antialiasing) also causes problems on all AMD GPUs. AA has a major effect on VRAM usage, so perhaps there are some driver-specific VRAM usage problems?

 

You could use atMonitor (attached; don't update it, because newer versions removed the GPU part) to watch VRAM / GPU usage in real time while running the Valley bench in windowed mode, and set up atMonitor to show GPU % and GPU VRAM usage in the menu bar or a floating window.

I am unsure if the AMD drivers also support VRAM usage monitoring like Nvidia's. Give it a try. rob from barefeats tested it with a GTX 580, and there Valley used about 1 GB of VRAM (higher res + AA + Ultra).

 

Looks like (if used as a menu setting) nearly all VRAM on my 512 MB card is used - even without AA and with the BASIC preset.

VRAM usage never goes above roughly 95% - if more VRAM is needed, the driver swaps VRAM and RAM contents over the PCIe bus. That swapping takes a lot of time compared to textures already loaded into VRAM.
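
If someone wants to read the same kind of counters atMonitor shows without a GUI, here is a rough sketch in C that walks the IORegistry and dumps the accelerator's PerformanceStatistics dictionary (that is roughly where such tools get VRAM/GPU usage from). Take it as an illustration only: the matching class name "IOAccelerator" and the key names inside that dictionary differ between drivers and OS X versions, so they are assumptions, not a guaranteed API.

/* vramstats.c - rough sketch: dump the GPU accelerator's PerformanceStatistics dictionary
   Build on OS X: clang vramstats.c -framework IOKit -framework CoreFoundation -o vramstats
   Note: the matching class "IOAccelerator" and the keys inside the dictionary are driver-dependent. */
#include <stdio.h>
#include <CoreFoundation/CoreFoundation.h>
#include <IOKit/IOKitLib.h>

int main(void) {
    io_iterator_t it;
    if (IOServiceGetMatchingServices(kIOMasterPortDefault,
            IOServiceMatching("IOAccelerator"), &it) != kIOReturnSuccess) {
        fprintf(stderr, "no accelerator services found\n");
        return 1;
    }
    io_object_t accel;
    while ((accel = IOIteratorNext(it)) != 0) {
        CFMutableDictionaryRef props = NULL;
        if (IORegistryEntryCreateCFProperties(accel, &props,
                kCFAllocatorDefault, 0) == kIOReturnSuccess && props) {
            /* per-driver statistics; key names (e.g. VRAM or utilization counters) are not a stable public API */
            CFDictionaryRef stats =
                (CFDictionaryRef)CFDictionaryGetValue(props, CFSTR("PerformanceStatistics"));
            if (stats)
                CFShow(stats);   /* prints every counter the driver exposes */
            CFRelease(props);
        }
        IOObjectRelease(accel);
    }
    IOObjectRelease(it);
    return 0;
}

Whether the AMD driver exposes anything useful there is exactly the open question above, so no guarantees.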

 

Bildschirmfoto 2013-02-28 um 08.02.23.jpg

Bildschirmfoto 2013-02-28 um 08.01.53.jpg

 

PS: I also tested windowed vs. fullscreen Basic. Very minimal difference between the two - only 1 fps difference on average.

atMonitor.zip

OS X broke VRAM monitoring on 5xxx-series and newer cards since around 10.7.3. It can still be monitored with OpenGL Driver Monitor to get figures; the problem was/is that total available VRAM became a negative number, making all monitoring apps show 100% usage. With OpenGL Driver Monitor you can expand its range to show negative and positive numbers, so it is still a way to get comparative figures even though the exact numbers reported might not be accurate. I'll give it a run in a little while and post its output during that scene.

OK, here are the results; I highlighted the relevant time frame (it's pretty obvious) and am only displaying the data sets that have info - most of the other monitoring options are just 0. This is not when the rain starts, but in the following scene, almost exactly when the first lightning flash happens and the screen is focused on a large boulder.

AMDRadeonX4000GLDriver_.jpg

What's also interesting: in windowed mode, while the same thing does happen, if a portion of the window is covered up by another open window (the monitor tool or a Finder window), then it does not have the same problem at that spot in the bench, and there's nothing unusual in the monitor readings. Moving half or most of the window off the desktop, it will still stutter. Maybe a problem with an object or a buffer being brought forward?

I also used OpenGL Driver Monitor (Dev Tools).

I ran Valley in a small window (640x360) so the Monitor could be shown alongside.

I used low quality up to Ultra + 8x AA (AA uses much more VRAM).

Besides that, I ran another VRAM-hungry OpenGL/CL app, OceanWave (needs about 70 MB of VRAM).

I can see that with Ultra + 8x AA (or certainly Ultra + a much higher res) the 512 MB of VRAM runs out, meaning the CPU must swap VRAM.

 

That can be seen with the parameter CPU Wait for VRAM Heap Allocation, which never goes above 0 if there is enough VRAM.

Also, Texture Page Off Data goes > 0. Current Mapped DMA Memory in this case gets near 800 MB.

None of those three parameters ever went above 0 (or, in the case of Mapped DMA Memory, stayed at a much lower maximum) when I didn't run OceanWave alongside Ultra + 8x AA.

 

Your 1 GB+ card should not show any non-zero results in Texture Page Off Data or CPU Wait for VRAM Heap Allocation, I think.

 

The pic shows Valley running in LOW without the OceanWave OpenCL bench alongside. If I run Ultra + 8x AA alongside OceanWave, I get a large positive CPU Wait for VRAM Heap Allocation and also Texture Page Off Data > 0.

Bildschirmfoto 2013-02-28 um 13.04.44.jpg
