Using a block size of 512K means that it can be entirely held in CPU cache, since you have 1MB per thread. And it's supposed to increase the speed...
Oh, I never thought of that. So I have a total of 24MB of cache on my system, and (16 threads) x (512K block size) = 8MB, well within the cache limit. Here's a test which I think illustrates what you are saying? It is just a memory (not a ramdisk) benchmark:
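To spell out the arithmetic above, here is a minimal Python sketch. The function names are made up for illustration, and it assumes every thread keeps exactly one block resident at a time, which a real benchmark may not do.

```python
KIB = 1024
MIB = 1024 * KIB

def working_set_bytes(threads, block_size):
    # Assumption: each thread touches exactly one block at a time.
    return threads * block_size

def fits_in_cache(threads, block_size, cache_bytes):
    # True when the combined working set is no larger than the cache.
    return working_set_bytes(threads, block_size) <= cache_bytes

total = working_set_bytes(16, 512 * KIB)
print(total // MIB, "MiB working set")
print("fits in 24 MiB:", fits_in_cache(16, 512 * KIB, 24 * MIB))
```

With 16 threads and 512K blocks the working set comes to 8 MiB, so it fits whether you count the 24MB total or a smaller figure such as 12MB.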
You can see a sudden drop between 8MB and 16MB. This is because my L3 cache (12MB) is full and can't hold the entire block in one pass. v77, I'm sure you know this inside out already, but I'm just posting this in case there are people struggling to understand this stuff like myself.
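For anyone who wants to try a similar block-size sweep themselves, here is a rough Python sketch. It is not the benchmark tool used above: it times an in-memory copy (read plus write) rather than a pure read, the function names and sizes are my own choices, and Python overhead means the absolute numbers will not match a native benchmark. Only the shape of the curve is of interest.

```python
import time

KIB = 1024
MIB = 1024 * KIB

def copy_bandwidth(block_size, total_bytes=128 * MIB):
    """Approximate GB/s when copying one block of block_size bytes repeatedly."""
    src = bytearray(block_size)
    dst = bytearray(block_size)   # working set is 2 x block_size (source + destination)
    passes = max(1, total_bytes // block_size)
    start = time.perf_counter()
    for _ in range(passes):
        dst[:] = src              # same-size slice assignment: a bulk memory copy
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (passes * block_size) / elapsed / 1e9

def sweep(block_sizes, total_bytes=128 * MIB):
    # Returns a list of (block_size, GB/s) pairs, one per block size.
    return [(size, copy_bandwidth(size, total_bytes)) for size in block_sizes]

if __name__ == "__main__":
    sizes = [256 * KIB, 512 * KIB, 1 * MIB, 2 * MIB, 4 * MIB, 8 * MIB, 16 * MIB, 32 * MIB]
    for size, gbs in sweep(sizes):
        print(f"{size // KIB:>6} KiB  {gbs:6.2f} GB/s")
```

If the cache explanation is right, the reported rate should fall off once the source and destination buffers together no longer fit in the last-level cache.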
^^ Problem - The above benchmark uses only 2 threads and gets 9.8GB/s. Was I wrong about more threads giving more bandwidth?
The STREAM copy benchmark both reads and writes data, which may be slower and more expensive on system resources than just reading data from memory. This may explain how 2 threads in the above benchmark got a 9.8GB/s read rate, while 2 threads in STREAM got only 4.5GB/s.
And it's supposed to increase the speed...
I'm confused, didn't it? The benchmark says it has read 121GB of random data from the ramdisk in ~12.5 seconds (timed with a stopwatch). That's a lot of data going from the ramdisk to the CPU in a small amount of time. Isn't this good? Please forgive my ignorance, I'm just a pigeon pecking on a keyboard. It seems you're correct though: robocopy and richcopy, which are multi-threaded programs, are not giving ~10GB/s transfer rates on the radeon multi-threaded ramdisk, only standard ramdisk rates of 2-4GB/s.
If a thread can achieve 3.2 GB/s, then with 16 threads you should get something near 51.2 GB/s.
(I believe) The maximum experimental memory bandwidth of my system is 10.9GB/s benchmarked by STREAM. I don't know why it's this low, but I bet this guy does - http://www.cs.virginia.edu/stream/ .
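Putting the two numbers above side by side, here is a small Python sketch. The helper name is made up, and the "ideal" figure assumes bandwidth scales perfectly linearly with thread count, which is the assumption behind the 51.2 GB/s estimate rather than anything measured.

```python
def bandwidth_utilisation(per_thread_gbs, threads, measured_gbs):
    # Ideal figure assumes perfect linear scaling across threads (an assumption).
    ideal = per_thread_gbs * threads
    return ideal, measured_gbs / ideal

ideal, used = bandwidth_utilisation(3.2, 16, 10.9)
print(f"ideal:    {ideal:.1f} GB/s")
print(f"achieved: {used:.0%} of ideal, shortfall: {1 - used:.0%}")
```

So the measured STREAM figure is roughly a fifth of the linear-scaling estimate.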
Given the current result, it means that your CPU spends about 80% of its time waiting for data, stuck at instructions that read or write something. So this time cannot even be used to do something else!
Dual monitor screenshot: There was no lag in the video or open programs that I could observe while running the benchmark. I'm sure there's a lot of truth to what you are saying, but I didn't notice any change in performance, so I think it's still OK for normal PC activities?
Edited by pigeon, 22 July 2014 - 11:31 AM.