imDisk performance: awealloc much better but still strange


6 replies to this topic

#1 MrPete

MrPete
  • Members
  • 3 posts
  •  
    United States

Posted 08 October 2012 - 12:14 PM

Hi!
First post here. I'm a long term tech guy, wrote one of the first high performance memory test apps (way back in '286 days ;) )

I have a nice new multicore i7 computer with 32GB RAM (Win7 x64), and am trying to configure a RAM disk for it. imDisk looks very promising.

But performance testing has shown some strange anomalies. I've been using the ATTO disk test, which has been reliable for resolving a number of other anomalies over the last few months.

Here are my results using awealloc:
[Posted image: ATTO benchmark results with awealloc]
And here with the native allocation method:
[Posted image: ATTO benchmark results with native allocation]

I used exFAT for both examples, and default sector size. Some observations and questions:

1) awealloc gets MUCH better results for large transfers. Could this be a bug in the "native" algorithm? If not... I'd recommend adding awealloc to the *.cpl UI as a simple option for non-virtualized drives.

2) It seems strange that 32 KB awealloc transfers are so horrendously slow. This is 100% consistent for me. Any ideas?

3) The 16 KB transfers consistently outrun all others. Again, any ideas? I'm not complaining about the speed on larger awealloc transfers... perhaps we're just seeing evidence of filesystem overhead there. Yet perhaps this is another opportunity for a solid performance gain overall.

Hope this is helpful!
Pete

#2 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 10,974 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 08 October 2012 - 01:03 PM

"Stupid" note, I know, but ATTO is not (IMHO) the "perfect" app to test a ramdisk driver (BTW, I personally prefer ATTO to many other tools).

JFYI there is a recent test made by Raymond:
http://www.raymond.c...nd-write-speed/
using CrystalMark, which apparently says that one ramdisk is worth another.
And a nice .pdf where both ATTO and Crystalmark are used:
http://www.google.it...jZWU9RgUseNc4xQ

What I seem to read "between the lines" of the results of the ATTO test you posted is that native IMDISK is optimized for blocks up to 32 Kb in size, whilst awealloc is better tuned to larger ones.

Of course, out of the THREE commonly used filesystems, FAT(16), FAT32 and NTFS, you managed to test the thingy in a FOURTH one (the least used, least known): exFAT. :w00t:

Yes, a test with some other filesystem would IMHO be needed to understand if exFAT has anything to do with the "sudden drop" in performance.

:cheers:
Wonko

#3 MrPete

MrPete
  • Members
  • 3 posts
  •  
    United States

Posted 08 October 2012 - 03:42 PM

"Stupid" note, I know, but ATTO is not (IMHO) the "perfect" app to test a ramdisk driver (BTW, I personally prefer ATTO to many other tools).

I don't know what a "perfect" test app might be. I just like that ATTO provides a sensible series of transfer sizes, allows direct IO, and allows a nicely loaded queue.

The first link you provided is interesting; the second so outdated (32 bit, 2009 when 4G of DDR3 was US$200!) I would pay no attention.

I'll add other FS tests when I get a chance. NTFS had the same result on native algorithm. I only tried exFAT because I was looking for a solution...

Blessings,
Pete

#4 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 10,974 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 08 October 2012 - 05:57 PM

I don't know what a "perfect" test app might be.

Neither do I :blush: (hence the "self-marking" the note as "stupid" )

I just like that ATTO provides a sensible series of transfer sizes, allows direct IO, and allows a nicely loaded queue.

Yes, but the dramatic drop that you recorded at 32 Kb with awealloc (but also the same one you get at 128 Kb with "direct" mapping) sounds a bit too much to be all the fault of IMDISK or of exFAT, hence I suspect that *somehow* ATTO is at least part of it, or *something else* is playing a role in it. :unsure:
Maybe a comparative test on the same hardware with another ramdisk and with another filesystem may clear this matter up a bit.
Doing the same set of tests with another benchmark app would help as well.

The first link you provided is interesting; the second so outdated (32 bit, 2009 when 4G of DDR3 was US$200!) I would pay no attention.

Yes, but it might be useful to "compare" ATTO vs. CrystalMark results.
As I see it they are so different that one or the other (or BOTH :ph34r:) are not in any way "reliable" as a "measuring device".
Still, in the "old" .pdf there are a couple of ATTO results, namely for the little-known "Gili Ramdisk" and for the widely used/known "QSOFT Ramdisk Enterprise", that simply make NO sense whatsoever. The "Gili Ramdisk" one shows "no writes" at a certain size, and the "QSOFT Ramdisk Enterprise" one shows a very similar effect at the 32 Kb size (besides another "no writes" at 2048 Kb). That *somehow* reinforces my feeling that ATTO may have some quirks (at least when benchmarking ramdisks).

An interesting tool (also oldish) that is worth a test is this one:
http://www.iozone.org/
seemingly the chosen filesystem is part of what may affect performance.

:cheers:
Wonko

#5 MrPete

MrPete
  • Members
  • 3 posts
  •  
    United States

Posted 08 October 2012 - 07:18 PM

Yes, but the dramatic drop that you recorded at 32 Kb with awealloc (but also the same one you get at 128 Kb with "direct" mapping) sounds a bit too much to be all the fault of IMDISK or of exFAT...

The results actually do make sense to me. As I noted, I've used ATTO to do initial diagnostics on other drivers over the last few months.

Some drivers are unable to handle large reads or large writes -- they just hang. That's one way to get a zero speed result.

Some drivers have timing issues, or interrupt-handling challenges. Firewire drivers are notorious for this. Two months ago I had a USB3 driver that was ridiculously slow, unless I inserted an extra length of cable. Turns out it couldn't properly handle timing... A registry mod and new driver version took care of it.

Crystal only tests a very small number of transfer sizes and queue lengths. Fine for testing those exact parameters, but data tends to come in various shapes (so to speak) :) ... I'll do some comparisons after I get home tonight. I think it won't be too hard for me to demonstrate equivalence between the benchmarks. All we need is to replicate settings. The nice thing about ATTO is we can set it up pretty much any way we like :)
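
To make the idea of sweeping "a sensible series of transfer sizes" concrete, here is a minimal sketch in Python. It is not what ATTO does internally: the function name `sweep_write_throughput`, the default sizes and the 1 MiB total per size are all assumptions chosen for illustration, and the writes go through the OS file cache with a single outstanding request, so it replicates neither direct I/O nor a loaded queue.

```python
import os
import tempfile
import time

def sweep_write_throughput(directory, sizes_kb=(4, 8, 16, 32, 64, 128),
                           total_bytes=1 << 20):
    """Write total_bytes to a scratch file once per transfer size.

    Returns {transfer size in KB: throughput in MB/s}. Hypothetical helper,
    buffered I/O only (no direct I/O, queue depth of one).
    """
    results = {}
    for size_kb in sizes_kb:
        chunk = b"\0" * (size_kb * 1024)
        count = total_bytes // len(chunk)
        fd, path = tempfile.mkstemp(dir=directory)
        written = 0
        try:
            start = time.perf_counter()
            for _ in range(count):
                written += os.write(fd, chunk)
            os.fsync(fd)  # include the flush in the timed region
            elapsed = time.perf_counter() - start
        finally:
            os.close(fd)
            os.remove(path)
        results[size_kb] = written / (1024 * 1024) / max(elapsed, 1e-9)
    return results

if __name__ == "__main__":
    # Point this at the RAM disk's mount point to test it; the temp
    # directory is only a stand-in here.
    for size_kb, mbps in sweep_write_throughput(tempfile.gettempdir()).items():
        print(f"{size_kb:>4} KB: {mbps:8.1f} MB/s")
```

Running the same sweep against two differently allocated disks would at least show whether a dip at one particular transfer size appears outside ATTO as well.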

It's true that in theory part of the problem could be ATTO, but so far 100% of anomalies have been something else, not the diagnostic tool.

(Just realized: maybe we can test with the Zero and/or Random drivers as well. That's equivalent to /dev/null -- a good way to ensure that the general data path is reliable!)

#6 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 10,974 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 08 October 2012 - 07:43 PM

(Just realized: maybe we can test with the Zero and/or Random drivers as well. That's equivalent to /dev/null -- a good way to ensure that the general data path is reliable!)

Maybe useful, maybe not:
http://reboot.pro/15207/
http://home.comcast....csi/tools/pldd/


:cheers:
Wonko

#7 Olof Lagerkvist

Olof Lagerkvist

    Gold Member

  • Developer
  • 1,027 posts
  • Location:Borås, Sweden
  •  
    Sweden

Posted 10 October 2012 - 11:20 AM

Interesting tests and interesting discussion!

I think that I could provide some technical background to the different memory allocation mechanisms in memory disk drivers. It basically comes down to the different memory allocation functions available to Windows kernel mode drivers and their pros and cons for various use cases.

ExAllocatePool and related functions.
These functions allocate memory from the kernel allocation pools. There is paged pool and non-paged pool, indicating whether allocated memory can be used in "non-pageable" paths in the driver, such as in interrupt routines, disk swapping requests etc. The pools are intended for small(ish) allocations, such as data structures etc needed in the drivers. Pool spaces are restricted by the kernel, especially in 32 bit Windows versions.

These pool functions are frequently used in sample source code of the "my first ramdisk driver" type. They tend to be quick and reliable for small ramdisks because they can carry out the entire I/O request within the context of the calling process, something which is otherwise very uncommon for any kind of disk driver, virtual or not.

This allocation method is not used in ImDisk.

MmMapIoSpace.
This function directly maps physical addresses for usage within system virtual address space. There is no allocation logic, the calling driver needs to know in some other way that it is safe to use the addresses it maps. If another component uses the same memory, data corruption is likely to happen.

This memory access method is used in some ramdisk drivers to use physical memory beyond the point that Windows allows access to, for example above 4 GB on 32 bit Windows XP. Users of such a driver need to understand that they cannot use two drivers that rely on this memory access method on the same system simultaneously.

This allocation method is not used in ImDisk. However, there have been thoughts and ideas about developing a driver similar to awealloc that could access memory in this way.

MmAllocatePagesForMdl and related functions.
This function allocates physical, non-pageable memory address ranges. If AWE is used in Windows, it can allocate beyond 4 GB on 32 bit systems, if the Windows license allows it. The allocated memory is not immediately available in the system address space. To access the allocated memory, a driver needs to map the needed parts into virtual address space. This requires "page-access" logic in the driver code, which in the case of awealloc is implemented using 2 MB block mapping. Therefore, with awealloc, only 2 MB of the allocated memory is mapped into virtual address space at any time. If the application or filesystem requests I/O to a place within the same 2 MB as the last I/O request, no remapping is necessary. This adds some overhead when I/O requests land in very random areas, which seems to be common with, for example, the NTFS file system and less frequent with FATxx/exFAT.
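
The remapping cost described above can be illustrated with a toy model in Python. This is only a sketch, not awealloc's code: the class `WindowedDisk`, the assumption that the window is aligned on 2 MB boundaries, and the simplification that no request straddles two blocks are all mine.

```python
import random

BLOCK_SIZE = 2 * 1024 * 1024  # 2 MB mapping window, as described for awealloc

class WindowedDisk:
    """Toy model: only one BLOCK_SIZE-aligned window is 'mapped' at a time."""

    def __init__(self):
        self.mapped_block = None  # nothing mapped yet
        self.remaps = 0

    def access(self, offset):
        block = offset // BLOCK_SIZE
        if block != self.mapped_block:
            # Request falls outside the current window: remap (the costly step).
            self.mapped_block = block
            self.remaps += 1

def count_remaps(offsets):
    disk = WindowedDisk()
    for offset in offsets:
        disk.access(offset)
    return disk.remaps

disk_size = 64 * 1024 * 1024   # hypothetical 64 MB disk
io_size = 32 * 1024            # hypothetical 32 KB requests
sequential = list(range(0, disk_size, io_size))
scattered = sequential[:]
random.Random(0).shuffle(scattered)  # same requests, scattered order

print("sequential remaps:", count_remaps(sequential))  # 32, one per 2 MB block
print("scattered remaps: ", count_remaps(scattered))
```

In this model a sequential pass remaps once per 2 MB block, while the same requests in scattered order remap on almost every request, which is the kind of overhead that an access pattern jumping between distant areas would trigger.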

ZwAllocateVirtualMemory (or ZwCreateSection/ZwMapViewOfSection without a file handle)
These two allocation methods allocate an address range in virtual address space. They are pretty "high-level" and correspond directly to VirtualAlloc (or using MapViewOfFile/CreateFileMapping without a file handle) in user mode applications. This is the method used when you create a virtual memory backed virtual disk directly in ImDisk, without using awealloc. The implementation in ImDisk is such that the entire memory needed is allocated in one single block. This means that the space necessary for the entire vm disk needs to be available directly within kernel address space. Especially on 32 bit versions of Windows, this limits the allocation method to rather small virtual disks. If a large amount of kernel address space is allocated, necessary allocations in other drivers may fail and cause system instability.

On the other hand, this allocation method provides a very simple usage interface to the driver. All page committing, mapping etc is handled by the kernel in the background and is transparent to the driver. The driver just sees a large memory block that it can assume is directly accessible, even if the pages it is about to access are paged out to disk or not even physically allocated yet.
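
A user-mode analogue of this "one large block" interface can be sketched in Python, where `mmap.mmap(-1, size)` maps anonymous memory that is not backed by a file. This is an illustration only, not ImDisk's kernel code; the `VmDisk` class, its method names and the 512-byte sector size are assumptions.

```python
import mmap

SECTOR_SIZE = 512  # assumed sector size for the sketch

class VmDisk:
    """Toy 'disk' backed by one contiguous anonymous memory mapping."""

    def __init__(self, size_bytes):
        # fileno -1 requests anonymous memory; the OS commits and pages it
        # behind the scenes, so this code never deals with mapping logic.
        self.mem = mmap.mmap(-1, size_bytes)

    def write_sector(self, lba, data):
        if len(data) != SECTOR_SIZE:
            raise ValueError("data must be exactly one sector")
        start = lba * SECTOR_SIZE
        self.mem[start:start + SECTOR_SIZE] = data

    def read_sector(self, lba):
        start = lba * SECTOR_SIZE
        return self.mem[start:start + SECTOR_SIZE]

disk = VmDisk(16 * 1024 * 1024)  # hypothetical 16 MB disk
disk.write_sector(100, b"\xAB" * SECTOR_SIZE)
print(disk.read_sector(100)[:4])
```

The point of the sketch is how little the "driver" side has to do: every sector access is plain indexing into one block, in contrast to the windowed mapping that awealloc needs.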

This allocation method was implemented in ImDisk for use with smaller allocations. I used to use ImDisk for virtual floppy disks in this way, or for copying small(ish) .iso images to memory when installing applications etc. This proved to be, back then, a good solution because it was not necessary to keep the memory disk contents in physical memory at all times (it is pretty okay for a virtual floppy disk to be swapped out to the pagefile when not needed). The goal was more to break the connection with any physical image file that could be on some removable disk or network location.

Summary.
The various memory access/allocation methods are good for different things. It is quite expected that measured performance will differ a lot. But my general opinion is that for memory backed virtual disks, there is no good reason to use internal ImDisk allocation for disks larger than, say, 100-200 MB or so. Awealloc should really be used instead. I have also thought about an easier way to use awealloc from the GUI. You could type \\.\awealloc in the filename box and provide a disk size in the next box, but it is not very "self-explaining".

I have pretty much stopped maintaining the Control Panel applet though. It is hopelessly outdated for many reasons. Not only graphically: it does not fit very well into the UAC logic in Vista and above either. So, I was kind of hoping that someone could develop a more modern GUI, using the nowadays quite powerful API. It seems that some people have done that. But as far as I know, none of them have been released for free.