
[RELEASE] idd.g4b interactive dd command composer


59 replies to this topic

#1 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 16066 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 27 September 2021 - 02:52 PM

I have had this half-baked idea forever but never thought of actually putting it together. Now that I have become more familiar with a trick or two in grub4dos batch (thanks Steve6375 and deomsh), I hopefully managed to create something useful (and working).

 

It still needs to be tested, and a few sanity checks are still missing, but the input handling seems "robust enough"; anyway, the "composed" command needs to be executed manually, so it should be relatively safe.

 

Have fun.

 

Version 0.2 added: no functional changes, but a new refresh method rebuilds the screen to avoid some of the flickering in normal operation.

 

Version 0.3 added: the dd command line is now visible on two lines (roughly 160 characters), and if and of can each be up to two lines long (but of course the full dd line is still limited to a total of 2 visible lines); the non-existing file error is now visible; added the command history num fix (even if unneeded); made a few fixes here and there.

 

Version 0.4 added, older versions removed: the default for count (or count=0) is now the size of infile; fixed (hopefully) the other little issues highlighted in post #12.

 

:duff:

Wonko

Attached Files



#2 deomsh

deomsh

    Frequent Member

  • Advanced user
  • 196 posts
  •  
    Netherlands

Posted 29 September 2021 - 05:26 PM

Nice!

Seems to work well (tested v0.1).

Only, if the final ddcmd is longer than 79 characters when echoed, its tail is not visible. And if source/target does not exist, the red warning below is almost instantly overwritten.

BTW: copying to the History Buffer is good too, only the oldest history entry is lost (it is still in the History Buffer, but not 'callable' anymore because 'num_history' is not updated by the script). This is not a real problem in this case: after running the script, num_history is always (much) >1 (only one entry is added by the script, and use of 'set /p' raises num_history).

BTW2: sad there is no verify option. After my first attempt on Limbo x86 on my smartphone, 'cmp' reported one error. The second write was 'good'. But Limbo x86 is quite unstable...

#3 Wonko the Sane

Posted 30 September 2021 - 07:18 AM

Yep, though it is really rare that the composed dd command is actually more than one line, but I think it can be fixed.

 

I will check about the instantly overwriting the error message, probably it can be timed with a pause --wait=3 or similar :unsure:

 

About the History buffer I just push the dd cmd on the stack, without updating the number of entries, (but of course it can be done, I didn't find quickly enough what to update in your keybuff.g4b script) but as you say it is not a real issue, as the oldest command, in the "best" case is the /idd.g4b command used to call the batch or otherwise a much older one.

 

I can think about the verify option, that would be basically running cmp right? :dubbio:

 

At first sight there are two issues that I can see:

1) to execute the comparison, the composed dd command would need to be run inside the batch script "sandbox" (unlike now, where it is run manually after the batch has run).

2) cmp seems to me to have no provisions for "partial" files :dubbio:

 

Idea  :idea:  instead of verifying by comparing, use crc32 for hash verifying, then when I find (please read as "you tell me") how to update properly the command history buffer, the batch could push to it:

1) the verify command (i.e. calculating the hash of the copied bytes on both the source and target and comparing them), I believe it can be made into a one-liner by re-calling the idd.g4b with parameters

2) the actual dd cmd 

 

the user then can recall the dd command, run it, and optionally find (two commands "up") the verifying command and run it

 

OT (the SectEdit.g4b) I am inserting the three way toggling Hints/History/Hic and your suggestions/ideas about the P[R]ompt sandbox but it will take some time, I need to change a couple of approaches with the updating of the footer to make it working decently.

 

:duff:

Wonko.



#4 deomsh

Posted 30 September 2021 - 09:32 PM

About updating num_history: I don't think it's really needed, but here are two subroutines you can use. First subroutine: run once to get NHistory in a variable (line with echo optional of course).
:NHistory
setlocal && set *
cat --skip=0x340000 --locate=\x00\x00\x00\x00\x00\x92\x3E\x00 (md)0x0+0x1D00 > nul &; set /A ZHistory=%?% > nul
set /A SHistory=%ZHistory%+0x8 > nul
raw read %SHistory% > nul ;; set /A read=%@retval% > nul
if not %read%==0x6 && set /A SHistory=%SHistory%+0x4 > nul &; raw read %SHistory% > nul &; set /A read=%@retval% > nul
if %read%==0x6 && set /A NHistory=%SHistory%+0x10 > nul
if not %read%==0x6 && set NHistory=
if not exist NHistory && echo num_history not found, no copying to History Buffer Available. && echo $[0x0F]Press a key to continue... && pause
endlocal && set NHistory=%NHistory% && goto :eof
BTW: Remember NHistory is NOT a fixed memory address, but found with an experimental method (different in many Grub4dos builds). Method is working so far (using other base than (md)0x0 didn't work - as far as I remember).


Second subroutine: run to raise num_history with 1 (second read-out of NHistory to get updated numhist and keeping after endlocal optional).
:add_hist
setlocal && set * && set NHistory=%NHistory%
if not exist NHistory && endlocal && goto :eof
raw read %NHistory% > nul ;; set /A numhist=%@retval% > nul
set /A numhist=%numhist%+0x1 > nul
raw write --bytes=2 %NHistory% %numhist%
raw read %NHistory% > nul ;; set /A numhist=%@retval% > nul
endlocal && set numhist=%numhist% && goto :eof
BTW: maximum number of entries in History Buffer is 8000/4=2000 (each one char+2 header and 1 tail). Readout of num_hist can be refined with &0xFFFF, but the last two bytes (Little-Endian order) are always zero as far as I remember.


About verify:

Following simple loop is always a good idea:
set /a d=1
:LoopFileCopyVerify
raw dd if=%FILE1% of=%FILE2% %bs% %count% %skip% %seek% %buf% %buflen% > nul
set /a result=%@retval%
#echo %0: result=%result%
if %d%<=3 && if %result%==0 && set /a d=%d%+1 && goto :LoopFileCopyVerify
About cmp/CRC32: neither has any skip/length option other than whole sectors (in case of block-devices).

If in_file/of_file are both files-on-disk with equal size, cmp is usable (no skip/seek), CRC32 too. Both also work in case of block-devices with the same size.

But if only one is a file, with a size not equal to a whole multiple of sector-size, and the other a block-device: the unseen tail of the file will give problems (max 511 bytes, normally NOT cleaned - in my experience).

If in_file is a block-device and of_file a file of equal length (including the tail of the file), dd can also copy to the blocklist of the file, so cmp is possible. The reverse works too.

CRC32 is needed in case of following work-around I have used: first copy in_file with bs/ count/ skip to ram-drive with same size to get CRC32 (from (rd)+1), then same procedure for of_file with seek. Verify is indirect in this case and copy-size is limited.
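
The indirect verify described above can be sketched outside grub4dos. This is Python, not grub4dos batch, and only an illustration: `region_crc32` and `verify_copy` are hypothetical names, in-memory byte strings stand in for in_file, of_file and the ram-drive, and skip/seek are assumed to be counted in units of bs.

```python
import zlib

def region_crc32(data: bytes, bs: int, count: int, offset_blocks: int) -> int:
    """CRC32 of `count` blocks of `bs` bytes, starting `offset_blocks` blocks in."""
    start = offset_blocks * bs
    region = data[start:start + count * bs]   # stands in for the copy to (rd)
    if len(region) != count * bs:
        raise ValueError("region runs past the end of the data")
    return zlib.crc32(region)

def verify_copy(src: bytes, dst: bytes, bs: int, count: int,
                skip: int = 0, seek: int = 0) -> bool:
    """Indirect verify: hash the source region (skip) and the target region (seek)."""
    return region_crc32(src, bs, count, skip) == region_crc32(dst, bs, count, seek)

# Simulate "dd bs=1 count=446 skip=62 seek=0" on made-up data, then verify it.
src = bytes(range(256)) * 4
dst = bytearray(1024)
dst[0:446] = src[62:62 + 446]
print(verify_copy(src, bytes(dst), bs=1, count=446, skip=62, seek=0))
```

As in the ram-drive workaround, the two regions are never compared directly, only their CRC32 values, and the size that can be checked in one go is limited by available memory.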

Running everything in a sandbox inside your script will give more control, but that's up to you.

BTW I do not understand how you can get a hash of copied bytes not equal to whole multiples of sectors and while using skip/ seek?

#5 Wonko the Sane

Posted 01 October 2021 - 11:37 AM

It is seemingly undocumented, but Steve6375 found it: the length after the comma allows crc32 to process a given number of bytes (on block devices):

http://reboot.pro/in...=22553&p=219305

 

One could access the file (if it is a file) via its blocklist, but the issue remains then not with the "tail" but with the (possible) initial offset/skip, if not on a sector border.   :frusty:

 

One can still pipe into crc32 the output of a cat --hex command, as the command will also calculate the crc32 of a string, BUT then the different output of the cat --hex (numerical address) will ruin everything, and if we proceed 16 bytes at a time it will probably take forever :dubbio:.

 

Quick, half-@§§ed, example attached.

 

Sum of crc may - in theory - create the possibility of collisions, but it is unlikely/improbable, in the sense that eventual corruption in the dd process is very unlikely to create the kind of data that may create a collision.

 

When I had a look at my own (batch) implementation of crc32, it came out that the zlib implementation of crc32 allows also partial (progressive) crc32's calculation (or if you prefer to calculate crc32 given a start "seed"), but I stopped looking at it because the issue at hand was solved with the comma trick and because when you look for anything concerning crc 32, you are likely to find posts by everyone about using zlib or answers by Mark Adler (one of the Authors) that point you back at zlib and its implementation, which is in C, and that I have serious problems in understanding.
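
For reference, the progressive (seeded) calculation mentioned above is exposed by Python's binding to zlib: `zlib.crc32` takes an optional starting value. The sketch below is Python rather than grub4dos batch, and `crc32_in_chunks` is a hypothetical helper name; it only shows that feeding the running value back in gives the same result as hashing everything at once.

```python
import zlib

def crc32_in_chunks(data: bytes, chunk_size: int) -> int:
    """CRC32 computed progressively, one chunk at a time."""
    crc = 0  # starting value for an empty input
    for i in range(0, len(data), chunk_size):
        # the previous result is passed in as the start "seed"
        crc = zlib.crc32(data[i:i + chunk_size], crc)
    return crc

data = b"grub4dos dd verification" * 100
print(crc32_in_chunks(data, 16) == zlib.crc32(data))
```

This is a property of zlib's implementation; whether the grub4dos built-in crc32 could be driven the same way is not something this sketch shows.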

 

Maybe we could ask yaya/chenall/etc. if it would be possible to have a "raw cat --hex" or a "cat --hex --noaddress" form of the command omitting the left (or both left and right) "side panels" in the output .

 

Other ideas? :unsure:

 

:duff:

Wonko

Attached Files



#6 Wonko the Sane

Posted 02 October 2021 - 09:34 AM

deomsh wrote:
> About updating num_history: I don't think it's really needed, but here are two subroutines you can use. First subroutine: run once to get NHistory in a variable (line with echo optional of course).

 

Ok, I finally got it. :)

I was confused by the numHorg (which is probably useful for other scopes but unneeded in this case) and by the if condition with the 0x4 "jump".

 

I changed the narrative from "look for a pattern, move 8 bytes from it, check its value, if it is not 6 move some more 4 bytes, if it is not 6 give up else move 16 bytes and set value" to this other one (IMHO more direct) "look for a pattern, find next 6, move 16 bytes, set value, if it is not within set distance give up", results should be the same.



:my_hist_num
#let's find a given pattern (found by deomsh)
cat --skip=0x340000 --locate=\x00\x00\x00\x00\x00\x92\x3E\x00 --number=1 (md)0x0+0x1D00 | set foundpat=
#let's find the first occurrence of 0x6 after it (as it may change on different grub4dos versions)
cat --skip=0x0%foundpat% --locate=\x06\x00\x00\x00 --number=1 (md)0x0+0x1D00 | set found6=
#move 16 bytes forward from the found 6
set /A foundh=0x0%found6%+0x10
set /A hist_num=*%foundh%&0XFFFF
#now check that distance is between 0x18 and 0x1C bytes, otherwise no good
checkrange 0x18:0x1C calc %foundh%-0x0%foundpat% || set hist_num= && echo History Num could not be found && pause --wait=3
goto :eof

:my_add_hist
set /A hist_num=%hist_num%+1
raw write --bytes=2 %foundh% %hist_num%
goto :eof
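
The search logic of :my_hist_num can be mirrored on a plain byte buffer. This is a Python sketch, not grub4dos batch; `find_hist_num_offset` is a hypothetical name and the sample memory below is made up purely to exercise the distance check.

```python
PATTERN = bytes.fromhex("0000000000923E00")   # pattern found by deomsh
MARKER = (6).to_bytes(4, "little")            # \x06\x00\x00\x00

def find_hist_num_offset(mem: bytes, search_from: int = 0):
    """Return the offset of the history counter, or None if the layout is off."""
    pat = mem.find(PATTERN, search_from)
    if pat < 0:
        return None
    six = mem.find(MARKER, pat)               # first 6 after the pattern
    if six < 0:
        return None
    hist = six + 0x10                         # move 16 bytes forward
    if not 0x18 <= hist - pat <= 0x1C:        # same range as the checkrange line
        return None
    return hist

# Made-up memory: pattern at 0x10, marker right after it, counter 16 bytes on.
mem = bytearray(64)
mem[0x10:0x18] = PATTERN
mem[0x18:0x1C] = MARKER
mem[0x28:0x2A] = (37).to_bytes(2, "little")
off = find_hist_num_offset(bytes(mem))
print(hex(off), int.from_bytes(mem[off:off + 2], "little"))
```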

:duff:

Wonko



#7 deomsh

Posted 02 October 2021 - 07:52 PM

About your 'variation' on my call ':NHistory': tested on latest version of grub4dos 4.6a AND on latest version of 4.5c. Passed!

Please don't forget to run your call ':my_hist_num' before any set /p command (unless the message part of the set /p cmd is EXACTLY six chars long - see first page of the Keyboard Buffer-thread)

About my call ':add_hist': numHorg was only needed in KEYBUFF.G4B as output for the Sync-functionality.

But I think it's NOT a good idea to skip reading out num_history every time before writing num_history+1 back (variable hist_num in your 'variation'), because each set /p command will raise num_history too (if not empty). Otherwise you will lose access to the (oldest) part of the History Buffer, and you will need to run KEYBUFF.G4B (menu-item) 'Sync' to synchronize num_history.
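
Why the counter has to be re-read right before every write can be shown with a toy model. This is Python, not grub4dos batch; the address, the helper names and the 4-byte "memory" are all hypothetical, and the second `write_u16` merely stands in for a set /p raising num_history behind the script's back.

```python
def read_u16(mem: bytearray, addr: int) -> int:
    """Read a 16-bit little-endian counter."""
    return int.from_bytes(mem[addr:addr + 2], "little")

def write_u16(mem: bytearray, addr: int, value: int) -> None:
    """Write 2 bytes back, like raw write --bytes=2."""
    mem[addr:addr + 2] = (value & 0xFFFF).to_bytes(2, "little")

def add_hist(mem: bytearray, addr: int) -> int:
    """Read the current value first, then store value + 1."""
    value = read_u16(mem, addr) + 1
    write_u16(mem, addr, value)
    return value

ADDR = 0                      # hypothetical location of num_history
mem = bytearray(4)
write_u16(mem, ADDR, 10)

cached = read_u16(mem, ADDR)  # value remembered by the script: 10
write_u16(mem, ADDR, 11)      # stands in for a set /p raising num_history

stale = cached + 1            # 11: writing this back would hide one entry
fresh = add_hist(mem, ADDR)   # 12: re-read first, then increment
print(stale, fresh)
```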

BTW: I can't test on UEFI-grub4dos, maybe someone can try. Just run 'read %NHistory%' with 'debug 1' from the command-line (or %result% after calling :NHistory in my script, or 'read %foundh%' in Wonko's). Really funny to see screen-output is always one higher after each read-command.

#8 Wonko the Sane

Posted 03 October 2021 - 09:37 AM

Good call (pardon the pun) on the fact that one needs to (if needed) increase hist_num immediately after having read its value; easiest would be to use the same subroutine for both, i.e. something like:
!BAT
#cmdbuffer.g4b - sample to read and increase the number of commands in command buffer
setlocal
debug msg=0
call :my_hist_num
set hist_num
call :my_add_num
set hist_num && set hist_num=
debug msg=3
goto :eof

:my_hist_num
:my_add_num
#let's find a given pattern (found by deomsh)
cat --skip=0x340000 --locate=\x00\x00\x00\x00\x00\x92\x3E\x00 --number=1 (md)0x0+0x1D00 | set foundpat=
#let's find the first occurrence of 0x6 after it (as it may change on different grub4dos versions)
cat --skip=0x0%foundpat% --locate=\x06\x00\x00\x00 --number=1 (md)0x0+0x1D00 | set found6=
#move 16 bytes forward from the found 6
set /A foundh=0x0%found6%+0x10
set /A hist_num=*%foundh%&0XFFFF
#now check that distance is between 0x18 and 0x1C bytes, otherwise no good
checkrange 0x18:0x1C calc %foundh%-0x0%foundpat% || set hist_num= && echo History Num could not be found && pause --wait=3
if exist hist_num if "%0"==":my_add_num" set /A hist_num=%hist_num%+1 &; raw write --bytes=2 %foundh% %hist_num%
set foundpat=
set found6=
set foundh=
goto :eof

or, even easier, calling the same subroutine with a +1 parameter:
!BAT
#cmdbuffer2.g4b - another sample to read and increase the number of commands in command buffer
setlocal
debug msg=0
call :my_hist_num
set hist_num
call :my_hist_num +1
set hist_num && set hist_num=
debug msg=3
goto :eof

:my_hist_num
#let's find a given pattern (found by deomsh)
cat --skip=0x340000 --locate=\x00\x00\x00\x00\x00\x92\x3E\x00 --number=1 (md)0x0+0x1D00 | set foundpat=
#let's find the first occurrence of 0x6 after it (as it may change on different grub4dos versions)
cat --skip=0x0%foundpat% --locate=\x06\x00\x00\x00 --number=1 (md)0x0+0x1D00 | set found6=
#move 16 bytes forward from the found 6
set /A foundh=0x0%found6%+0x10
set /A hist_num=*%foundh%&0XFFFF
#now check that distance is between 0x18 and 0x1C bytes, otherwise no good
checkrange 0x18:0x1C calc %foundh%-0x0%foundpat% || set hist_num= && echo History Num could not be found && pause --wait=3
if exist hist_num set /A hist_num=%hist_num% %1 &; raw write --bytes=2 %foundh% %hist_num%
set foundpat=
set found6=
set foundh=
goto :eof
 
 
:duff:
Wonko

#9 deomsh

Posted 03 October 2021 - 09:14 PM

This is all very nice and very good, but I am not sure if you did get the point.

 

 

deomsh wrote:
> Please don't forget to run your call ':my_hist_num' before any set /p command (unless the message part of the set /p cmd is EXACTLY six chars long - see first page of the Keyboard Buffer-thread)

 

I re-tested, see print-screen below.

 

Screenshot_NHistory_found_NOT-found.png

 

BTW: If tested from the 'grub> '-command-line, the magic number '06' can be found, because the message-part of this cmd is always six chars long. With 'set /p' INSIDE a script not anymore (like IDD.G4B), unless the last 'set /p'-cmd in a script before :NHistory is called (or your :my_hist_num) has a message part of six chars. For instance 'Wonko>' or 'deomsh' (without quotes). :lol:



#10 Wonko the Sane

Posted 04 October 2021 - 09:11 AM

Now I see what you mean, thanks :).

 

What if we force :innocent:  a 6 , i.e. can we get away with this added to the beginning of the sub? :unsure:

set /p:1 dummy=Wait..

or :idea: forcing a specific length (and look for that length instead of 6), in an interactive batch like idd.g4b the 1 second delay won't make any difference and it would be hardly noticeable, i.e.:

!BAT
#cmdbuffer2.g4b - another sample to read and increase the number of commands in command buffer
setlocal
debug msg=0
call :my_hist_num
set hist_num
call :my_hist_num +1
set hist_num
call :my_hist_num -1


set hist_num && set hist_num=

debug msg=3
goto :eof

:my_hist_num
#let's find a given pattern (found by deomsh)
cat --skip=0x340000 --locate=\x00\x00\x00\x00\x00\x92\x3E\x00 --number=1 (md)0x0+0x1D00 | set foundpat=
#force the prompt length magic number to 42 or 0x2A
#--------------123456789012345678901234567890123456789012
set /p:1 dummy=Please wait while command history is read.
set /A dummy=42
#let's find the first occurence of 0x2A after it (as the offset may change on different grub4dos versions)
cat --skip=0x0%foundpat% --locate=\x2A\x00\x00\x00 --number=1 (md)0x0+0x1D00 | set foundmn=
#move 16 bytes forward from the found magic number
set /A foundh=0x0%foundmn%+0x10
set /A hist_num=*%foundh%&0XFFFF
#now check that distance is between 0x18 and 0x1C bytes, otherwise no good
checkrange 0x18:0x1C calc %foundh%-0x0%foundpat% || set hist_num= && echo History Num could not be found && pause --wait=3
if exist hist_num set /A hist_num=%hist_num% %1 &; raw write --bytes=2 %foundh% %hist_num%
set foundpat=
set foundmn=
set foundh=
set dummy=
goto :eof

:duff:

Wonko




#11 Wonko the Sane

Posted 05 October 2021 - 02:59 PM

Ok, confirmed, 42 is the answer, version 0.3 uploaded with the fixes talked about, + a few more, no cmp/crc32 verification of sorts (yet).

 

:duff:

Wonko



#12 deomsh

Posted 05 October 2021 - 07:44 PM

I did some tests and it seems the length of the 'set /p'-message with EMPTY input is also written somewhere in memory near the 'pattern'. I've never tested this before. Very good finding! :thumbup:

 

It's sad 'set /p:0' is not possible, but in your script a small delay is not a problem of course. I used bios to write 'Enter' (0x1C0D) to the BIOS keyboard Buffer before 'set /p' is used to 'reset' the magic '06'-value: works good and is a universal solution (no 1 second delay needed).

set var=
bios int=0x16 eax=0x0500 ecx=0x1C0D > nul
set /p "var=Wonko>" && echo
if not exist var && echo

Screenshot_NHistoryBios.png -

 

BTW: in this build of Grub4Dos the length of variable is found directly after the 'pattern', the message length next and their sum on second row, before num_history.

BTW2: 'set /p'-message can be overwritten, or placed outside screen with a Fn.5-call.

 

 

I have tested IDD.G4B v0.3. Very good!

 

There are two things I do not like:

1) If 'bs' is default (512 bytes), 'count' is set to 1. Why? Argument 'count' is not mandatory, and if I want to copy a whole file 'count=1' needs lots of counting. Luckily command can be edited before final execution. :lol:

 

Screenshot_IDD.G4B_default_count=1.png

 

2) Argument 'buf' can be given in hex, which is good. But later converted to decimal, which I found confusing.

 

Screenshot_IDD.G4B_buf=decimal.png

 

BTW: on the second screen in the print-screen below it can be seen that the 'idd' output is still one line only.

 

Screenshot_IDD.G4B_output_on_second_screen_max_one_line.png

 

 

About 'buflen' (I am not fully sure), but in the source code (shared.h) it seems to me that 'buflen' should be also a power of 2. Personally I should like to know if this is true, because I will have to rewrite some scripts. :unsure:

/* BUFFERLEN must be a power of two, i.e., 2^n, or 2**n */
/* BUFFERLEN must be 64K for now! */
#define BUFFERLEN   0x10000
#define BUFFERADDR  RAW_ADDR (0x30000) 


#13 Wonko the Sane

Posted 06 October 2021 - 07:16 AM

The bios - Enter trick seems like a nice workaround to the delay :thumbsup: , though in this particular case I actually "like" this 1 second pause, it looks like the program is actually needing to do something profound and complex ;).

 

The count is 1 only to have a non-zero default, and not make sophisticated checks, there is no problem to set its default to "whole size of source file"  which is the case where in "plain" dd count is omitted :), I'll think a bit about it, I think I will have to modify slightly the way the transfer data is calculated, as now it is - if I recall correctly - "literal", i.e. count*bs.

 

Arguments buf and buflen won't normally be used by anyone (or, if you prefer, only by people who won't use idd.g4b but rather "plain" dd directly), so I haven't paid much attention to them. But sure, buflen needs to be a power of 2, and one bigger than or equal to 64 K, i.e. values smaller than that won't be taken. As a matter of fact, buflen can be input on the idd as (say) 2*64k and will be shown as 131072; changing that is just a matter of changing a couple of set and set /a into set /A, but I doubt that common users would like to see 0x10000 or 0x20000 :dubbio:
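
The buflen rule stated above (a power of 2, and at least 64 K) boils down to a two-part check. A minimal sketch in Python, not grub4dos batch; `is_valid_buflen` and `MIN_BUFLEN` are hypothetical names, and the rule itself is taken from the description here rather than from the grub4dos source.

```python
MIN_BUFLEN = 0x10000  # 64 K, the value of BUFFERLEN quoted from shared.h

def is_valid_buflen(n: int) -> bool:
    """True if n is a power of two and not smaller than 64 K."""
    is_power_of_two = n > 0 and (n & (n - 1)) == 0
    return is_power_of_two and n >= MIN_BUFLEN

for value in (0x8000, 0x10000, 0x18000, 2 * 64 * 1024):
    print(hex(value), value, is_valid_buflen(value))
```

The loop also shows the hex/decimal pairing discussed above: 2*64k is 131072, or 0x20000.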

 

:duff:

Wonko



#14 Wonko the Sane

Posted 08 October 2021 - 02:33 PM

New version 0.4 uploaded, should address the issues found.

 

:duff:

Wonko



#15 deomsh

Posted 08 October 2021 - 10:01 PM

IDD.G4B v0.4 tested: Good :rolleyes:

 

About possibilities of verification: steve6375's trick with (md)base+sect,bytes is good, so without skip/seek there is no problem (only complications if a file-blocklist is not contiguous)

 

In case of skip/ seek I tested your idea of comparing crc32 from batches of 16 bytes with cat --hex. Works okay, not fast, but do-able if limited to let's say 100 sectors.

 

See looptest in print-screen below (switched source to (hd0,0)+1 to see whole command-line of HEXCRC32.G4B).

 

HEXCRC32.G4B (hd0,0)0+1 (hd0,0)-pbrdos71.bin -bytes-446 -skip-62-seek-0 + Looptest IV.png

 

BTW: lately I migrated to SSD, VBox-timings seem much more stable now.

 

I think best option is still the ram-disk one: copy bytes to compare to a ram-disk of same size and compare CRC32 of (rd)+1. Only limited by system memory. At least seven times faster than in print-screen above (speed not fully tested so far).



#16 Wonko the Sane

Posted 09 October 2021 - 09:48 AM

Yes and no. (as always) :dubbio:.

 

Copying to ramdisk is a possibility, but only useful for smallish files (the ones where slowness is not an issue) but unfeasible for large transfers.

 

I thought a bit around the matter, and we can get away with a couple abstractions and a "mixed" approach[1]. :w00t:

 

There are two possibilities for the objects that are the if and of in dd:

1) a single blocklist or a contiguous file that can be directly translated to a single blocklist

2) a non-contiguous file (or multiple blocklist)

 

There are two possibilities for bs:

1) a power of 2 smaller than 512 bytes, i.e. 1, 2, 4, 8, 16, 32, 64, 128, 256

2) a power of 2 equal to or bigger than 512

 

In case #2  of bs the skip and/or seek can only be multiples of 512 and thus can be managed at block level.

 

In case #1 of bs the skip and/or seek can only affect 1 or 2 blocks (the first and last block of the bytes transferred), let's call these areas conventionally "head" and "tail" respectively, and the *whatever* is between these two areas "body".

 

The "head" can only be anything between 1 and 511 bytes and can be dealt with via the crc32 of 16 bytes at a time[2]; *whatever* comes after it starts on a block (512 bytes) border and can be dealt with at block level.

Same for the "tail", with the advantage that - accessed at block level - we can use the crc32 "comma" trick for it.

 

So, in case of contiguous files or blocklists we will have at most three crc32's.
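
The head/body/tail split described above is plain arithmetic on the byte offset and length. A sketch in Python (not grub4dos batch), assuming 512-byte blocks; `split_range` is a hypothetical name, and a transfer that contains no block border at all is reported entirely as "head".

```python
SECTOR = 512

def split_range(offset: int, length: int):
    """Return (head, body, tail) byte counts for a transfer of `length` bytes at `offset`."""
    if length <= 0:
        return (0, 0, 0)
    end = offset + length
    first_border = -(-offset // SECTOR) * SECTOR   # offset rounded up to a border
    last_border = end // SECTOR * SECTOR           # end rounded down to a border
    if first_border > last_border:                 # no border inside the range
        return (length, 0, 0)
    return (first_border - offset, last_border - first_border, end - last_border)

print(split_range(6, 1500))    # part of sector 0, all of sector 1, part of sector 2
print(split_range(62, 446))    # fits inside one sector
```

Only the body can be hashed at block level; head and tail are the partial sectors that need the byte-level treatment.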

 

The issue then becomes another one: the body of fragmented files. :ph34r:

 

A file accessed through the filesystem (i.e. (device)/file.ext) is "virtually" contiguous and the crc has no issues with it, but you cannot access it via blocklist as you don't have a single blocklist, but rather a number of blocklists.

 

The challenge is to create a way to "dissect" a non-contiguous file into a list of the single blocks that compose it, then crc32 each of them.

 

In this latter case we will have instead of 3 crc32's, 1+n+1 of them.

 

But we can still put all of them into a string variable and then crc32 the string, the resulting crc32 is not anymore a crc32, but rather a meta-crc32, that anyway can be compared to another meta-crc32 built in the same way.

 

The limit then becomes how many crc32's can fit into a string variable, as each of them represents a block of the transferred files. Since the limit for a variable is 512 bytes, it can store 512/4=128 blocks worth of crc32's as raw 4-byte values (only 512/8=64 if each crc32 is stored as the 8 hex characters that crc32 prints), which is a rather smallish file.

 

BUT we can still use a progressive crc32 approach, i.e. calculate the crc32 of the string set crc32tillnow=<crc32tillnow><crc32ofthisblock>, it might be a meta-meta crc32, but as long as the comparison term is calculated in the same way the comparison is valid.
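
The rolling "meta-meta crc32" can be sketched as follows. This is Python, not grub4dos batch; `block_crc_hex` and `meta_crc32` are hypothetical names, and each crc32 is assumed to be carried as 8 lowercase hex characters, so the string being hashed never grows beyond 16 characters.

```python
import zlib

def block_crc_hex(block: bytes) -> str:
    """CRC32 of one block as 8 hex characters."""
    return f"{zlib.crc32(block):08x}"

def meta_crc32(blocks) -> str:
    """Fold the per-block crc32's into one running value."""
    running = ""                                   # crc32tillnow
    for block in blocks:
        text = running + block_crc_hex(block)      # <crc32tillnow><crc32ofthisblock>
        running = f"{zlib.crc32(text.encode('ascii')):08x}"
    return running

source_blocks = [bytes([i]) * 512 for i in range(5)]
target_blocks = [bytes([i]) * 512 for i in range(5)]
print(meta_crc32(source_blocks) == meta_crc32(target_blocks))
```

The result is not the crc32 of the data itself; it is only meaningful when compared with a value built in exactly the same way and in the same block order.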

 

I invented a crazy flags system to distinguish possible cases (and to adopt the faster strategy for each of them), if my calculations are correct :unsure: there are 108 :w00t: possible cases that can be grouped into (I don't actually know, rough guesstimate) 6-8 groups,  maybe less due to some symmetries, for the moment I identified 3 groups and "solved" only the first two (the actually easy ones ;)).

 

I am attaching (only FYI, very little practical use, covers - maybe - 9 cases out of 108) the "skeleton".

 

:duff:

Wonko

 

 

 

[1] I am working on a basic implementation of it, in theory it may work, in practice it has to be seen.

[2] either directly or copying the partial block to ramdisk, but, since we have at most one single block (sector) to deal with, nothing prevents us from:

a. zeroing a sector in memory
b. copying to it the "head" bytes

c. calculate the crc32 of the "whole" sectors, 00's included <- this will be a sort of "virtual" (besides partial) crc32, but as long as this "virtual" source head matches the corresponding as well "virtual" destination head, the verification is done correctly
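
Steps a. to c. of footnote [2] can be sketched like this. Python, not grub4dos batch; `padded_crc32` is a hypothetical name and a `bytearray` stands in for the zeroed sector in memory.

```python
import zlib

SECTOR = 512

def padded_crc32(head: bytes) -> int:
    """'Virtual' crc32 of a partial sector: the head bytes followed by 00's."""
    if not 0 < len(head) < SECTOR:
        raise ValueError("head must be between 1 and 511 bytes")
    sector = bytearray(SECTOR)        # a. zero a sector in memory
    sector[:len(head)] = head         # b. copy the "head" bytes into it
    return zlib.crc32(sector)         # c. crc32 of the whole sector, 00's included

print(padded_crc32(b"abc") == zlib.crc32(b"abc" + bytes(509)))
```

Both sides must be padded the same way and have the same head length, since trailing 00 bytes in the data cannot be told apart from the padding.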

 

 

Attached Files



#17 deomsh

Posted 10 October 2021 - 05:37 PM

Wonko the Sane wrote:
> Yes and no. (as always) :dubbio:.
>
> Copying to ramdisk is a possibility, but only useful for smallish files (the ones where slowness is not an issue) but unfeasible for large transfers.

 

How big/small are 'smallish files', for instance? I ran some tests with two 1GB files from the grub4dos command-line. There is no problem splitting the copied bytes and comparing the batches with crc32. The process will take more time, but it seems to be only 50%-60% more, judging from the looptests below.

On the first print-screen the properties of the two test-files are shown, together with a looptest of copying with dd with different skip/seek (in bytes!). Also the speed of making the crc32 of one of the files is shown, and further the settings made for (rd).

On the second and third print-screens, the process to make the crc32's of the first and the second part, respectively, of the bytes copied earlier. At the end, the overview of the two pairs of crc32.

 

 

Looptest dd & CRC32 (hd0,0)-file1gb.fl1 (hd0,0)-file1gb.fl2 bs-1 -skip-6-seek-256 + (rd)+1 I.png
Looptest dd & CRC32 (hd0,0)-file1gb.fl1 (hd0,0)-file1gb.fl2 bs-1 -skip-6-seek-256 + (rd)+1 II.png
Looptest dd & CRC32 (hd0,0)-file1gb.fl1 (hd0,0)-file1gb.fl2 bs-1 -skip-6-seek-256 + (rd)+1 III.png

 

BTW: second (rd)+1 should be 256 bytes smaller, but not really important because last 256 bytes are not overwritten.

BTW2: --rd-base and --rd-size don't take, for instance, '1g' or '512m' respectively, only the 'full' numbers.

BTW3: Not even cleaning (rd) seems to be needed.

 

It's noteworthy that speed is dramatically increased with buflen (above, say, 16m/32m there is not much gain anymore; these seem to be decent values). See print-screen below.

 

SPEEDTEST dd (hd0,0)-file1gb.fl1 (hd0,0)-file1gb.fl2 bs-1 -skip-6-seek-256 + buf=64m buflen=Def-64m X.png

 

BTW: I added the test because it can be of general interest

 

Wonko the Sane wrote:
> The "head" can be only anything between 1 and 511 bytes and can be dealt with the crc32 of 16 bytes at the time[2], the whatever* comes after starts on a block (512 bytes) border and can be dealt at block level.
>
> Same for the "tail", with the advantage that - accessed at block level - we can use the crc32 "comma" trick for it.

In case head & tail are of equal length, I understand this approach. But if skip/ seek are NOT equal, the crc32's of the middle blocks will never be equal. Or did I miss something? I have been reading a bit about combining CRC's, should be possible, but looks quite complicated. I doubt this will be possible with calc. But the math seems to be above my level :blink: so maybe I am too pessimistic.

 

Wonko the Sane wrote:
> A file accessed through the filesystem (i.e. (device)/file.ext) is "virtually" contiguous and the crc has no issues with it, but you cannot access it via blocklist as you don't have a single blocklist, but rather a number of blocklists.
>
> The challenge is to create a way to "dissect" a non-contiguous file into a list of the single blocks that compose it, than crc32 each of them.
>
> In this latter case we will have instead of 3 crc32's, 1+n+1 of them.
>
> But we can still put all of them into a string variable and then crc32 the string, the resulting crc32 is not anymore a crc32, but rather a meta-crc32, that anyway can be compared to another meta-crc32 built in the same way.
>
> The limit then becomes how many crc32's can fit into a string variable, as each of them represents a block of the transferred files, since the limit for a variable is 512 bytes, it can store 512/4=128 blocks worth of crc32's, which is a rather smallish file.
>
> BUT we can still use a progressive crc32 approach, i.e. calculate the crc32 of the string set crc32tillnow=<crc32tillnow><crc32ofthisblock>, it might be a meta-meta crc32, but as long as the comparison term is calculated in the same way the comparison is valid.
>
> I invented a crazy flags system to distinguish possible cases (and to adopt the faster strategy for each of them), if my calculations are correct :unsure: there are 108 :w00t: possible cases that can be grouped into (I don't actually know, rough guesstimate) 6-8 groups,  maybe less due to some symmetries, for the moment I identified 3 groups and "solved" only the first two (the actually easy ones ;)).
>
> I am attaching (only FYI, very little practical use, covers - maybe - 9 cases out of

 

Your crazy flags system looks nice, judging from what I saw in VERIFYDD.G4B. A blocklist of a non-contiguous file can be easily parsed if all commas are replaced by a space or by 0A after being redirected to memory. Only the bookkeeping will be more complicated if infile AND outfile are both non-contiguous and have different series of fragments. :unsure:
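
Parsing a blocklist into its fragments could look roughly like this. Python, not grub4dos batch; `parse_blocklist` and `total_sectors` are hypothetical names, and the sample string is only an assumed rendering of a start+count list, not captured grub4dos output.

```python
import re

def parse_blocklist(text: str):
    """Return a list of (start_sector, sector_count) pairs found in the text."""
    return [(int(start), int(count))
            for start, count in re.findall(r"(\d+)\+(\d+)", text)]

def total_sectors(fragments) -> int:
    """Number of sectors covered by all fragments together."""
    return sum(count for _, count in fragments)

sample = "(fd0) 2265+19, 2005+1, 2007+1"
fragments = parse_blocklist(sample)
print(fragments, total_sectors(fragments))
```

With both infile and outfile parsed this way, the bookkeeping problem becomes walking two fragment lists in parallel, which this sketch does not attempt.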



#18 Wonko the Sane

Posted 11 October 2021 - 09:48 AM

Well, I have some good news and some bad news.

 

The bad news first. :(

Going into the details of the non-easy situations (the ones where one or both files are fragmented and/or we have both a skip and seek, and they are not the same or are not multiples of blocks, etc.) matters become really complicated and long, we will hit some serious performance issues with largish files, and we will anyway be left with a too small maximum file size (and/or amount of bytes transferred).

 

Now the good news. :)

 

We can use mapping to (rd) (not dd-ing to it) and use (or abuse) the new syntax of Fn.42 (the one I found out recently and couldn't find any use for) and the possibility (that I hadn't thought about till now) of piping into grub4dos built-in crc32.

 

We can map *any* file (or blocklist) to rd, i.e.:

map (fd0)/sectedit.g4b (rd)

the (rd) will be contiguous.

 

blocklist /sectedit.g4b

 (fd0) 2265+19, 2005+1, 2007+1, ....

 

call Fn.26 (fd0)/sectedit.g4b

calc *0x8320

30668 (HEX:0x77CC)

 

say that we have in map-status

ram_drive=0x7f, rd_base=0x27fe8000, rd_size=0x77CC

 

if we want a pseudo-crc of 3 bytes at offset 425 in sectedit.g4b we can:

set /A offset=0x27fe8000+425

set /A length=3

set /A lower=0

set /A higher=0

 

call Fn.42 %lower% %higher% %offset% %length%

 

call  Fn.42 0x00 0x00 0x27fe81a9 0x3

00000000: 62 79 74 ...

The output (bar the initial address) is the same as:

cat --hex --skip=425 --length=3 (fd0)/sectedit.g4b

000001A9: 62 79 74 ...

 but now we can control the first address bytes (that we set to 00).

so we can have:

 

call  Fn.42 0x1A9 0x00 0x27fe81a9 0x3

000001A9: 62 79 74 ...

 

and :

 

call  Fn.42 0x1A9 0x00 0x27fe81a9 0x3 | crc32

8163458d

 

 

cat --hex --skip=425 --length=3 (fd0)/sectedit.g4b | crc32

8163458d

 

The piping seemingly has no particular size limit (has to be tested of course) and seems very fast, i.e.

call  Fn.42 0x0 0x00 0x27fe8000 0x77CC | crc32

2c54e53e

 

cat --hex (fd0)/sectedit.g4b | crc32

2c54e53e

 

Though these are not real crc32 of the files, but since they are the crc32 of a "same" output, let's call them "meta-crc32", they are good enough to check that the two files (or extents) are the same.
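Why the address column matters for the meta-crc32 can be shown with a small Python sketch. The dump layout below (8 hex digits, a colon, then the bytes) only imitates the lines quoted above and is an assumption, as are the names `hex_dump` and `meta_crc32`; the real cat --hex and Fn.42 output may differ in spacing and in the trailing text column.

```python
import zlib


def hex_dump(data, first_address=0):
    """Return a text dump of data, 16 bytes per line, with an address column."""
    lines = []
    for offset in range(0, len(data), 16):
        chunk = data[offset:offset + 16]
        hex_part = " ".join(format(b, "02X") for b in chunk)
        lines.append("%08X: %s" % (first_address + offset, hex_part))
    return "\n".join(lines) + "\n"


def meta_crc32(data, first_address=0):
    """crc32 of the dump text (not of the data itself), as 8 hex digits."""
    return format(zlib.crc32(hex_dump(data, first_address).encode("ascii")), "08x")


if __name__ == "__main__":
    sample = b"byt"                       # the 3 bytes 62 79 74 from the example
    print(hex(0x27FE8000 + 425))          # absolute address of offset 425 in the rd
    print(hex_dump(sample, 0x1A9), end="")
    # Same bytes, same displayed address: the two checksums agree.
    print(meta_crc32(sample, 0x1A9) == meta_crc32(sample, 0x1A9))
    # Same bytes, different displayed address: the dump text differs, so they do not.
    print(meta_crc32(sample, 0x1A9) == meta_crc32(sample, 0x0))
```

So two extents can only be compared through their dumps if both dumps start from the same displayed address, which is what the first two Fn.42 parameters allow.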

 

The limit (unless we manually enlarge the rd) is 4 GB, which is IMHO "more than enough".

 

So, back to the crazy flags system, we could leave the:

:ends222

and

:begins112

as they are, possibly adding the

:begins111

case, and these will use crc32 directly and have virtually no file size limit, and adopt for all the other cases the (rd)+Fn.42+meta-crc32 approach, with the 4 GB filesize limit.

 

Problems remaining/things to do (in no particular order):

#1 Users of older grub4dos versions (those that use the old Fn.42 syntax)

#2 A simple way to get the information for the rd (right now I could only use a temp md sector, maybe there is a smarter way)

#3 Add the :begins111

#4 Put together the (rd)+Fn.42+meta-crc32 part of the script.

 

Now, #3 and #4 are just a matter of some time, #2 can for the moment remain with the use of (md)0x1F7FF+1, the issue remains #1, probably something like the -mem vs --mem check can be used, but let's leave this aside until the rest is done.

 

:duff:

Wonko 

 

P.S.: have to check if these positions in memory:

https://rmprepusb.co...SED_BY_GRUB4DOS

 

0x82D0 rd_base (0x82D4 has high word) calc *0x82d0 | set mem=

0x82D8 rd_size

 

are still accurate.



#19 Wonko the Sane

Posted 11 October 2021 - 04:15 PM

Ok.

 

EXPERIMENTAL version 0.01 attached.

 

EDIT: attachment removed, see post below.

 

:duff:

Wonko

 



#20 Wonko the Sane

Posted 13 October 2021 - 12:15 PM

TESTING version 0.02 attached.

 

This seems to be working as intended, needs to be tested and then we will talk about what to remove (or add) to the output.

 

Edit: removed, see version 0.03 a few posts below.

 

:duff:

Wonko



#21 deomsh

Posted 13 October 2021 - 07:10 PM

Oops, a new version already :( I will try this version later.

 

I did some tests (after copying my 1GB test files using IDD.G4B) with VERIFYDD.G4B version 001, but I ran into problems with mapping if 'if' was the first sector of a device, for instance (hd0)0+1 or (hd0,0)0+1 and even with (hd0)63+1, because map tried to map all sectors. With files/other blocklists there were no such problems.

 

Mapping to (rd) and finding rdbase and rdsize afterwards is really nice, very smart :worship:

 

First about using map instead of dd to 'fill' (rd): it doesn't seem faster if buflen is big enough. Mapping a 1GB file took about 6 seconds, the same file with dd and buflen=64m about 3.5 seconds.

 

With following script I tested performance of crc32-ing cat --hex vs call Fn.42.

 

RDCRC32.G4B in FATTEXT.G4B.png

 

Good news for users of older grub4dos versions (see print-screen below).

 

LOOPTEST RDCRC32.G4B cat 0x0 0x6600 en Fn.42 0x0 0x6600 na mapping sectedit.g4b I.png

 

But I am afraid there is maybe bad news too. I found that after a certain amount of bytes, the CRC32 didn't change anymore. Although I used a big zip-file full of non-zero bytes first, I tested last version of SECTEDIT.G4B (version 0.8) too. Please study the last print-screen.

 

Hopefully I did something wrong, otherwise the results should be easily reproducible.

 

RDCRC32.G4B met sectedit.g4b vanaf full size=0x77C9 terug tot 0x65C9 en dan 0x6600 I.png

 

My experimental findings indicate that max is roughly 0x6600 bytes, so 51 sectors (in fact a little bit higher). I remembered my earlier findings of max 255 sectors piping to a call from a (md)-device (early grub4dos versions much less).

 

I made following calculations:

 

255 sectors x 512 bytes = 130,560 bytes

 

130,560 bytes / 51 sectors / 32 (cat --hex lines per sector) = 80 ! (chars)

 

BTW: If a cat --hex line is a bit shorter, this would explain the difference up to the upper value of 0x6665 bytes I observed. Also the limit of 255 sectors I mentioned earlier is about 16-32 bytes less.
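The calculation above can be re-checked with a few lines of Python. The constant names are only labels chosen for this example; the numbers are the ones from the post.

```python
SECTOR_BYTES = 512
PIPE_LIMIT_SECTORS = 255      # limit seen earlier when piping from an (md) device
OBSERVED_MAX_SECTORS = 51     # roughly 0x6600 bytes of input before the crc32 stops changing
LINES_PER_SECTOR = SECTOR_BYTES // 16   # one hex line per 16 bytes of input

pipe_limit_bytes = PIPE_LIMIT_SECTORS * SECTOR_BYTES
chars_per_line = pipe_limit_bytes / (OBSERVED_MAX_SECTORS * LINES_PER_SECTOR)

if __name__ == "__main__":
    print(pipe_limit_bytes)                        # bytes that fit in the pipe
    print(OBSERVED_MAX_SECTORS * SECTOR_BYTES == 0x6600)
    print(chars_per_line)                          # characters of dump text per hex line
```

If this reading is right, the limit would sit on the size of the piped text rather than on the size of the data being dumped.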



#22 Wonko the Sane

Posted 14 October 2021 - 09:00 AM

Good catch on the (hd0)0+1 meaning "whole device".

 

This can be solved by forcing a rd-size command, i.e.:

map --rd-size=0x200 (hd0)0+1 (rd)

maps only the first sector.

 

Using map and not dd is not about speed, it is about using (maybe) *something different*, what the batch does is to compare a (map to (rd)) copy of the source against a (map to (rd)) copy of the destination, since the destination has been modified by a dd command, if you compare a (dd) copy of source against a (dd) copy of the destination, possible (hypothetical) bugs in dd may go unnoticed.

 

If the limit of around 50 sectors is confirmed, I think we can go back to the "progressive meta-crc32 of crc32's"; anyway 50*512=25600 bytes/chunk is much better than the previous idea of 16 bytes/chunk. :)
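To give an idea of the difference between the two chunk sizes, a rough Python count of how many chunks (and so how many crc32 steps) a file would need. The helper name `chunk_count` and the 1 GiB size are just an example, picked because 1GB test files were used earlier in the thread.

```python
SECTOR_BYTES = 512
CHUNK_50_SECTORS = 50 * SECTOR_BYTES   # 25600 bytes per chunk
CHUNK_ONE_LINE = 16                    # the earlier 16 bytes/chunk idea
ONE_GIB = 1024 ** 3                    # example file size


def chunk_count(total_bytes, chunk_bytes):
    """Number of chunks needed to cover total_bytes (the last may be partial)."""
    return -(-total_bytes // chunk_bytes)   # integer ceiling division


if __name__ == "__main__":
    print(chunk_count(ONE_GIB, CHUNK_50_SECTORS))
    print(chunk_count(ONE_GIB, CHUNK_ONE_LINE))
```

The ratio between the two counts is 1600, which is the number of 16-byte lines in a 50-sector chunk.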

 

Hopefully (sooner or later) a new grub4dos version may introduce a crc32 allowing a range in bytes, for both blocklists and files, something like:

crc32 <blocklist or file>,<start>:<length>

 

Attached version 0.03 that takes care (hopefully) of the "whole device" issue and - experimentally - uses 50 sectors chunks for the meta-crc (which actually is a meta-meta-crc now).

 

EDIT:removed attachment, v.004 a couple posts below

 

:duff:

Wonko



#23 deomsh

Posted 15 October 2021 - 09:26 PM

I tested VERIFYDD.G4B_v003 (I will skip v002).

Colors look very nice. :)

Flags 222 and 10222 look good (first two print-screens):

Verifydd_003 Flag 222 Good .jpeg Verifydd_003 Flag 10220 Good .jpeg

But there are problems with flags 20 and 220 (following two print-screens):

Verifydd_003 Flag 20 Bad.jpeg Verifydd_003 Flag 220 Bad.jpeg

These pictures show that VERIFYDD.G4B displays a different hex line, not the ones inside the files/blocklist.

I dug a bit deeper and found that after a reboot rd_base was actually 0x0! This corresponds with the first bytes of (md)0+1. If a full file, or a blocklist accepted by map, had been mapped before, rd_base was okay, but NOTHING NEW was mapped (see next print-screens).

Verifydd_003 map --rd-size BAD I.jpeg Verifydd_003 map --rd-size BAD II.png

It seems that using map --rd-size=SIZE blocked commands afterwards. This behavior is consistent with information in help map.

In case of files and 'accepted blocklists' --rd-size=SIZE is not really needed, but cases I mentioned in previous post are not 'mappable' (except as full device).

But I think I found a work-around, while playing with Limbo x86 during traffic.

I found out that map happily accepts a blocklist with a later block before an earlier block. So problem cases like (hd0)0+1, (hd0,0)0+1 or (hd0)63+1 are accepted if another block (for instance 1+1,) is given before them. Example in following print-screen.

map blocklist (hd0)1+1,0+1 (rd).png

Also I checked call Fn.42's attributes. The first two values influence the displayed memory starting value only, so I think they can stay at value 0x0 during CRC32-ing. With cat --hex this will not be possible in general.

To illustrate possible approaches I made partial CRC32's from:

1) Copying boot-code from FAT12/16 PBR to a file (only first 512 bytes have to be skipped extra)

Map (hd0,0)1+1,0+1 and test.bin to (rd) and CRC32 with skip=62 bytes=446 .png

2) Two 100-sector files, before and after copying with skip=6 and seek=256.

Map file and CRC32 in parts I.png Map file and CRC32 in parts II.png

BTW: Print-screens are almost the same, just not enough lines to see the whole thing.

BTW2: unsure why sometimes map asked for --heads and --sectors-per-track, in general not needed.



#24 Wonko the Sane

Posted 16 October 2021 - 08:26 AM

Very good info/tests, thanks :).

 

From the bottom up:

1) map tries to use the geometry of the device image (which is what should be mapped to (rd)), the default drive type for (rd) is a floppy, hence the 2/18 default, since that is only a warning and doesn't affect operations (and it is not shown with debug msg=0 and nul redirection) it should be irrelevant

2) that's the whole point of using Fn.42, being able to set the address shown to 0, no matter what the actual address is.

3) I'll add a check/provision for the case where the rd_base=0x0, I tested after a reboot and

map (md)0+2 (rd)

creates the "right" address for the (rd), this should only be needed if a single sector is dded/verified, but, see point 4 below:

4) what if - instead - we change in the map command, only the ending +1 into a +2?

 Needs to be tested, but it seems to work.  :unsure:

The affected subset is only (say) 0+1, 63+1, etc. but if we extend the correction to all cases where a single sector is involved (i.e. the device ends with +1) it shouldn't make any difference.

 

:duff:

Wonko

Attached Files



#25 deomsh

Posted 16 October 2021 - 01:11 PM

I tested v004.

About heads etc.: it seems to depend on filesize. For instance a 512 byte file is only mapped correctly with --heads=0 --sectors-per-track=0. Otherwise your script exits silently.

About +2: I tested already with a partition: not good, but for instance (hd0)+2 is good (but (hd0)63+2 is bad). I think because if map identifies a partition-start it tries to mount the partition first?

About Fn.42 0x0 0x0 etc. I thought you used %skip%, but now I see it was only in case of writing the first line to screen! So cat --hex can be used only in 16 bytes-chunks with bytes-only.



