Hello,
I'm looking into ways of creating a fingerprint of an executable file (e.g. for Windows, but it could also be Linux/Mac) based on its internal functions.
What do you think would be a good strategy to make this possible?
I have some ideas:
1) extract function names (not sure if available)
2) get the code for each function and generate a similarity hash of its content
3) other methods?
Right now option 2) seems to be the best fit for this task (assuming a plain executable that is not packed with UPX or anything of the sort). After extracting the assembler code, it should be relatively straightforward to compare different functions and see whether they are similar, as is already done in other binary-comparison tools.
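To make option 2) more concrete, here is a rough sketch of the comparison step as I imagine it. It assumes the disassembly of each function is already available as a list of text lines (which is exactly the part I don't have a library for yet). The class name FunctionSimilarity, the method names, and the sample instructions are all made up for illustration. It is also not a real similarity hash: as a stand-in it keeps only the instruction mnemonics, builds n-grams over them, and computes the Jaccard similarity of the two n-gram sets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class FunctionSimilarity {

    // Keep only the mnemonic (first token) of each instruction, so that
    // registers, addresses, and immediates do not affect the comparison.
    static List<String> mnemonics(List<String> instructions) {
        List<String> out = new ArrayList<>();
        for (String line : instructions) {
            String trimmed = line.trim().toLowerCase(Locale.ROOT);
            if (trimmed.isEmpty()) {
                continue;
            }
            out.add(trimmed.split("\\s+")[0]);
        }
        return out;
    }

    // Build the set of n-grams (shingles) over the mnemonic sequence.
    static Set<String> shingles(List<String> tokens, int n) {
        Set<String> out = new HashSet<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return out;
    }

    // Jaccard similarity: |A and B| / |A or B|, in the range 0.0 to 1.0.
    // Two empty sets are treated as identical (assumption for this sketch).
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    static double similarity(List<String> f1, List<String> f2, int n) {
        return jaccard(shingles(mnemonics(f1), n), shingles(mnemonics(f2), n));
    }

    public static void main(String[] args) {
        // Hypothetical disassembly snippets, for illustration only.
        List<String> f1 = Arrays.asList(
                "push ebp", "mov ebp, esp", "sub esp, 8",
                "call 0x401000", "leave", "ret");
        List<String> f2 = Arrays.asList(
                "push ebp", "mov ebp, esp", "sub esp, 16",
                "call 0x402000", "leave", "ret");
        List<String> f3 = Arrays.asList("xor eax, eax", "inc eax", "ret");

        System.out.println(similarity(f1, f2, 2)); // prints 1.0
        System.out.println(similarity(f1, f3, 2)); // prints 0.0
    }
}
```

The first pair differs only in operands, so it scores 1.0 once the operands are dropped; the second pair shares no mnemonic 2-gram and scores 0.0. Functions shorter than n instructions produce no n-grams at all, so very small functions would probably need separate handling.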
Do you know of any library (with a non-copyleft license, and ideally running on Java) that could extract the assembler snippet for each function, or of a better approach to this goal?
My thanks and happy 2017 ahead!