
Remedium - testing the platform


25 replies to this topic

#1 Brito

Brito

    Platinum Member

  • .script developer
  • 10616 posts
  • Location:boot.wim
  • Interests:I'm just a quiet simple person with a very quiet simple life living one day at a time..
  •  
    European Union

Posted 22 June 2011 - 02:36 PM

Hello,

I am working on a security platform called "remedium" and would like to share it with the community at reboot. This platform hosts different applications that expose and prevent malicious activities.

If you have some time to help with feedback, I would be deeply grateful.. :dubbio:

This is the initial beta; at this moment you can only see two demonstrations of the sentinel application in place:
- Index all files inside your computer
- Immunize USB flash drives when inserted in the computer

Below is a screenshot of remedium in action.


Remedium works across Windows, Linux (tested on Ubuntu) and MacOSX. I am including the .exe file that can be run directly from Explorer. For other operating systems you should launch the executable from the command line using "java -jar remedium.exe".

Launching from the command line gives you access to the log messages, so please use the command line when testing remedium.


In this test you should be able to complete the indexing process. If any problem shows up in the log, please do let me know in this topic.

You can download the binary from http://remedium.googlecode.com

-----------

Indexing creates a database of the files that are found on your machine. In the future, it is intended that this information can be merged with the information from other workstations on a given network. The idea is to assign a score marking files as trusted or not.

After enough information is gathered, we can run metrics on the collected information. For example, if a kernel32.dll file is modified by a malicious process, we should be able to detect that no similar file from Microsoft existed before and treat this as a suspicious event (more details on this algorithm will be explained later).
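As a rough illustration of that check, a minimal sketch could keep a set of known-good hashes and flag anything unseen. Class and method names here are hypothetical, not remedium's actual code:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only: remedium's real indexer and scoring algorithm
// are more involved, and the class and method names here are hypothetical.
public class TrustIndex {
    private final Set<String> knownHashes = new HashSet<>();

    // Record the hash of a file that is considered trusted.
    public void addTrusted(String sha1Hex) {
        knownHashes.add(sha1Hex.toLowerCase());
    }

    // A file whose hash was never indexed before (e.g. a kernel32.dll
    // replaced by a malicious process) is flagged as suspicious.
    public boolean isSuspicious(String sha1Hex) {
        return !knownHashes.contains(sha1Hex.toLowerCase());
    }

    // Hex-encoded SHA-1 of a byte array.
    public static String sha1(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e); // SHA-1 is always available
        }
    }
}
```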

Thank you for helping!

:)

Attached Files



#2 Icecube

Icecube

    Gold Member

  • Team Reboot
  • 1063 posts
  •  
    Belgium

Posted 22 June 2011 - 05:12 PM

Shouldn't the report display hashes instead of ashes?

MD5 ashes
CRC32 ashes
SHA1 ashes
SHA2 ashes

It uses a lot of memory (around 175MB) when it has only been started (no indexing).

It doesn't create /autorun.inf/con and /autorun.inf/Nul.protected directories (only /autorun.inf), at least on Linux.

#3 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 16066 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 22 June 2011 - 05:23 PM

Shouldn't the report display hashes instead of ashes?

Well :dubbio:, in portuglish there are quite a bit of "mute h"'s :)
http://en.wikipedia.org/wiki/H

I volunteer to re-add the missing h's by hand IF Nuno manages to bring down the mamory memory use to anything "human" (with the "h").
I'll add an h (when needed) every 5 Mb of memory requirement reduced....

:cheers:
Wonko

#4 Holmes.Sherlock

Holmes.Sherlock

    Gold Member

  • Team Reboot
  • 1444 posts
  • Location:Santa Barbara, California
  •  
    United States

Posted 22 June 2011 - 06:33 PM

............... down the mamory use to anything "human" ............


Posted Image

#5 Brito


Posted 22 June 2011 - 06:37 PM

It uses a lot of memory (around 175MB), when it is only started (no indexing).

Thank you. I had not tested for memory usage yet, a lot of optimization can certainly take place.


@wonko, thank you for volunteering. Looking at your link I see that you are right about the lack of H being fine for portuglish speakers:

In Spanish and Portuguese, ‹h› is a silent letter with no pronunciation

(hadn't noticed that detail..)

It doesn't create /autorun.inf/con and /autorun.inf/Nul.protected directories (only /autorun.inf), at least on Linux.

Ok, will add them on the next version.


Have you completed the indexing of files ok?

Any other issues noted?

#6 Icecube


Posted 22 June 2011 - 07:39 PM

Have you completed the indexing of files ok?

No. Only 1000 files are indexed.
Remedium uses too much memory and my Linux box is swapping to disk, so I stopped it.

I also don't know how useful it is to index Linux files.

Any other issues noted?

Not sure if it is an issue:

file names 1502
win32 files 0
directories 183
MD5 ashes 1024
CRC32 ashes 1022
SHA1 ashes 1024
SHA2 ashes 1025
Received files 60000
Processed files 2565
On queue to process 57358

The "processed files" and the "file names" have a different number.

It would be nice if you could see the list of indexed files (checksum, version numbers, ...)
Also, where is the database with all the checksums stored? I can only find database.log, database.properties and database.script files (some of them are exactly the same in the different folders).

#7 Wonko the Sane


Posted 22 June 2011 - 07:50 PM

@Holmes.Sherlock
That's malagasish :ranting2: :w00t: :thumbup:
http://www.websters-...Malagasy/mamory

Ok, you got me: typo! :cheers:

:cheers:
Wonko

#8 Brito


Posted 22 June 2011 - 08:40 PM

No. Only 1000 files are indexed.
Remedium uses too much memory and my Linux box is swapping to disk, so I stopped it.

It indexes files in lots of 500, so you only allowed it to process two cycles.

What are the specs of your machine?


I also don't know how useful it is, to index linux files.

Albeit not common, the attack vector used on Windows remains valid for other operating systems. For the moment I would also like to test the storage limits of remedium, so the more indexed files, the better.

Theoretically, the database can support up to 16Gb worth of data records.


The "processed files" and the "file names" have a different number.

Processed means that the file was "processed" by remedium; the result can then be:
- Store the file name (if it is not a duplicate)
- Ignore the file if it is bigger than a size threshold (10Mb for now)
- Add checksum calculations to each container (if not a duplicate)

These are the reasons why those numbers are different.
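The decision above could be sketched like this. It is a hypothetical simplification (the real indexer's duplicate check runs against the database), but it shows why "processed files" outruns "file names":

```java
// Hypothetical sketch of the decision described in this post; the real
// remedium indexer differs in detail. Threshold and outcomes follow the post.
public class IndexDecision {
    static final long SIZE_THRESHOLD = 10L * 1024 * 1024; // 10 MB for now

    public enum Outcome { IGNORED_TOO_BIG, DUPLICATE, INDEXED }

    // alreadyIndexed stands in for the duplicate check against the database.
    public static Outcome process(long fileSize, boolean alreadyIndexed) {
        if (fileSize > SIZE_THRESHOLD) {
            return Outcome.IGNORED_TOO_BIG; // skipped, but still counted as "processed"
        }
        if (alreadyIndexed) {
            return Outcome.DUPLICATE;       // name and hashes are not stored again
        }
        return Outcome.INDEXED;             // store the name + checksums
    }
}
```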


It would be nice, if you could see the list of indexed files (checksum, version numbers, ...)

Yes, I intend to add that feature. The goal is to add a search box to find information about a given file name or hash (when it was first indexed, level of trust, comments and so on). For now we are just testing the local machine indexing part.


Also where is the database with all checksums stored? I only can find database.log, database.properties and database.script files (some of them are exactly the same in the different folders).

Each folder represents a different component active on the platform. Databases are isolated from each other; this is better described at http://nunobrito1981...-isolation.html

:cheers:

#9 Icecube


Posted 23 June 2011 - 02:34 PM

It seems to stop after processing 1501/1502 file names (2 tests).

What are the specs of your machine?

  • 976MB RAM
  • AMD Athlon™ 64 X2 Dual Core Processor 5000+


#10 Brito


Posted 23 June 2011 - 03:17 PM

Ok, thanks for sharing.

If you look at the log there is usually an explanation of why it has stopped.

#11 Icecube


Posted 23 June 2011 - 03:52 PM

Ok, thanks for sharing.

If you look at the log there is usually an explanation of why it has stopped.

Serving of the image file still continues:
[rem-10101][file][info] Delivering sentinel_stats.png to localhost.localdomain
There are some errors, but the program continues, so that should be the problem (a whole lot more files can't be read):
[rem-10101][sentinel/indexer][error] addToContainers operation failed: Can't read '/proc/irq/0/smp_affinity'

[rem-10101][sentinel/indexer][error] addToContainers operation failed: Can't read '/proc/irq/default_smp_affinity'


#12 Brito


Posted 24 June 2011 - 12:20 PM

[rem-10101][file][info] Delivering sentinel_stats.png to localhost.localdomain
Yes, this is normal. You have the browser window open at the sentinel page so it will fetch an updated stats image while the scan is progressing.

There are some errors, but the program continues, so that should be the problem (a whole lot more files can't be read):

Ok, there are indeed files that can't be read. I did some corrections on the indexer and will post a new version soon. I haven't tested on Linux yet, but on the previous test using Ubuntu it was indexing the whole file system and ignoring the files that couldn't be read.



----

Found the culprit for the large memory usage: HSQL.

Typically, a java program will only use one instance of HSQL, consuming around ~50Mb of RAM. Remedium employs one HSQL instance per component, and there are over 10 components active in the platform right now.. :cheers:

Also, every request to the count() method on each container is consuming RAM (some memory leak might be present). This seems to stall at around 500Mb~700Mb of used RAM overall. It is not a problem on my laptop with 4Gb of RAM, but it is certainly an issue on machines with less RAM available.


At the moment, only the message queue and indexer use the database resources. On my tests from 2010, the indexing process when using SQLite was consuming 10Mb of RAM in overall. The downside is that SQLite can't handle parallel connections nor scale to manage gigabytes of information per database. I'm sure we can drop the RAM usage significantly.
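For anyone wanting to reproduce the RAM measurements, a trivial standard-library helper (not part of remedium) can report the JVM's used heap while the indexer runs:

```java
// Reports the heap currently in use by this JVM, in megabytes.
// Standard library only; this is a measurement aid, not remedium code.
public class HeapWatch {
    public static long usedMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }
}
```

Calling this periodically from a background thread gives a rough memory profile without an external profiler.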

Right now I'm a bit more worried about the core functionality, RAM usage decrease will follow.

:cheers:

#13 Icecube


Posted 24 June 2011 - 12:35 PM

Oops, I made a typo. I meant:

There are some errors, but the program continues, so that shouldn't be the problem (a whole lot more files can't be read):

Good that you found some problems.

#14 Wonko the Sane


Posted 24 June 2011 - 12:39 PM

Right now I'm a bit more worried about the core functionality, RAM usage decrease will follow.

I op this will appen soon, I would dare say it's ard to use SQL :cheers: on machines with less RAM (and no aitches).

As a sign of good will, here are a few, missing in the above:
hhhHh1

:cheers:
Wonko

[1] You are only allowed to use the scarlet letter :cheers: if you are Irish Catholic or Australian (or BOTH).
If you are British and born after 1982, you may use it, but you might be asked for a special permit issued by Her Majesty the Queen

#15 Brito


Posted 24 June 2011 - 03:21 PM

The project is now hosted in google code at http://code.google.com/p/remedium/

Source code not included yet, still need to work on the release license.

#16 Brito


Posted 24 June 2011 - 04:46 PM

It doesn't create /autorun.inf/con and /autorun.inf/Nul.protected directories (only /autorun.inf), at least on Linux.

The new version should add the mentioned folders. However, I am not able to test this under Windows. Is there any other way of creating these folders in Windows?

At least I hope that in Linux this shouldn't be an issue.
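For reference, the immunization trick being discussed can be sketched as follows. This is an illustration assuming the drive root is already known, not remedium's actual code:

```java
import java.io.File;

// Sketch of the immunization step, assuming the drive root is given.
// "con" and "Nul.protected" rely on Windows reserved device names, which
// makes the autorun.inf folder hard for malware to delete and replace
// with its own autorun.inf file.
public class Immunize {
    public static boolean immunize(File driveRoot) {
        File autorun = new File(driveRoot, "autorun.inf");
        // On Linux/Mac these names are ordinary, so they can be created here;
        // Windows itself refuses to create "con" through the normal API,
        // which is exactly why the folder is awkward to remove afterwards.
        File con = new File(autorun, "con");
        File nul = new File(autorun, "Nul.protected");
        return (autorun.isDirectory() || autorun.mkdirs())
                && (con.isDirectory() || con.mkdirs())
                && (nul.isDirectory() || nul.mkdirs());
    }
}
```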

#17 Brito


Posted 25 June 2011 - 02:03 PM

Version 1.0.0.1 available at google code: http://code.google.c...m_110625-JUN.7z

Log of changes:

- Added "con" and "Nul.protected" folders to autorun.inf folder when immunizing a pendisk under a Linux/Mac environment
- Corrected typo "ashes" to "hashes"
- Corrected defect in the sentinel indexer that didn't allow viewing the progress of indexing from a remote machine
- Modified the raw Win32 file reader to output the file name in case of error
- Corrected raw Win32 defect when reading the executable "DivX Plus Player.exe" at the image_nt_header class. The number of resources is not correctly handled; added a hack to circumvent the issue, but it is not solved
- The Internet browser window is only opened after the sentinel indexer has started. When the indexer has a considerable amount of data stored, the initial startup is slow and caused the browser to display an offline page
- Added the getRootFolder() to the utils.files class. This method helps to get "c:\" under Windows or "/" under Unix
- Renamed several package names (apps -> apps_core, remedium.system -> system, ...) to provide a more organized structure
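A minimal guess at what getRootFolder() might look like (the actual utils.files implementation in remedium may differ):

```java
// Hypothetical reconstruction of the getRootFolder() helper described in
// the changelog: "c:\" under Windows, "/" under Unix flavours.
public class Roots {
    public static String getRootFolder() {
        String os = System.getProperty("os.name").toLowerCase();
        if (os.contains("windows")) {
            return "c:\\";
        }
        return "/"; // Linux, Mac and other Unix flavours
    }
}
```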


This addresses some of the reported defects (and typos) noted on the previous version. Please test and let me know if it will index all files on your machines without getting stuck.

:)

#18 Icecube


Posted 27 June 2011 - 09:57 AM

Still no improvement:

Data collected
file names 1501
win32 files 0
directories 179
MD5 hashes 856
CRC32 hashes 854
SHA1 hashes 856
SHA2 hashes 856

Operations
Received files 60000
Processed files 2562
On queue to process 57361
Average files per minute 0

I started from a clean directory.

OS: Fedora 15 64-bit

#19 Brito


Posted 27 June 2011 - 01:34 PM

Thank you for testing.

Would you please provide the log messages that are visible on the command line?

This might help to get a better idea of what is getting the index process stuck.

I'm now working on memory usage to keep remedium running at acceptable RAM levels. On my machine it reached 1Gb when indexing 100 000 files; the current HSQL approach will not scale to even a single million records in its current state, let alone aggregate millions of records one day in the future.

Will see if a prototype with a NoSQL approach can be made available this week.

:cheers:

#20 Icecube


Posted 27 June 2011 - 02:08 PM

The whole log is attached.

Attached Files



#21 Brito


Posted 27 June 2011 - 05:30 PM

Thank you. The log does not indicate clearly what happened.

I guess that a more detailed log reporting tool needs to be added to the system.. :dubbio:

Once again, thank you for testing.

:yahoo:

#22 Brito


Posted 01 July 2011 - 11:52 AM

The source code for remedium has been made available for those interested in following the implementation of this platform.

At the moment, I am working on a replacement for the Container class with a new class entitled ContainerFlatFile that will use normal files instead of HSQL.

Initial progress and documentation are being written here: http://code.google.c...ntainerFlatFile

If you have any comments, please post them here.

Thank you.

:)

#23 Brito


Posted 02 July 2011 - 12:31 PM

Recent commit of changes:

  • Added LogMessage class; it allows classes that need a component's log
    functions to become independent from components. Log messages are now
    objects that can be handled without needing a component.
  • Added ContainerFlatFile, which replaces the HSQL container that is
    default on remedium, to reduce RAM usage when indexing a large amount
    of data
  • Created the specification document for the ContainerFlatFile
    (http://code.google.c...ntainerFlatFile)
  • Added LogRecord class and respective test case. Allows storing several
    log record objects inside a log message class.
  • Modified SaveStringToFile(File inputFile, String inputString).
    inputFile is now a File object instead of a String to prevent
    ambiguities: developers could accidentally swap the order of the
    parameters
  • Added INIFile class along with the respective test case and initial
    documentation
  • Added CodingRules.wiki to our Wiki. This page provides coding tips for
    developers
  • Added new wiki pages (CodingRules.wiki, INIfile.wiki)
  • Modified ContainerFlatFile.wiki
  • Skeleton added for LogMessage.wiki

These changes are not worthy of a new version release since they don't modify the current functionality.

The current intention is to replace the Container class based on HSQL with a new Container class based on Flat files. To achieve this goal it is also necessary to create a class to handle configuration files in INI format.

Creating the INI class is not an obstacle; it just takes some time to implement. One could adopt a ready-made INI class, but there is also interest in creating a customized version that adds features non-standard in INI, like adding/removing lines, along with improvements to deal with large INI-style files as seen in .script files. Functionality is being added "on demand" to this class.
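A miniature sketch of such a line-oriented INI class (hypothetical names; the real INIFile class in remedium is more complete) could look like:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the customized INI class: beyond get, it keeps
// the raw lines so entries can be added or removed in place, which is the
// non-standard feature mentioned in the post.
public class MiniIni {
    private final List<String> lines = new ArrayList<>();

    // Append a raw line, e.g. "timeout=30" or a comment.
    public void addLine(String line) {
        lines.add(line);
    }

    // Remove every "key=value" line with the given key; returns how many.
    public int removeKey(String key) {
        int before = lines.size();
        lines.removeIf(l -> l.trim().startsWith(key + "="));
        return before - lines.size();
    }

    // First value found for the given key, or null when absent.
    public String get(String key) {
        for (String l : lines) {
            String t = l.trim();
            if (t.startsWith(key + "=")) {
                return t.substring(key.length() + 1);
            }
        }
        return null;
    }
}
```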

Also created a LogMessage class, a really handy class to pass into these new classes without the need to start a whole remedium structure just to use the logging facilities. We provide the log object and it collects the log messages. It is minimally functional and should be improved over the coming years.

Documentation is being added on the Wiki. I like the way I can edit the pages offline on my laptop and then upload them using SVN when ready. This is really a neat feature of google code. If anyone wishes to help with the coding or development effort, drop me a message.

:whistling:

#24 Brito


Posted 09 July 2011 - 11:45 PM

An update on the current progress.

I'm still working to replace the HSQL container with a flat file version, these are the current performance results:
Write operation took 21 seconds to write 10 000 records

Write operation took 27 minutes and 21 seconds to write 100 000 records


Memory usage goes from 18Mb to short peaks of 300Mb. Java is not shy about using memory, so I'm working to slim this value down even more. In the good old days, a program providing such a feature would have no reason to use more than 2Mb of RAM.. :thumbsup:

It takes 27 minutes to index 100 000 records. The time it takes to write a record will increase as more records exist in the database, but memory consumption remains constant, as intended for this approach. My personal goal is to index the same 100 000 records in under 10 minutes.
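A quick back-of-envelope check of those figures (plain arithmetic, nothing remedium-specific):

```java
// Back-of-envelope check of the figures above: throughput drops roughly
// eightfold between the two runs (about 476 records/s over the first
// 10 000 records versus roughly 60 records/s over the full 100 000),
// consistent with the per-record cost growing as the flat file grows.
public class Throughput {
    public static long recordsPerSecond(long records, long seconds) {
        return records / seconds;
    }
}
```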

#25 Brito


Posted 10 July 2011 - 06:57 AM

An update on the current progress.

I'm still working to replace the HSQL container with a flat file version, these are the current performance results:
Write operation took 21 seconds to write 10 000 records

Write operation took 27 minutes and 21 seconds to write 100 000 records


Memory usage goes from 18Mb to short peaks of 300Mb. Java is not shy about using memory, so I'm working to slim this value down even more. In the good old days, a program providing such a feature would have no reason to use more than 2Mb of RAM.. :smiling9:

It takes 27 minutes to index 100 000 records. The time it takes to write a record will increase as more records exist in the database. The good news is that memory consumption does not increase regardless of scale.

My personal goal is to somehow index the same 100 000 records in under 10 minutes.. :thumbsup:



