
S.M.A.R.T HD


14 replies to this topic

#1 erwan.l

erwan.l

    Platinum Member

  • Developer
  • 3041 posts
  • Location:Nantes - France
  •  
    France

Posted 17 December 2017 - 04:21 PM


File Name: S.M.A.R.T HD
File Submitter: erwan.l
File Submitted: 17 Dec 2017
File Updated: 17 Dec 2017
File Category: Tools

Yet another S.M.A.R.T tool.
Greatly inspired by Smartmontools.

SmartHD supports PATA, SATA and USB drives (eSATA not tested).
At this stage it does not support CSMI (RAID) or NVMe (PCIe) drives.

SmartHD provides the following S.M.A.R.T details: identify, attributes & thresholds, self-test logs.
It also allows you to perform a self-test: short or extended.
It also gives you the condition of the drive: OK, Warning or Critical.
It is native (i.e. no dependency, runtime, etc.) and is portable.

SmartHD can be minimized to the tray icon where the temperature and condition will be displayed.

SmartHD will refresh the details for the selected drive every 10 minutes.

SmartHD can also export data to an HTML file (right click in the main window).

Feedback, bug reports, feature requests welcome.

/Erwan


#2 erwan.l

erwan.l

    Platinum Member

  • Developer
  • 3041 posts
  • Location:Nantes - France
  •  
    France

Posted 17 December 2017 - 04:25 PM

[Three screenshots: Do2C5ts.png, hE7byHp.png, EUuCMpQ.png]



#3 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 16066 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 17 December 2017 - 08:09 PM

It will also give you the condition of the drive : OK, Warning or Critical.

Oh, noes :frusty: ,  tu quoque ...  :w00t: :ph34r:
 
Only to give some context:
http://reboot.pro/to...-hdds/?p=205079
 
Maybe a "BBR[1] tab" (or OSM[2] tab) with only 5, 187, 188, 197, 198 AND a way to "sum" them could represent - even if not in any way an actual reliable prediction - at least an element of novelty when comparing this program to all the other similar tools, which provide a mess of (mostly) meaningless values and no actual way to interpret them to get even a vague idea of what is happening (let alone what will happen in the future).

 

:duff:

Wonko
 
 
 
[1]BackBlazeRecommended
[2]Only Somewhat Meaningful
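
A minimal sketch of what such a "BBR tab" could compute, assuming the tool already has the raw value of each attribute keyed by its ID. The function name `bbr_summary` and the sample values are hypothetical, and nothing here is taken from SmartHD itself. Summing raw values is crude (some vendors pack several counters into one raw field), so treat the total as a hint rather than a prediction.

```python
# The five attribute IDs named in the post, with their usual descriptions.
BBR_ATTRIBUTES = {
    5: "Reallocated Sectors Count",
    187: "Reported Uncorrectable Errors",
    188: "Command Timeout",
    197: "Current Pending Sector Count",
    198: "Uncorrectable Sector Count",
}


def bbr_summary(raw_values):
    """Return (sum of the five raw values, list of watched IDs that are non-zero)."""
    # Attributes the drive does not report are counted as zero.
    watched = {attr_id: raw_values.get(attr_id, 0) for attr_id in BBR_ATTRIBUTES}
    nonzero = [attr_id for attr_id, raw in watched.items() if raw > 0]
    return sum(watched.values()), nonzero


# Hypothetical raw values for illustration only (not real drive data).
sample = {5: 2, 9: 12345, 187: 0, 188: 0, 194: 35, 197: 1, 198: 0}
total, nonzero = bbr_summary(sample)
print(f"BBR sum = {total}, non-zero attributes = {nonzero}")
```

Attributes outside the five (here 9 and 194) are simply ignored, which is the whole point of the suggestion: show one number and the few IDs that contributed to it.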



#4 erwan.l

erwan.l

    Platinum Member

  • Developer
  • 3041 posts
  • Location:Nantes - France
  •  
    France

Posted 17 December 2017 - 10:22 PM

Ouch, I have never been compared to Brutus before :)
SmartHD is using the BBR, thus making it a tab or in some way more visible is a good idea.

Actually this was my objective: bring the tool here and see if I can bring something new thanks to our community feedback.

#5 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 16066 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 18 December 2017 - 11:29 AM

Ouch, I have never been compared to Brutus before :)
SmartHD is using the BBR, thus making it a tab or in some way more visible is a good idea.

Actually this was my objective: bring the tool here and see if I can bring something new thanks to our community feedback.

Yep :), one of the main issues with S.M.A.R.T. technology is that it provides far too many "data points" (in a form that is either meaningless or extremely difficult to interpret) and there is no software capable of - once the totally irrelevant ones are excluded - computing the relevant ones and providing a (logical, educated) guess at what the future of the hard disk will be (which is the ONLY thing the end-user actually wants[1] or needs[2]).

 

So the actual "new feature" of your nice program :) could be - while still showing all the data available, like the n similar programs - to select only the relevant ones and interpret them "correctly", in a way that gives the end-user what he/she is actually looking for.

 

:duff:

Wonko

 

[1] here it is vital to understand WHY someone would WANT to know the future of the hard disk, as - in a perfect world - everyone should be ready for disaster recovery (that will happen anyway and, by definition, unexpectedly) 

 

[2] and this explains the NEED, since no one actually makes valid backups (and keeps them up to date) unless he/she is scared to death :ph34r: by some arbitrary prediction software saying that failure of the device is imminent



#6 erwan.l

erwan.l

    Platinum Member

  • Developer
  • 3041 posts
  • Location:Nantes - France
  •  
    France

Posted 18 December 2017 - 01:25 PM

The issue with S.M.A.R.T is that in about 40% of the cases (a rough number from studies such as the Google one), the disk will just fail without S.M.A.R.T indicating anything wrong before the failure.

About this 40%: with the latest disks (being less mechanical), I would not be surprised if that number actually decreases, thus making S.M.A.R.T an even more reliable solution.

 

Now, if you take it the other way around, when S.M.A.R.T reports something wrong (and in particular one or more of the BBR attributes i.e 5, 187, 188, 197, 198), there is a high probability that your disk will fail shortly.

At this stage, I would personally recommend cloning your disk and replacing your most-probably-faulty disk (at best) or at least reviewing your backup solution (at worst).

In my quality world, prevention always costs less than fixing the issue.

 

Even if it is not a 100% reliable solution, S.M.A.R.T is still a good solution (and the only one we have?) to minimize the risk of data loss (and time loss).

 

To increase your chances of catching the issue before it happens, the tool should run resident and perform offline (short and extended) tests regularly.

 

Much better if you ask me than the "flipism" method ;)

 

Last but not least, although there are N similar programs out there, not all of them meet all (my) conditions: freeware, GUI, stay resident, perform offline self-tests, notification...



#7 alacran

alacran

    Platinum Member

  • .script developer
  • 2710 posts
  •  
    Mexico

Posted 18 December 2017 - 02:34 PM

I agree with you, these are very good conditions:

Last but not least, although there are N similar programs out there, not all of them meet all (my) conditions: freeware, GUI, stay resident, perform offline self-tests, notification...

I recently had a 1 TB Seagate HDD fail. I frequently check the S.M.A.R.T attributes, watching very closely Current Pending Sector Count & Reallocated Sector Count, which are related, and also the temperature, using aida64, but this time it failed suddenly. The worst thing is that I was in the middle of cleaning up my 5 USB HDDs used for backups, and for that reason had not made a new backup for about 2 months. It just cost me time (4 days), as I was able to recover all the new info, very slowly. I think some part of the HDD board failed, so I started looking for a logical cause and didn't find any, but later it came to my mind that +12 V and +5 V are also very important for an HD and may vary under heavy load. I was running wimlib-imagex to convert an *.esd to *.wim when the HDD failed.

Are these voltages involved in some of the tests, or are the tests only about reading and writing to disk?

 

alacran



#8 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 16066 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 18 December 2017 - 02:43 PM

Much better if you ask me than the "flipism" method ;)

This is where your math fails. :w00t: :ph34r: , I would say that Flippism has a consolidated, demonstrated, good-everywhere and for anything accuracy of 50% (and thus a failure rate of the other 50%), whilst the accuracy of the interpretation of SMART data has been - since day one - over or mis-estimated.
 
If you take as reference the same (nowadays old/outdated) renowned Google study:
https://research.goo...s/pub32774.html

After our initial attempts to derive such models yielded relatively unimpressive results, we turned to the question of what might be the upper bound of the accuracy of any model based solely on SMART parameters. Our results are surprising, if not somewhat disappointing. Out of all failed drives, over 56% of them have no count in any of the four strong SMART signals, namely scan errors, reallocation count, offline reallocation, and probational count. In other words, models based only on those signals can never predict more than half of the failed drives.

And even if you give no "weight" to each single SMART parameter:

even when we add all remaining SMART parameters (except temperature) we still find that over 36% of all failed drives had zero counts on all variables.

so a "logical" evaluation of the SMART parameters has an accuracy of at most 44% (less than 50%), and an "extremely strict, weightless" evaluation is still at most 64%.
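
The arithmetic behind those two ceilings, as a small sketch: a failed drive with zero counts on every watched parameter cannot be flagged, so the share of such drives caps the detection rate. The function name is hypothetical; the 56% and 36% inputs are the figures quoted from the study above, and since the study says "over", the results are upper bounds.

```python
def best_case_detection(pct_failed_with_zero_counts):
    """Upper bound (in %) on failed drives that any model could have flagged."""
    return round(100.0 - pct_failed_with_zero_counts, 1)


# Figures quoted from the Google study in the post above.
strong_signals_only = best_case_detection(56.0)  # four strong signals only
all_parameters = best_case_detection(36.0)       # all parameters except temperature

print(f"strong signals only: at most {strong_signals_only}%")
print(f"all parameters:      at most {all_parameters}%")
```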
 
And remember this is just statistics of an extremely large sample of disk drives used by professionals, in controlled environments, with peculiar usage, surely a very different situation from how end-user disk drives are used and maintained.
 
Data available in the previous report from BackBlaze:
https://www.backblaz...ve-smart-stats/
clearly tells us that some of the data (as an example) is pretty much meaningless and the new data from BackBlaze:
https://www.backblaz...drive-failures/
tells us (still statistically) that a few parameters, when combined, seemingly increase the accuracy to levels around 75-77%:

Failed drives with one or more of our five SMART stats greater than zero – 76.7%

That means that 23.3% of failed drives showed no warning from the SMART stats we record. Are these stats useful? I’ll let you decide if you’d like to have a sign of impending drive failure 76.7% of the time.

Which starts to have some relevance :) but in itself greatly contradicts the results of the Google study :frusty:. It is very possible that newer disk drives are more "sensitive" or (for all we know) the manufacturers may have added the random generation of a counter increase just because they could, and in any case:

But before you decide, read on.

Having a given drive stat with a value that is greater than zero may mean nothing at the moment. For example, a drive may have a SMART 5 raw value of 2, meaning two drive sectors have been remapped. On it’s own such a value means little until combined with other factors. The reality is it can take a fair amount of intelligence (both human and artificial) during the evaluation process to reach the conclusion that an operational drive is going to fail.

 
In any case, with percentages as low as 75% accuracy, Bayes' Theorem is lurking around :w00t: waiting patiently to intervene on the reliability of the test (when administered to your single case); the renowned example (from Innumeracy by John Allen Paulos):
 

 

 

An interesting elaboration on the concept of conditional probability is known as Bayes'
theorem, first proved by Thomas Bayes in the eighteenth century. It's the basis for the
following rather unexpected result, which has important implications for drug or AIDS testing.
Assume that there is a test for cancer which is 98 percent accurate; i.e., if someone has
cancer, the test will be positive 98 percent of the time, and if one doesn't have it, the test will
be negative 98 percent of the time. Assume further that .5 percent— one out of two hundred
people—actually have cancer. Now imagine that you've taken the test and that your doctor
somberly informs you that you've tested positive. The question is: How depressed should you be?
The surprising answer is that you should be cautiously optimistic. To find out why, let's look at
the conditional probability of your having cancer, given that you've tested positive.
Imagine that 10,000 tests for cancer are administered. Of these, how many are positive? On
the average, 50 of these 10,000 people (.5 percent of 10,000) will have cancer, and so, since 98
percent of them will test positive, we will have 49 positive tests. Of the 9,950 cancerless people, 2
percent of them will test positive, for a total of 199 positive tests (.02 x 9,950 = 199). Thus, of
the total of 248 positive tests (199 + 49 = 248), most (199) are false positives, and so the
conditional probability of having cancer given that one tests positive is only 49/248, or about 20
percent! (This relatively low percentage is to be contrasted with the conditional probability
that one tests positive, given that one has cancer, which by assumption is 98 percent.)

should tell you something .... :whistling:
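
The numbers in the quoted example can be checked with a few lines of Python. This is only a sketch of Bayes' theorem applied to the figures Paulos gives (0.5% prevalence, 98% accuracy both ways); the function and parameter names are made up for illustration, and no drive-failure rates are assumed here.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(condition | positive test), by Bayes' theorem."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Figures from the quoted example: 1 in 200 affected, test right 98% of the time.
p = posterior_given_positive(prevalence=0.005, sensitivity=0.98, specificity=0.98)
print(f"P(condition | positive) = {p:.3f}")  # 49/248, about 20 percent
```

The same function applies to a "drive will fail" warning once you plug in how rare failures are and how often the warning fires on healthy drives: the rarer the failure, the more the false positives dominate.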

 
:duff:
Wonko



#9 erwan.l

erwan.l

    Platinum Member

  • Developer
  • 3041 posts
  • Location:Nantes - France
  •  
    France

Posted 18 December 2017 - 02:49 PM

Are these voltages involved in some of the tests, or are the tests only about reading and writing to disk?

 

alacran

 

 

Found on serverfault (and documented on t10.org but could not find the link for now).

 

The details of the tests can be read in e.g. [missing/broken URL], which summarises the elements of the short and long tests thus:

  1. an electrical segment wherein the drive tests its own electronics. The particular tests in this segment are vendor specific, but as examples: this segment might include such tests as a buffer RAM test, a read/write circuitry test, and/or a test of the read/write head elements.

  2. a seek/servo segment wherein the drive tests its capability to find and servo on data tracks. The particular methodology used in this test is also vendor specific.

  3. a read/verify scan segment wherein the drive performs read scanning of some portion of the disk surface. The amount and location of the surface scanned are dependent on the completion time constraint and are vendor specific.

  4. The criteria for the extended self-test are the same as the short self-test with two exceptions: segment (3) of the extended self-test shall be a read/verify scan of all of the user data area, and there is no maximum time limit for the drive to perform the test.
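
For readers who want to trigger the two self-tests described above from a script, here is a hedged sketch using the Smartmontools command line (`smartctl -t short`, `smartctl -t long`, and `smartctl -l selftest` to read the log afterwards). It only builds the command lines and does not run them; the helper names and the device path `/dev/sda` are assumptions for illustration, and this is not how SmartHD itself is implemented.

```python
# Map the names used in this thread to the smartctl test keywords.
SELF_TEST_KINDS = {"short": "short", "extended": "long"}


def selftest_command(device, kind):
    """Build the smartctl argv that starts a short or extended self-test."""
    if kind not in SELF_TEST_KINDS:
        raise ValueError(f"unknown self-test kind: {kind!r}")
    return ["smartctl", "-t", SELF_TEST_KINDS[kind], device]


def selftest_log_command(device):
    """Build the smartctl argv that prints the self-test log."""
    return ["smartctl", "-l", "selftest", device]


# Hypothetical device path; adjust for the actual system.
device = "/dev/sda"
print(" ".join(selftest_command(device, "short")))
print(" ".join(selftest_command(device, "extended")))
print(" ".join(selftest_log_command(device)))
```

The self-test runs inside the drive, so the command returns immediately; the result only shows up in the self-test log once the drive has finished.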



#10 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 16066 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 18 December 2017 - 02:57 PM

I think some part of the HDD board failed, so I started looking for a logical cause and didn't find any, but later it came to my mind that +12 V and +5 V are also very important for an HD and may vary under heavy load.

Are these voltages involved in some of the tests, or are the tests only about reading and writing to disk?

 

A good guess :thumbsup: .

 

Most probably you (like everyone else BTW) are using low cost, crappy, power supplies (like most power supplies are nowadays) connected to a "simple" (like almost everyone else) "mains" plug  while clearly the good guys at google and at backblaze most probably have "better" hardware (for power supplies) and surely have "filtered", "conditioned" mains, so, even if they had (and published) the data it would be of no use for us "common mortals".

 

Totally unrelated, but a single anecdote that may be of comfort on the utter unpredictability of events:

 

http://reboot.pro/to...running-247365/

 

:duff:

Wonko



#11 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 16066 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 18 December 2017 - 03:02 PM

 

Found on serverfault (and documented on t10.org but could not find the link for now).

 

This?

https://serverfault....-it-work/732430

 

The reference should be to this:

www.t10.org/ftp/t10/document.99/99-179r0.pdf

more than this:

http://www.t13.org/D...al/e01137r0.pdf

 

:duff:

Wonko



#12 erwan.l

erwan.l

    Platinum Member

  • Developer
  • 3041 posts
  • Location:Nantes - France
  •  
    France

Posted 18 December 2017 - 04:20 PM

This?

https://serverfault....-it-work/732430

 

The reference should be to this:

www.t10.org/ftp/t10/document.99/99-179r0.pdf

more than this:

http://www.t13.org/D...al/e01137r0.pdf

 

:duff:

Wonko

 

Exactly !

Was planning on digging more later today but as always you "own" the internet :)



#13 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 16066 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 18 December 2017 - 07:03 PM

Exactly !

Was planning on digging more later today but as always you "own" the internet :)

Naahh, I am only pretty fast in drawing links, it's hard to catch it the first time ;) :

hxxp://www.youtube.com/watch?v=HTmVLHXn3H4&t=1m50s

 

:duff:

Wonko



#14 erwan.l

erwan.l

    Platinum Member

  • Developer
  • 3041 posts
  • Location:Nantes - France
  •  
    France

Posted 18 December 2017 - 08:22 PM

Naahh, I am only pretty fast in drawing links, it's hard to catch it the first time ;) :

hxxp://www.youtube.com/watch?v=HTmVLHXn3H4&t=1m50s

 

:duff:

Wonko

 

"Lo chiamavano Trinità" :)

An (Italian) spaghetti western which was broadcast every Christmas when I was a kid!

Does not make us any younger.



#15 Wonko the Sane

Wonko the Sane

    The Finder

  • Advanced user
  • 16066 posts
  • Location:The Outside of the Asylum (gate is closed)
  •  
    Italy

Posted 18 December 2017 - 08:43 PM

"Lo chiamavano Trinità" :)

An (Italian) spaghetti western which was broadcast every Christmas when I was a kid!

Does not make us any younger.

Sure it does not :(, though the real issue is that when they started broadcasting it at every Christmas, I wasn't a kid anymore :eek: .

 

:duff:

Wonko





