Since yesterday, each time I reboot my PC I get the following message right after the BIOS:
"WARNING: Dell's Disk Monitoring System has detected that drive 0 on the primary EIDE controller (the system drive) is operating outside of normal specifications. It is advisable to immediately backup your data and replace your hard disk drive by calling your support desk or Dell Computer Corporation.
STRIKE F1 to continue, F2 to run the setup utility"
So I press F1 and Windows boots fine and everything is normal. From within Windows I ran HDDLife and it reports that the system drive needs urgent replacement, and the Western Digital Lifeguard Diagnostics shows a SMART Status Fail on the system drive, with the Re-Allocated sector count (ID#5) as the only SMART attribute showing trouble (Value 140, Threshold 140, Worst 140, Warranty 1).
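For readers wondering why that reading counts as a failure: under the usual SMART convention, an attribute is flagged once its normalized value drops to or below its threshold. A minimal sketch of that comparison follows; the `SmartAttribute` class and its field names are made up for illustration and are not part of any diagnostic tool.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    """Hypothetical container for one SMART attribute row."""
    attr_id: int
    name: str
    value: int      # normalized current value (higher is healthier)
    worst: int      # lowest normalized value recorded so far
    threshold: int  # vendor-defined failure limit

    def has_failed(self) -> bool:
        # The attribute trips when the normalized value is at or below the threshold.
        return self.value <= self.threshold


# The numbers reported in the post above.
reallocated = SmartAttribute(5, "Re-Allocated Sector Count",
                             value=140, worst=140, threshold=140)
print(reallocated.has_failed())  # True
```

With Value equal to Threshold, the attribute sits exactly on the limit, which is enough for the tool to report a SMART fail.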
My question is: could it be that the hard drive is 100% healthy and that this message is caused by some other problem in the PC? Or should I trust this message 100%? In the latter case, what are the implications? Does it mean that very soon the hard drive will crash?
In any case, I was frightened enough to completely back up all my data to a 2nd drive.
-
Yup, you need to replace the hard drive. You've backed up all the data to another hard drive so now buy a new one. You're lucky because some computers don't have such monitoring system warnings other than the "death clicks of a hard drive".
-
Let me explain what that really means.
In modern HDDs, there's a portion of the drive that NOBODY writes to, known as "reserve sectors".
If, during use, the drive detects a bad sector, it records in its internal defect list that this so-and-so sector is now actually reserve sector X, then copies the data out from the original sector, drops it into the reserve sector, and continues on as usual. To the outside, the HD was merely a little slow that time, while inside it is actually doing substitutions on the fly. To the outside, no bad sectors are visible, but the HD knows there are bad sectors.
Your pool of reserve sectors just ran out. It means your HD has done all it can to cover up its own errors. It had so many bad sectors that it ran out of reserve sectors. From now on, you will LOSE DATA if it runs into any more bad sectors, and you will see them.
It should be WAY TOO SOON to exhaust the reserve sectors. Which is why a replacement is urgently needed.
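To make the substitution idea concrete, here is a toy sketch in Python. It is only a model of the behaviour described above, not how any real firmware works; the `ToyDrive` class, the pool size, and the sector numbers are all invented for illustration.

```python
class ToyDrive:
    """Toy model of bad-sector remapping (illustrative only)."""

    def __init__(self, spare_sectors: int):
        self.free_spares = list(range(spare_sectors))  # hypothetical reserve pool
        self.remap = {}            # bad sector -> reserve sector that replaced it
        self.visible_bad = set()   # bad sectors the drive could no longer hide

    def report_bad_sector(self, sector: int) -> str:
        if sector in self.remap or sector in self.visible_bad:
            return "already handled"
        if self.free_spares:
            # A reserve sector is still free: substitute it silently.
            self.remap[sector] = self.free_spares.pop(0)
            return "remapped"
        # Reserve pool exhausted: the bad sector is now visible to the outside.
        self.visible_bad.add(sector)
        return "visible bad sector"

    def reallocated_count(self) -> int:
        return len(self.remap)


drive = ToyDrive(spare_sectors=2)
results = [drive.report_bad_sector(s) for s in (1001, 52000, 73000)]
print(results)                    # ['remapped', 'remapped', 'visible bad sector']
print(drive.reallocated_count())  # 2
```

The first two bad sectors disappear behind the remap table; the third has nowhere to go, which is the situation where data loss starts to show.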
Hope that helps. -
Kschang, very clear explanation. Now, let me ask you this: when a bad sector first appeared in the normal portion of the drive, the drive did not "click" because, as you say, it transferred the data to a reserve sector and retired the original failed one. So why is it that sometimes a drive "clicks" as soon as a bad sector appears, given that it can transfer the sector to a reserve one? In other words, why is it that sometimes the SMART warning system fails to warn before failure?
(This hard drive is about 3 years old, and I have had more modern HDs that failed without any previous SMART warning.)
And also, if a bad sector is created, why can't the drive transfer that bad sector to a healthy one in the normal section of the drive without using the reserve sectors? I never use more than 50% of the drive's capacity.
Finally, can a SMART error be corrected? In this link WD mentions a diagnostic tool to troubleshoot a SMART error:
http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=251&p_created=...i=&p_topview=1 -
The clicking sound is NOT always indicative of a failure. Basically, a "click" is the actuator that moves the read arm doing a "forced realignment". Some drives will do a full seek (read: sweep), then try to re-read the error sector, while others do a "zero bounce" (hit the outermost edge, then seek back in). It depends on the manufacturer's algorithms and specific implementation.
If a drive clicks a lot, it's likely doing a LOT of track realignment, and that could indicate trouble with the actuator itself, which is not really governed by SMART. On the other hand, some actuators make very little sound, PERIOD.
Only reserved sectors are used because those are guaranteed NOT to be in use, as they are NOT a part of any volume or partition. What you consider to be blank space is already allocated to a volume or a partition as far as the HD itself is concerned.
You can download tools to view detailed SMART statistics, and perhaps perform a low-level format, which will zero out all the bad sectors and start the disk fresh. However, in general that's not a good idea, as bad sectors generally only GROW in size and number.
You have to keep in mind that data density is at an all-time high. Now we have HDs of 100-500 GB, whereas 10 years ago they were 1/100th this size and maybe less. I still have my 40 MB Miniscribe HD in my old 286... Having a few KB or MB going bad on a platter of many GBs is actually not that bad when you consider how many HDs are made per day... It's not really fair to say that modern drives fail more often than older drives. -
Use Flobo to detect bad sectors (do not repair). Any repeated noises like clicks usually indicate that the drive is retrying a read/write to a particular sector. Use the WD Windows Diagnostic Utility to repair (it is fast and repairs all makes). If you still have clicks after the repair, it's off to the manufacturer for replacement. In ALL cases of drive clicking I've had, the drive eventually got messed up (got worse, NEVER better). The last Seagate I replaced had to be swapped twice due to clicking. In both cases Seagate thought that clicking was enough reason for replacement. Even though they sent me a new one, I kept using the existing drive to see if it would stabilize. Never happened. Clicking is bad news if your IDE cables are checked and verified OK (best is to replace the cables right away).
-
Let me add, regarding your last comments, that this drive NEVER clicked or clicks; the sound is perfectly clean and normal all the time, yet SMART is reporting the error I described above. On the other hand, my 2nd internal drive (a WD 320GB EIDE) clicked when I installed it fresh from the factory, but the WD Diagnostic Tool was able to repair it, and to date it has been performing flawlessly, online 24/7 for 2 months already! In fact, when I called WD about replacing my 2nd internal drive, they said that if their diagnostic tool is able to repair it, then the drive is 100% reliable, as if no clicking had been present in the first place. They gave me the choice of having it replaced, but they said it was not necessary. I decided to give it a try before replacing it, and as said above, so far so good.
-
Always swap cables if this happens (HDD error, whichever), scan with utilities and repair ASAP (image as well). These are my experiences (as posted). I'm looking now at the drive on the shelf, a 120 Gig WD with a clicking issue that I hoped I would be able to repair, as I had done 2-3 times before. It actually died during repair: it froze, and after a reboot it was useless. If connected, all it does is click every 4-5 secs trying to initialize. A SMART error was also present, as in your case (not exactly the same, if I recall correctly). It was 4 mths past warranty and WD tech support were pricks, even though I had called them before about the drive being unreliable. "Try to repair or we will swap," they said; I chose repair, thinking replacement was not yet necessary, and the HDD worked until my warranty expired...
-
What do you mean by "swapping cables"? You mean the IDE cable that connects the hard drive to the IDE controller on the motherboard? If so, swap it with which one? The DVD drive one? I assume you mean this because sometimes it's not the hard drive failing but the cable that is defective?
-
Precisely, get NEW cable(s). I recently went back from round cables to flat ones for that reason. The cost is minimal. At least replace the one on the suspected HDD bus. Bad, pinched cable, unreliable connector means bad sectors (besides the case of faulty HDD straight from the factory). It's that simple.
Aside from repair (I would highly recommend the WD Windows utility mentioned above; it takes 1 hr to scan and repair a 250 Gig HDD), a low-level format is the best way to re-certify it (20 hrs. or so). The WD tool can do both (also a SMART test). Image before (!) repair (bad sector repair and LL format are both destructive processes) just in case the HDD (or data) doesn't survive the repair (rare, but it happened to me). -
Replace the IDE cable with a NEW one.
I've gone through this routine occasionally. Backup, swap in a previously cloned drive as master, set the bad drive as slave, do a low-level format, repartition, then use it for non-critical working files like encode/transcode. Often it just craps out again fairly quickly, so pitch it. But sometimes it's good again, for months or years, got a couple in this machine right now like that.
Can someone explain how that could be? -
Google recently published a report on hard drive failures in their systems. Of course, their usage pattern is different than the average user but there's some interesting information in the report.
http://news.bbc.co.uk/2/hi/technology/6376021.stm
The report also looked at the impact of scan errors - problems found on the surface of a disc - on hard drive failure.
"We find that the group of drives with scan errors are 10 times more likely to fail than the group with no errors," said the authors.
• First of all, Mean Time Between Failure rates mean nothing.
• Secondly, SMART hardware monitoring missed 36% of all uh-ohs.
• Third, overworked drives fail similarly to standard drives after the first year.
• Fourth, hard drive age means less than you think.
• Fifth, failure does not go up when temperatures are higher than usual (unless super high).
http://216.239.37.132/papers/disk_failures.pdf -
Western Digitals were good up until around 4 years ago. I have a bin of dead hard drives from various clients over the years. 90% of them are Western Digitals. I suffered TWO dead drives in a RAID 5 system... both drives, you guessed it, were WDs. Both drives had consecutive serial numbers, which means it was probably a bad batch. Every drive is going to fail, but the Seagates have been more reliable than the WDs. Only sharing my experiences...
-
Originally Posted by alegator
If the SMART error is caused by a surface error (most of them are), then yes. Other possibilities are spindle and controller related errors, and these are not repairable (HDD swap). Either way, every SMART error HAS TO BE dealt with ASAP one way or another. You have 2 options: use the tool and/or swap the drive.
WD is off my list. 80% of my drives are Seagates, plus 2 Maxtors. I find them more reliable. The only irreparable failures I got were exclusively on WD. Seagates have a 5 year warranty and that means a lot to me. WD and Maxtor had 1 year or 3 years depending on where you buy (?? WTF?). The odd thing is that retail kit WDs (more expensive) were warranted for 1 year, while cheaper OEM drives often had a 3 (or 1) year warranty. Makes no sense whatsoever. -
I recently was getting read access hangs and SMART errors on an old 80GB Seagate drive and thought the disk was finally failing. This disk was in a removable disk tray in a new PC and I also saw some disk errors when I plugged another drive into that PC. It turns out that the disk tray connector must have had at least one bad connector/pin because the disk errors went away when I began making sure to insert the disk tray all the way into the connector.
I did some more investigation and found that the Seagate drive did indeed have data errors when I tried to use it in another PC. I reformatted the disk and now it is working fine again. That Seagate disk has been well used for more than 5 years and it is currently being used again with no errors.
Executive Summary: Bad connectors/cables can result in SMART errors on a disk even after the connectors/cables are repaired.
DonP -
This is just an update, I'm still using the same SMART fail hard drive on a 24/7 basis since I created this thread and it has been performing flawlessly. Of course I'm also backing up regularly just in case.
-
I have an old Samsung 20 GB HDD, and I had a similar problem with it. I just disabled S.M.A.R.T. monitoring and forgot about the incident (btw, this hard disk still runs fine today).
-
ALWAYS try a new IDE cable with a suspect drive; they are cheap to free, take seconds to replace, and often solve the problem. Simply removing and reseating the connectors (don't forget the power!) often solves the problem.
Identifying and interpreting the noise patterns produced by a drive requires some experience with large numbers of different drives, as these noise patterns are specific to different makes and models.
SMART errors, however, require no experience whatsoever. While SMART does miss some errors, I have never heard of it giving false positives.
A re-format and re-partition will sometimes solve these errors. However, remember that most drives carry their data on a thin layer of plating applied to the disk platters. These layers can crack and peel off much like chrome on a bumper. I have never seen, or heard of, chrome flaking off a bumper which somehow re-attaches itself and no more flaking occurs.
When bad spots appear, they are either few and static or many and steadily increasing. There is no fix for the second case, and continuing use will ABSOLUTELY GUARANTEE data loss at some point. With no additional sectors available, this data loss will be almost certainly unrecoverable, even under laboratory conditions. -
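The distinction above between "few and static" and "many and steadily increasing" can be checked by noting the reallocated-sector count from a SMART tool every so often and seeing whether it moves. A rough sketch follows; the function name and the sample readings are hypothetical, and the rule used (any increase at all counts as growing) is an assumption, not a vendor criterion.

```python
def classify_bad_sector_trend(counts):
    """Classify a series of reallocated-sector counts, oldest reading first."""
    if len(counts) < 2:
        return "not enough data"
    # Count how many consecutive readings show an increase.
    increases = sum(1 for earlier, later in zip(counts, counts[1:]) if later > earlier)
    return "growing" if increases > 0 else "static"


# Hypothetical weekly readings, invented for illustration.
print(classify_bad_sector_trend([12, 12, 12, 12]))  # static
print(classify_bad_sector_trend([12, 15, 21, 40]))  # growing
```

A "static" result over a long stretch matches the first case described above; a "growing" one is the case with no fix.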
Originally Posted by Nelson37
Especially if all you've got to lose is all your data!
-
I used to have a 160GB and a 200GB HD in HD drawers for my media PC. One of them got a SMART error and I removed it. Last week, I put the SMART-failing drive in another HD drawer and plugged it back in. It has been working fine since.
I suspect a loose Molex HD connector may be the cause. -
Loose connections on removable drive trays are an EXTREMELY COMMON problem, simple check is to eliminate the tray and connect the drive directly to the MOBO. IF that works OK, place the tray in the circular file.
-
I'm confused about an apparent contradiction. On the one hand you're saying that SMART errors never give a false positive, but on the other hand you say that sometimes the problem might be the data cable failing making the hard drive show a false SMART warning?
-
SMART diagnostics, and many others, assume a set of conditions that the devices are operating in. A working and stable power supply is one of them.
The power supply is one of the most common failures in a typical desktop PC. It powers everything in the box, so when it degrades or partially fails, the issues can appear in the devices it is powering.
The hard disk consumes the most current on the +12V line, so a degraded or weakened +12V connection can put the HD on the blink.
-12V is still used by serial mice and serial communication. So when the mouse and RS232 fail, it is a good idea to look at the power supply.
There are many other examples... -
Originally Posted by alegator
-- but I have, in my experience, never had a SMART error that did not precede a drive breakdown.
That is, by all means feel free to leave the drive on the system after replacing the cable and/or checking all the other things other people have mentioned (and it's a great idea to check all those things whenever you get a computer error), but I would strongly encourage you not to put critical data on that drive unless you've already got it backed up in another location.
Yes, sure, the drive may keep going, so if you want to leave it as a temp, or an experiment in how long it'll stay working, sure. But you might also wish to weigh how much you gain from this against how much you stand to lose. If the drive is still under warranty, I'd send it back immediately. If it's not under warranty, I'll pretty much bet that the price of an equally sized new drive would be a lot less painful in the long run, compared to losing that drive right in the middle of editing two hours of video.
EDIT: Nelson37 and I both came down on the side of replacing a drive that displays SMART errors, period. SingSing noted he had experienced other non-fatal errors that could lead to SMART error messages, but neither Nelson37 nor I would take a chance on that. So there's no internal contradiction, just differing points of view. -
A SMART error related to a bad cable connection is not a false positive. There is an error which must be corrected; in this case the problem is just external to the drive.
Correcting these issues usually involves a re-format and re-partition.
Care must be taken to make CERTAIN that the error is indeed corrected; this involves some extra monitoring of the suspect drive. -
Very good info ... okay WD drives are not number one.
I do have some in use ... I've got a SATA 500 GB in my main PC ... the one I'm typing this message. I've got a second one also but my Dell GX280 [bought on EBay] wasn't happy with it ... rebooted every time it accessed it ... so I put a smaller one in it ... 120 GB or 160 GB ... most likely a Seagate from BestBuy.
Seagate is on sale at Best Buy ... quite often.
What about Maxtor ?? ... now that they are part of Seagate.