Hard drive failing or kernel problem?
Posted: Fri Jul 07, 2006 2:27 pm
I've noticed increasing oddities with my server (a 1and1 server). Yesterday my log parser returned a few of these:
kernel: ide: failed opcode was: unknown
I've suspected I have a problem for some time now, and my SMART scan seems to indicate my HD is nearly 3 years old. The problem is I can't figure out whether the drive is dying or this is just an indicator that I need to bite the bullet and reimage.
Is this a kernel problem, as some sites have suggested? Would going to the ASL kernel fix it? I'm currently running FC2. Here is the SMART output:
smartctl -a /dev/hda
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: IC35L040AVVN07-0
Serial Number: VNP214B2SUPYAE
Firmware Version: VA2OAG0A
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 5
ATA Standard is: ATA/ATAPI-5 T13 1321D revision 1
Local Time is: Fri Jul 7 11:48:56 2006 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity was
suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (1177) seconds.
Offline data collection
capabilities: (0x1b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
No Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
No General Purpose Logging support.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 20) minutes.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   089   089   060    Pre-fail  Always   -           1638446
  2 Throughput_Performance  0x0005   145   145   050    Pre-fail  Offline  -           278
  3 Spin_Up_Time            0x0007   110   110   024    Pre-fail  Always   -           147 (Average 153)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always   -           14
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always   -           1
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always   -           0
  8 Seek_Time_Performance   0x0005   142   142   020    Pre-fail  Offline  -           28
  9 Power_On_Hours          0x0012   097   097   000    Old_age   Always   -           26252
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           7
192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always   -           879
193 Load_Cycle_Count        0x0012   100   100   050    Old_age   Always   -           879
194 Temperature_Celsius     0x0002   134   134   000    Old_age   Always   -           41 (Lifetime Min/Max 15/49)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always   -           1
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always   -           2
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline  -           2
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always   -           0
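To sanity-check the attribute table above, here is a minimal Python sketch. It is not part of smartctl: it assumes the whitespace-separated row layout shown above, copies only five of the rows, and uses a hypothetical helper named `parse_row`. It converts Power_On_Hours to years, checks whether any normalized VALUE has dropped to its THRESH, and lists the sector-related counters with a nonzero raw value.

```python
# Five rows copied from the attribute table above (subset, for illustration).
ROWS = """\
1 Raw_Read_Error_Rate 0x000b 089 089 060 Pre-fail Always - 1638446
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 1
9 Power_On_Hours 0x0012 097 097 000 Old_age Always - 26252
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 2
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 2
"""

def parse_row(line):
    """Hypothetical helper: split one attribute row into named fields."""
    f = line.split()
    return {
        "id": int(f[0]),
        "name": f[1],
        "value": int(f[3]),   # normalized value
        "worst": int(f[4]),
        "thresh": int(f[5]),
        "raw": int(f[9]),     # first token of RAW_VALUE
    }

attrs = {a["name"]: a for a in map(parse_row, ROWS.splitlines())}

# Power-on time in years (24 h/day, 365.25 days/year).
years = attrs["Power_On_Hours"]["raw"] / (24 * 365.25)

# Attributes whose normalized value has reached the vendor threshold.
below = [n for n, a in attrs.items() if a["value"] <= a["thresh"]]

# Sector counters that are nonzero in the raw column.
SECTOR_ATTRS = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
                "Offline_Uncorrectable")
suspect = {n: attrs[n]["raw"] for n in SECTOR_ATTRS if attrs[n]["raw"] > 0}

print(f"power-on time: {years:.2f} years")
print("at or below threshold:", below)
print("nonzero sector counters:", suspect)
```

With the figures above this gives just under 3 years of power-on time and no attribute at or below its threshold, which is consistent with the overall PASSED result. The raw counts for reallocated, pending, and offline-uncorrectable sectors are all nonzero.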
SMART Error Log Version: 1
ATA Error Count: 32 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Timestamp = decimal seconds since the previous disk power-on.
Note: timestamp "wraps" after 2^32 msec = 49.710 days.
Error 32 occurred at disk power-on lifetime: 26232 hours
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 0f 23 27 e2
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
-- -- -- -- -- -- -- -- --------- --------------------
c8 00 08 0f 23 27 e2 00 4208198.116 READ DMA
c8 00 10 8f 0f e7 e3 00 4208197.716 READ DMA
ca 00 10 8f ae 87 e4 00 4208197.716 WRITE DMA
ca 00 20 4f 19 67 e4 00 4208197.716 WRITE DMA
ca 00 10 bf 22 47 e4 00 4208197.716 WRITE DMA
Error 31 occurred at disk power-on lifetime: 26232 hours
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 0f 23 27 e2
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
-- -- -- -- -- -- -- -- --------- --------------------
c8 00 08 0f 23 27 e2 00 4208192.916 READ DMA
ca 00 02 68 09 39 e1 00 4208192.916 WRITE DMA
ca 00 08 4f ae 45 e1 00 4208192.916 WRITE DMA
ca 00 02 66 09 39 e1 00 4208192.916 WRITE DMA
ca 00 08 4f ae 45 e1 00 4208192.916 WRITE DMA
Error 30 occurred at disk power-on lifetime: 26232 hours
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 0e 23 27 e2
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
-- -- -- -- -- -- -- -- --------- --------------------
c8 00 08 07 23 27 e2 00 4208187.616 READ DMA
ca 00 02 60 09 39 e1 00 4208187.616 WRITE DMA
ca 00 08 4f ae 45 e1 00 4208187.616 WRITE DMA
ca 00 02 5e 09 39 e1 00 4208187.616 WRITE DMA
ca 00 08 4f ae 45 e1 00 4208187.616 WRITE DMA
Error 29 occurred at disk power-on lifetime: 26232 hours
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 0e 23 27 e2
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
-- -- -- -- -- -- -- -- --------- --------------------
c8 00 08 07 23 27 e2 00 4208182.816 READ DMA
c8 00 08 ff 22 27 e2 00 4208182.816 READ DMA
c8 00 08 f7 22 27 e2 00 4208182.816 READ DMA
c8 00 08 ef 22 27 e2 00 4208182.816 READ DMA
c8 00 08 e7 22 27 e2 00 4208182.816 READ DMA
Error 28 occurred at disk power-on lifetime: 26231 hours
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 0f 23 27 e2
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
-- -- -- -- -- -- -- -- --------- --------------------
c8 00 08 0f 23 27 e2 00 4204396.116 READ DMA
ca 00 02 df 03 39 e1 00 4204396.116 WRITE DMA
ca 00 08 47 ae 45 e1 00 4204396.116 WRITE DMA
ca 00 02 dd 03 39 e1 00 4204396.116 WRITE DMA
ca 00 08 47 ae 45 e1 00 4204396.116 WRITE DMA
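The register dumps in the five logged errors can be decoded by hand. The sketch below is a rough illustration in Python, assuming 28-bit LBA addressing (DH is 0xe2 in every entry, which has the LBA-mode bit set) and the ATA error/status bit names as I understand them. The helper names `lba28` and `flags` are hypothetical.

```python
# (error number, ER, ST, SN, CL, CH, DH) copied from the error log above.
ERRORS = [
    (32, 0x40, 0x51, 0x0f, 0x23, 0x27, 0xe2),
    (31, 0x40, 0x51, 0x0f, 0x23, 0x27, 0xe2),
    (30, 0x40, 0x51, 0x0e, 0x23, 0x27, 0xe2),
    (29, 0x40, 0x51, 0x0e, 0x23, 0x27, 0xe2),
    (28, 0x40, 0x51, 0x0f, 0x23, 0x27, 0xe2),
]

# Bit names for the ATA error and status registers (assumed layout).
ER_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}
ST_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
           0x08: "DRQ", 0x01: "ERR"}

def flags(value, names):
    """Return the names of the bits that are set in value."""
    return [name for bit, name in names.items() if value & bit]

def lba28(sn, cl, ch, dh):
    """Assemble a 28-bit LBA from the task-file registers."""
    if not dh & 0x40:
        raise ValueError("DH does not have the LBA-mode bit set")
    return ((dh & 0x0f) << 24) | (ch << 16) | (cl << 8) | sn

for num, er, st, sn, cl, ch, dh in ERRORS:
    print(f"error {num}: ER={flags(er, ER_BITS)} ST={flags(st, ST_BITS)} "
          f"LBA=0x{lba28(sn, cl, ch, dh):08x}")

# Distinct failing addresses across the five entries.
lbas = sorted({lba28(sn, cl, ch, dh) for _, _, _, sn, cl, ch, dh in ERRORS})
print("distinct LBAs:", [f"0x{x:08x}" for x in lbas])
```

Under those assumptions, all five entries decode to a READ DMA failing with UNC (uncorrectable data) at LBA 0x0227230e or 0x0227230f. The first of these is the same address the extended self-test reports as LBA_of_first_error.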
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 25392 -
# 2 Extended offline Completed: read failure 20% 25338 0x0227230e
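For a sense of where and when the failed extended test sits, here is a small arithmetic sketch in Python. It assumes 512-byte sectors, which the smartctl output does not state, and takes the hour counts directly from the logs and from Power_On_Hours above.

```python
SECTOR_BYTES = 512               # assumption: not stated in the output

first_error_lba = 0x0227230e     # LBA_of_first_error, extended self-test
failed_test_hours = 25338        # LifeTime(hours) of the failed extended test
short_test_hours = 25392         # LifeTime(hours) of the later short test
latest_error_hours = 26232       # power-on lifetime of error 32
power_on_hours = 26252           # attribute 9, raw value

# Position of the failing sector on the disk.
byte_offset = first_error_lba * SECTOR_BYTES

# Power-on time elapsed since the extended test first hit the bad sector.
hours_to_latest_error = latest_error_hours - failed_test_hours
hours_to_report = power_on_hours - failed_test_hours

print(f"LBA {first_error_lba} starts at byte {byte_offset}")
print(f"failed test to latest logged error: {hours_to_latest_error} h "
      f"({hours_to_latest_error / 24:.1f} days)")
print(f"failed test to this report: {hours_to_report} h "
      f"({hours_to_report / 24:.1f} days)")
print(f"short test ran {short_test_hours - failed_test_hours} h "
      f"after the failed extended test")
```

So the read failure found by the extended test at 25338 hours was still being hit at 26232 hours, roughly 37 days of power-on time later. The short test that passed in between says little about that sector, since a short test does not scan the whole surface.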
Kernel (from uname): 2.6.11.9-050512a #1 SMP Thu May 12 20:53:02 CEST 2005 i686 i686 i386 GNU/Linux