[cobalt-users] raq4r disk errors
- Subject: [cobalt-users] raq4r disk errors
- From: "lists" <lists@xxxxxxxxxx>
- Date: Fri Dec 6 01:50:15 2002
- List-id: Mailing list for users to share thoughts on Sun Cobalt products. <cobalt-users.list.cobalt.com>
Hi
This RaQ4r has had 10 "episodes" in the last 5 months in which the
following happens:
- A massive outgoing traffic spike (5-7MB/s) shows up in MRTG. Each spike
lasts around 30 minutes. I've never caught it in the act, so I haven't been
able to run netstat to see what's going on.
- At the same time, /var/log/kernel reports many errors (between 100 and
300) like this:
Dec 5 12:40:04 www kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Dec 5 12:40:04 www kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=1209450, sector=1209448
Dec 5 12:40:04 www kernel: end_request: I/O error, dev 03:01 (hda), sector 1209448
Dec 5 12:40:04 www kernel: raid1: md1: rescheduling block 604724
Dec 5 12:40:04 www kernel: raid1: md1: unrecoverable I/O read error for block 604724
- The admin interface reports that one of the disks is faulty. The first
time this happened we had the disk replaced, but the errors carried on with
the new disk. cat /proc/mdstat shows this:
md1 : active raid1 hdc1[2] hda1[0](F) 768000 blocks [2/1] [U_]
md3 : active raid1 hdc3[1] hda3[0] 205056 blocks [2/2] [UU]
md4 : active raid1 hdc4[2] hda4[0](F) 37947072 blocks [2/1] [U_]
md6 : active raid1 hdc6[1] hda6[0] 131456 blocks [2/2] [UU]
- Halfway through the "episode" sendmail restarts itself, but there is
nothing unusual in the mail logs. chkrootkit reports nothing strange.
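One way to check whether the errors keep hitting the same sectors, and which
RAID members are marked failed, is to run the kernel log and the
/proc/mdstat output through a short script. The following is only a minimal
sketch, assuming Python 3 is available on some machine the files can be
copied to (it does not need to run on the RaQ itself). The function names
tally_io_errors and failed_members are made up for illustration, and the
patterns are written only against the excerpts quoted above.

```python
import re
from collections import Counter

# Matches e.g. "end_request: I/O error, dev 03:01 (hda), sector 1209448"
IO_ERROR = re.compile(r"end_request: I/O error, dev \S+ \((\w+)\), sector\s+(\d+)")
# Matches e.g. "md1 : active raid1 hdc1[2] hda1[0](F) ..."
MD_LINE = re.compile(r"^(md\d+) : active (\w+) (.*)$")
# Matches one array member such as "hda1[0]" with an optional "(F)" flag
MEMBER = re.compile(r"(\w+)\[\d+\](\(F\))?")


def tally_io_errors(log_text):
    """Count I/O errors per (device, sector) in kernel log text.

    Whitespace is collapsed first so that entries wrapped across lines
    (as a mail client might do) are still recognised.
    """
    flat = " ".join(log_text.split())
    return Counter((dev, int(sector)) for dev, sector in IO_ERROR.findall(flat))


def failed_members(mdstat_text):
    """Return {array: [members flagged (F)]} from /proc/mdstat text."""
    failed = {}
    for line in mdstat_text.splitlines():
        match = MD_LINE.match(line.strip())
        if not match:
            continue
        array, _level, rest = match.groups()
        bad = [name for name, flag in MEMBER.findall(rest) if flag]
        if bad:
            failed[array] = bad
    return failed
```

For example, failed_members(open("/proc/mdstat").read()) run against the
output above should list hda1 under md1 and hda4 under md4, and
tally_io_errors on a saved copy of /var/log/kernel would show whether the
100 to 300 errors per episode cluster on a few sectors or are spread across
the disk.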
Does anyone have any ideas?
Thanks, Julian