
[cobalt-users] raq4r disk errors



Hi,

This RaQ4r has had 10 "episodes" in the last 5 months in which the following happens:

- A massive outgoing traffic spike (5-7MB/s) shows up in mrtg. The spikes last around 30 minutes. I've never caught it while it's happening, so I haven't been able to run netstat to see what's going on.

- At the same time, /var/log/kernel reports many (between 100 and 300) errors like this:

Dec  5 12:40:04 www kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Dec  5 12:40:04 www kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=1209450, sector=1209448
Dec  5 12:40:04 www kernel: end_request: I/O error, dev 03:01 (hda), sector 1209448
Dec  5 12:40:04 www kernel: raid1: md1: rescheduling block 604724
Dec  5 12:40:04 www kernel: raid1: md1: unrecoverable I/O read error for block 604724

- The admin interface says that one of the disks is faulty. The first time this happened we had the disk replaced, but it has carried on happening with the new disk.

cat /proc/mdstat shows this:

md1 : active raid1 hdc1[2] hda1[0](F) 768000 blocks [2/1] [U_]
md3 : active raid1 hdc3[1] hda3[0] 205056 blocks [2/2] [UU]
md4 : active raid1 hdc4[2] hda4[0](F) 37947072 blocks [2/1] [U_]
md6 : active raid1 hdc6[1] hda6[0] 131456 blocks [2/2] [UU]

- Halfway through the "episode", sendmail restarts itself, but there is nothing unusual in the mail logs.

chkrootkit reports nothing strange.

Does anyone have any ideas?

Thanks,
Julian
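
For reference, the /proc/mdstat output above can be checked mechanically for degraded arrays. The following is only a minimal sketch: the function name degraded_arrays and the regular expressions are illustrative assumptions, and it handles only the one-line-per-array layout quoted in this message (other kernels may split each array over several lines).

```python
import re

# Sample copied from the /proc/mdstat output quoted in this message.
SAMPLE = """\
md1 : active raid1 hdc1[2] hda1[0](F) 768000 blocks [2/1] [U_]
md3 : active raid1 hdc3[1] hda3[0] 205056 blocks [2/2] [UU]
md4 : active raid1 hdc4[2] hda4[0](F) 37947072 blocks [2/1] [U_]
md6 : active raid1 hdc6[1] hda6[0] 131456 blocks [2/2] [UU]
"""

# Assumed layout: "<array> : <state> <level> <members...> <n> blocks [total/active] [map]"
ARRAY_RE = re.compile(r"^(md\d+)\s*:\s*(\w+)\s+(\w+)\s+(.*)$")
MEMBER_RE = re.compile(r"(\w+)\[\d+\](\(F\))?")   # e.g. hda1[0](F)
COUNT_RE = re.compile(r"\[(\d+)/(\d+)\]")         # e.g. [2/1]


def degraded_arrays(text):
    """Return {array: [members flagged (F)]} for arrays that look degraded.

    An array is reported if any member carries the (F) flag or if the
    active count in [total/active] is lower than the total.
    """
    result = {}
    for line in text.splitlines():
        match = ARRAY_RE.match(line.strip())
        if not match:
            continue
        name, rest = match.group(1), match.group(4)
        failed = [dev for dev, flag in MEMBER_RE.findall(rest) if flag]
        counts = COUNT_RE.search(rest)
        short = counts is not None and int(counts.group(2)) < int(counts.group(1))
        if failed or short:
            result[name] = failed
    return result


if __name__ == "__main__":
    for array, members in sorted(degraded_arrays(SAMPLE).items()):
        print(array, "degraded; failed members:", ", ".join(members) or "none flagged")
```

To run it against a live system rather than the sample, the contents of /proc/mdstat could be read and passed to degraded_arrays in place of SAMPLE.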