Tuesday, 29 March 2011

Disk Crash on Linux

Well, I thought that the introduction of ext3 (and later) file systems had put an end to disk crashes and mangled filesystems on Linux.
But after a few years of reliable operation, the root filesystem on my home server (my old Fujitsu Siemens laptop) seems to have become mangled.
The symptom was that it would not boot: you get the 'Starting up...' message during boot, then the system appears to hang.  But if you listen carefully you can hear that the disk is actually doing something, so I left it for a few hours.
When I came back to it there was a 'failed to mount /dev/sda1' type error message, with an option to skip mounting or manually fix it.
I went for fixing it manually, because skipping the mount of the root filesystem would not get me very far.
This dropped me into a single user shell.
I ran fsck and it reported that an inode had illegal blocks.  I selected the option to clear them, and said yes again when it asked a second time.
It then said it was restarting e2fsck from the beginning, and spent quite a few minutes checking.

Then at Pass 2 fsck said it had found a deleted or unused inode.  Again I said 'y' to the 'Clear?' question.
After that came lots of offers to fix things (so many that I just held my finger on the 'y' key).

Then fsck announced that it was complete and that I should reboot Linux.  I re-ran fsck and it reported that /dev/sda1 was clean, so I rebooted, but booting is taking a suspiciously long time: it has been trying for 10 minutes and hasn't got past the boot splash screen. I'm going to need a plan B.
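
As an aside, fsck also reports its result as an exit status, which is handy if you script the check rather than sit at the console holding down 'y'. Below is a minimal Python sketch that decodes that status, assuming the bit values listed in the fsck(8) man page. The function name describe_fsck_status is made up for illustration and is not part of any tool.

```python
# Bit flags as listed in the fsck(8) man page (assumed here);
# the exit status is the sum of the conditions that apply.
FSCK_FLAGS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking cancelled by user request",
    128: "shared-library error",
}


def describe_fsck_status(status):
    """Return the list of conditions encoded in an fsck exit status."""
    if status == 0:
        return ["no errors"]
    # Test each documented bit in ascending order.
    return [text for bit, text in sorted(FSCK_FLAGS.items()) if status & bit]


if __name__ == "__main__":
    # Hypothetical status values, just to show the decoding.
    for status in (0, 1, 3, 4):
        print(status, describe_fsck_status(status))
```

A status with both the 1 and 2 bits set would match what I saw on the first run: errors corrected, and a reboot requested.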

Well, I don't know what the problem is.   I tried booting off a USB memory stick, and the disk checks OK and mounts, but the boot process from the disk itself still just hangs.   I decided I had spent too long on this, so Ubuntu 10.10 is currently being installed on the disk instead, which means I can't do any more diagnosis!
