[dundee] Rescue of crippled, unmountable Linux systems

Lee Hughes toxicnaan at yahoo.co.uk
Tue Aug 14 16:44:15 BST 2007


you've certainly been having fun and games, but like anything that happens for the first time in computing, you usually go away knowing more about it. Sometimes, in Windows, you go away knowing more about it than the Microsoft developers, which on the whole scares me ;-).

fsck on mounted partitions is bad.

fsck on non-mounted partitions is good.

you may understand now why some Linux systems mount their root file system
either in RAM, or read-only on boot. This allows checking of the root file system too.
it's chicken and egg: you can't check / while it is mounted, yet fsck and the kernel
live on / and need it mounted to run. There are some elegant, and not so elegant,
solutions to this.
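
A rough sketch of the "check before you fsck" rule, in case it is useful. The function names and the MOUNTS_FILE override are made up for illustration, and it only prints the command it would run rather than touching a disk. It also assumes the device shows up in /proc/mounts under the same name you pass in, which is not always true (LVM, /dev/root and so on).

```shell
#!/bin/sh
# Sketch: refuse to run a checker on a device that is listed as mounted.
# MOUNTS_FILE is a hypothetical override so the check can be tried
# against a sample file instead of the live /proc/mounts.
MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"

# is_mounted DEVICE: succeeds if DEVICE is the first field of any line.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "$MOUNTS_FILE"
}

# safe_fsck DEVICE: print the command that would be run, or refuse.
safe_fsck() {
    if is_mounted "$1"; then
        echo "refusing: $1 is mounted, unmount it or boot a live CD first"
        return 1
    fi
    # -n opens the file system read-only and answers "no" to every prompt
    echo "would run: e2fsck -n $1"
}
```

Calling safe_fsck /dev/sdb10 from a live CD, where the partition is not mounted, should print the e2fsck line; on the running system it should refuse.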

moving partitions at the byte level is bad

moving partitions as intact filesystems (tar/cpio etc) is good.

unless you rely on partition image dumps made with something like Ghost or
similar disk imaging applications, you're asking for trouble, especially going
between disks of different geometries and sizes.

http://sourceforge.net/projects/g4l/

is interesting; I've had a bit of success with it.
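
For the tar route mentioned above, a minimal sketch of a file-level copy. copy_tree is a made-up helper name; on a real system the two directories would be the mount points of the old and new partitions, and you would run it as root so ownership is kept.

```shell
#!/bin/sh
# Sketch: copy a directory tree file by file with a tar pipe.
# copy_tree SRC DST: both must be existing directories.
copy_tree() {
    src=$1
    dst=$2
    # pack everything under SRC to stdout, unpack it under DST;
    # -p on extraction keeps the original permissions
    (cd "$src" && tar -cf - .) | (cd "$dst" && tar -xpf -)
}
```

Because the files are recreated on the target file system, the new partition can be a different size or geometry from the old one.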

if fsck throws more than a few errors, it's highly likely that it can't repair
anything. any disconnected files, i.e. files whose parent inode (directory) can't be found, are moved to a directory called lost+found.
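
Entries that e2fsck puts in lost+found are named after their inode number, so you end up opening each one to see what it is. A small sketch that prints a one-line summary per entry might save some time; summarise_lost_found is a made-up name and the output format is arbitrary.

```shell
#!/bin/sh
# Sketch: summarise what each top-level entry in a lost+found directory holds.
# summarise_lost_found DIR: prints "<entry>: dir: <up to 5 names>" for
# directories and "<entry>: file" for anything else.
summarise_lost_found() {
    for entry in "$1"/*; do
        [ -e "$entry" ] || continue      # the glob matched nothing
        name=${entry##*/}
        if [ -d "$entry" ]; then
            sample=$(ls -A "$entry" | head -n 5 | tr '\n' ' ')
            echo "$name: dir: $sample"
        else
            echo "$name: file"
        fi
    done
}
```

Run it against the lost+found directory of the mounted, repaired partition, for example summarise_lost_found /mnt/lost+found (the mount point is only an example).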

I still can't work out how you corrupted your file system in the first place.

bugs in the filesystem? these are very few; the filesystem is probably one of the
most tested components in the system, as it's so critical to the whole running
of the o/s. Unless you've got some crazy developer or Gentoo system and
you're running a prerelease or experimental filesystem (cool).

data recovery is a fine art, and if you image everything to begin with, then
you can't go wrong. You should never work on live data unless you have to.

you can always plug in an extra cdrom drive, and dd the thing to it,
or in fact push it over the network with netcat.
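
A minimal sketch of the "image it first" idea. image_and_verify is a made-up helper; SRC would be a block device such as /dev/sdb10 on a real system, but any readable file works for trying it out. The same dd output could be piped through netcat instead of written locally; netcat's listening flags differ between versions, so that part is left out here.

```shell
#!/bin/sh
# Sketch: image a source to a file with dd, then check the copy matches.
# image_and_verify SRC IMG: succeeds only if IMG is byte-identical to SRC.
image_and_verify() {
    # plain sequential copy in 64 KiB blocks; dd's transfer statistics
    # go to stderr and are discarded here
    dd if="$1" of="$2" bs=65536 2>/dev/null || return 1
    # cmp -s is silent and exits non-zero if the two differ
    cmp -s "$1" "$2"
}
```

Once the image exists, run fsck and any other experiments against a copy of it rather than against the original disk.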

I may do a presentation on data recovery techniques if the group feels it
would be useful.

in the meantime check this..

http://user-mode-linux.sourceforge.net/old/sdotm.html

it deals with lots of disasters and has solutions; as you're working in a UML instance, no harm done... check it out, you'll learn lots.

if it's any consolation, I've never managed to recover an NT filesystem with chkdsk either. chkdsk would always run for a bit and then crash out, leaving me with no option but to wipe and reinstall (and install Linux).

saying that, on the Amiga, Disk Doctor would churn a floppy for a while, then rename the disk volume to 'lazarus', which means 'risen from the dead'.

seldom did it work, and it left you with a blank disk! nice!

http://en.wikipedia.org/wiki/Easter

Cheers,
Lee








gordon dunlop <gordon at zubenel.freeserve.co.uk> wrote:

I found a couple of methods for rescuing systems. Firstly, never use
fsck or e2fsck (similar to fsck but for ext2/3 systems with more
options) on mounted partitions as they will ruin your data. When using the
Knoppix 5.1 live CD, the partitions of your hard disk are shown on the
desktop but are not mounted. I initially used e2fsck in a terminal (root
mode) with the command
# e2fsck -y /dev/sdb10
This started the process of correcting many errors, but when it tried to
reallocate 5 inodes in the file system it couldn't and started
looping while trying to correct these errors. It also reported a corrupt root
inode, so I stopped the error checks. I found out that some other people
were having similar problems in a couple of Internet forums. One person
used the tune2fs -s 1 command prior to using e2fsck; this turns the
sparse_super feature on, which saves space on big file systems, and it
restored his system to an operable condition. Another user, using logical
volume management of his disks in Fedora, used the debugfs command prior
to using e2fsck, where he could resize the inodes and restore his system.
I also found out that using a setting of 0 inode size would break the
main inode skeletal system and dump the directories to lost+found. I now
had Fedora 7 up and running, so the only thing that I required was
some data, so I opted for the 0 inode option.
# debugfs -w /dev/sdb10 -R "features ^resize_inode"
# e2fsck -y /dev/sdb10
The e2fsck program started and asked if I wanted to change the inode
size, which was then set to 0. The root inode was removed, all errors were
corrected, and the programme finished with the System Modified message. I
then mounted the Fedora 6 partition /dev/sdb10; it gave me the directory
lost+found, which held all the separate Fedora 6 directories, including my
/home/Gordon directory. All these directories were numbered (not named)
so I had to go inside every one to determine its contents. All I had
left to do was transfer the required data from lost+found into my Fedora
7 partition.

My basic findings were that corruption in file systems
can occur due to hardware errors, media errors, bugs in file systems and
certain types of manipulation of partitions. This type of problem
occurred to one person when he shrank 2 partitions; he did not mention
whether there was any data migration when doing this operation. Also, as
disks are getting bigger, resulting in larger and more complex data
systems, the chances of file system corruption increase. In my case I think
that my Windows XP VM did contribute to the situation due to its huge
size (20GB), where a problem occurred in trying to reallocate blocks
within the file system. You will find that VMs are probably the biggest
files within a system (3~5 GB for a normal Fedora or SuSE VM using Xen). It
is probably better to move them to another partition prior to data
migration in a partition (if you have no option but to move the operating
system). One of the reasons why the Xandros 4 partition moved with no
problems compared to Fedora 6 was the difference in size and complexity
of the file systems. Xandros had only 360 superblocks compared with
approx 1600 in Fedora. The inode size and system in Fedora was also far
more complex. There was no data in Xandros, just a basic OS, as I just
use it for its file manager in super-user mode (the best that I have used)
to move data and OSes between partitions (using the copy and paste
method).

The Knoppix live CD has an excellent set of tools to use; there
is another Knoppix CD called Knoppix S-T-D where there are all sorts of
forensic, network and other tools (people doing the ethical hacking
course at Abertay would find this CD interesting). Finally, I would not
like to go through the above experience again but in some ways I am glad 
it happened as it gave me a steep learning curve in file systems and 
system rescue. I hope this post is helpful to other people who may find 
themselves in a similar situation in the future.

Gordon


_______________________________________________
dundee GNU/Linux Users Group mailing list
dundee at lists.lug.org.uk  http://dundee.lug.org.uk
https://mailman.lug.org.uk/mailman/listinfo/dundee
Chat on IRC, #tlug on dundee.lug.org.uk


       