v10.9, 03 May 2003
Abstract
This document discusses methods to recover from Linux system failures. Various reasons for linux system failures can be - LILO is destroyed, or linux fails to boot, or Master Boot Record (MBR) is damaged or linux fails to boot when another operating system like Windows NT is installed which erases LILO or MBR.
Table of Contents
(The latest version of this document is at "http://www.milkywaygalaxy.freeservers.com" . You may want to check there for changes).
You cannot avoid accidents and if it happens to linux systems then it may damage the master boot record (MBR) or LILO (Linux boot Loader). There may be cases where linux will not boot due to hard disk failures. The LILO may also fail if you accidentally re-partition the hard disk or you install another additional operating system like Windows 98/NT on the linux computer.
This document gives you some ideas, tips and quick guide to recover fast without wading through hundreds of pages of documentation on LILO or Linux.
To recover any Windows 95/NT/2000, OS/2, BeOS or Linux box you may need the tiny linux which fits on a single floppy disk. See the list of tiny floppy linux given below -
Linuxcare Bootable Toolbox "http://lbt.linuxcare.com"
LNX-BBC "http://lnxbbc.org"
Tom's Root and Boot "the most GNU/Linux on one floppy disk" "http://www.toms.net/rb"
"Innominate-Linux-Rescue-System" - a CD-based rescue system "http://innominate.org/projects/rescueCD"
Irish Linux Users Group (ILUG) BBC: A bootable CD ISO "ftp://ftp.blackstar.co.uk/pub/ILUG/"
Lubbock: Meant to be the "Tom's Root Boot" of CDs "http://lubbock.sourceforge.net"
Repairlix: a networked Linux distribution/bootable system intended to fit in 12MB of media. "http://repairlix.sourceforge.net"
SuperRescue: a Red Hat-based bootable CD "http://www.kernel.org/pub/dist/superrescue"
Timo's Rescue CD Set: an easy way of generating el Torito Boot Cd's "http://rescuecd.sourceforge.net"
It is a good idea to backup the important system files like /etc/fstab, /etc/lilo.conf after you login using Tomsrtbt floppy or RedHat Linux CDROM (Rescue option) in following sections. This can be very handy during crash situation or something happens to system files.
bash# cp /etc/fstab /etc/fstab.orig bash# cp /etc/lilo.conf /etc/lilo.conf.orig bash# cp /etc/hosts /etc/hosts.orig bash# cp /etc/hosts.allow /etc/hosts.allow.orig bash# cp /etc/hosts.deny /etc/hosts.deny.orig bash# cp /etc/inetd.conf /etc/inetd.conf.orig bash# cp /etc/inittab /etc/inittab.orig bash# cp /etc/networks /etc/networks.orig
Most of the distributions like RedHat, SUSE, Debian provide CDROM which have "Rescue" option. For this, you have should set the BIOS of your computer to boot first from IDE CDROM drive. Usually you set the BIOS (using F8 key during boot) to boot first from CDROM drive, second from Floppy drive and third from hard disk. Load the Linux cdrom into the CD drive and reboot the system. The Linux distribution will load and at the prompt select "Rescue Operation". In the resuce operation mount the hard disks and try to repair.
# chroot /mnt/SYSIMAGE # df
After doing chroot, the system will look as if you had booted the system from hard disk. You can see all the partitions and you can repair or recover the files.
Follow these steps to recover from LILO or system failures.
SCENE 1: If your system does not boot -
Get the tomsrtbt floppy "http://www.toms.net/rb" or MuLinux floppy (see also Section 1.1, “ Tiny Floppy Linux ” ). Boot with tomsrtbt floppy Use fdisk to find the partitions. Try to recognise the root and boot partition. Watch out, you may be having the /boot files on the root partition itself.
The Linux's root partition has the following directories bin , boot , etc , usr .
And the Linux's boot partition has these directories: vmlinuz , boot.b , chain.b , map .
To find out root partition do this :
bash# fdisk /dev/hda Command (m for help): m (Gives you help on commands) Command (m for help): p (Gives you list of partitons) Command (m for help): q bash# mkdir /test bash# mount /dev/hda1 /test bash# ls /test You should see root-partition list like this - bin fd lib mnt proc sbin usr boot dev etc home lost+found opt root tmp var
If this is not a root partition, then try the next partition /dev/hda2. Keep trying hda3, hda4, hda5, etc.. untill you find the root partition. If you do not find root partition in hda device then repeat the above steps for other hard disk devices like hdb , hdc , hdd etc..
Next, you should find the /boot, /usr and /var partitions. The disk locations of these partitions are needed to create the new lilo configuration.
In my case the root partition is /dev/hda4 which is used in the examples below:
bash# mkdir /rootpartition bash# mount /dev/hda4 /rootpartition bash# cat /rootpartition/etc/fstab Read the output of fstab and mount partitions as per fstab file, see below - bash# mount /dev/hda5 /rootpartition/boot bash# mount /dev/hda6 /rootpartition/usr bash# mount /dev/hda7 /rootpartition/var bash# mount /dev/hda8 /rootpartition/opt bash# mount /dev/hda9 /rootpartition/root bash# mount /dev/hda10 /rootpartition/home
In my case, as per fstab file hda5 was boot , hda6 was usr , hda7 was var , hda8 was opt , hda9 was root , hda10 was home and hda11 was windows95 (FAT16 partition).
Edit /etc/fstab (not /rootpartition/etc/fstab) and put (sample code given here) -
/dev/hda4 /rootpartition ext2 defaults 1 1 /dev/hda5 /rootpartition/boot ext2 defaults 1 1 /dev/hda6 /rootpartition/usr ext2 defaults 1 1 /dev/hda7 /rootpartition/var ext2 defaults 1 1 /dev/hda8 /rootpartition/opt ext2 defaults 1 1 /dev/hda9 /rootpartition/root ext2 defaults 1 1 /dev/hda10 /rootpartition/home ext2 defaults 1 1 /dev/hda11 /rootpartition/win95part vfat defaults 1 1 On my computer hda4 contains the linux root partition, hda5 had boot partition and hda11 has windows 95 vfat system. bash# mkdir /rootpartition/win95part bash# mount /rootpartition/win95part And repair the problem partitions using fsck or e2fsck commands. bash# man fsck bash# man e2fsck
SCENE 2: If LILO is not working..
Follow scene 1 above, if that fails then follow these steps. After executing steps in scene 1 above, you should have already mounted /rootpartition and have created /etc/fstab file.
Note: It is very important to note how chroot command works below. The /sbin/lilo file which chroot uses is actually located in /rootpartition/sbin/lilo and NOT in /sbin!! Hence, do not get confused.
bash# mount -a bash# chroot /rootpartition /sbin/lilo -q bash# man chroot bash# chroot /rootpartition /sbin/lilo
Note: New users of chroot will be confused. If chroot command complains that it cannot find /boot/map file then it actually means it that it cannot find /rootpartition/boot/map. Because you gave /rootpartition as the first argument to chroot and all references are with respect to /rootpartition.
Alternatively, you can directly use /sbin/lilo instead of chroot. The -r option of lilo actually does chroot. It is very strongly recommended that you use chroot, instead of lilo -r, as it is more convenient and can catch errors more easily.
bash# man lilo bash# /sbin/lilo -r /rootpartition
SCENE 3: If LILO is not working..
If scene 1 and 2 failes, then if you made the boot disk with 'mkbootdisk' (during install or by using 'man mkbootdisk'), boot with it and repair your partitions. The mkbootdisk is in mkbootdisk*.rpm package, you must install this. Or get boot disks for Linux/NT/Windows/DOS/Mac are at "http://www.bootdisk.com" Other option is - get a hold of installation Linux-CDROM. Just about every Linux distribution provides a image of a rescue disk on their CD. Under Linux use "dd if=/cdrom/disks/rescue of=/dev/fd0" to create a rescue floppy disk. Under DOS use rawrite.exe (included on Linux CD) and then do "rawrite image-name a:".
SCENE 4: If 1, 2 and 3 above fails and you do not have boot disk
If you have another computer running linux, then login as root and do -
Note: If you compile your own kernel as a bzImage (for instance, bzImage-2.4.4), then you should create a hard link to vmlinuz-2.4.4 as follows (note the the z in name vmlinuz and it is not vmlinux). If you do not do this then mkbootdisk command may fail.
bash# cd /boot bash# ls -l vmlinuz* bash# ln /boot/bzImage-2.4.4 /boot/vmlinuz-2.4.4
Now that you have bzImage and vmlinuz, give these commands -
bash$ man mkbootdisk bash# cp /etc/lilo.conf /etc/lilo-original.conf
Edit the /etc/lilo.conf and put the root partition name as you obtained in 'scene 1' above and insert a blank floppy and give -
bash$ mkbootdisk --device /dev/fd0 2.2.12-20
The mkbootdisk is in mkbootdisk*.rpm package, you must install this. Make sure you move the /etc/lilo-original.conf back to /etc/lilo.conf!! And then take this floppy and goto scene 3
SCENE 5: This is the worst scenerio and hopefully you will never come to this stage. Scenes from 1 to 4 will take care of majority of cases. But just in case, all the above scenes 1, 2, 3 and 4 fail then -
Step 1: Boot tomsrtbt (see Section 1.1, “ Tiny Floppy Linux ” ) and mount the partitions and backup the root partition to another partition having disk space with comamnds -
Edit /etc/fstab and put (sample code given here, you may have to change as per your disk layout) - /dev/hda4 /rootpartition ext2 defaults 1 1 /dev/hda11 /b1 vfat defaults 1 1 bash$ mkdir /rootpartition; mount /rootpartition bash$ mkdir /b1; mount /b1 bash$ cd / bash$ df And see that there is enough disk space in /b1 to tar up the root partition bash$ tar cvf /b1/root-hda4.tar /rootpartition
Step 2: Insert Linux cdrom, reboot and install the redhat linux on /dev/hda4 (but DO NOT install any extra packages, you just need to install only the root, boot systems and LILO manager that is, a very bare minimum). This will also install the LILO on hard disk. Boot linux now and login as root and do -
bash$ man mkbootdisk bash# cp /etc/lilo.conf /etc/lilo-original.conf
Note: You MUST remember to copy back lilo-original.conf to lilo.conf!! Edit the /etc/lilo.conf and put the root partition name as you obtained in 'scene 1' above and insert a blank floppy and give -
bash$ mkbootdisk --device /dev/fd0 2.2.12-20 bash# cp /etc/lilo-original.conf /etc/lilo.conf
Test this boot floppy to see that this works and then restore back the all the files which you backedup using tar on /b1/root-hda4.tar as in step 1 above.
You should take the following pre-cautionary measures to tackle the problems in future.
You MUST make emergency boot disk from time to time and whenever you make changes to the partition. Insert a blank floppy and do this -
bash$ man mkbootdisk The mkbootdisk is in mkbootdisk*.rpm package, you must install this. bash$ mkbootdisk --help bash$ mkbootdisk --device /dev/fd0 2.2.12-20
You MUST backup the partition tables setup to a floppy and to a hard disk. You should also print this out and paste it on the computer box.
bash$ su - root bash# man fdisk bash# fdisk -l /dev/sda > partition_table_backup.txt
Very helpful if you need to repartition the hard disk. From the printout, you would know where your partition starts. During recovery, after repatitioning and formating you can restore data from the backup.
You must keep the tomsrtbt boot floppy handy. Visit "http://www.toms.net/rb" (see also Section 1.1, “ Tiny Floppy Linux ” )
You must keep the Yard rescue and boot floppy disk handy. Visit "http://www.linuxlots.com/~fawcett/yard"
Backup /root and /boot directories. Boot the Tomsrtbt floppy (see also Section 1.1, “ Tiny Floppy Linux ” ) and then
bash# vi /etc/fstab And put these lines - /dev/hda1 /a1 vfat defaults 1 1 /dev/hdb1 /b1 vfat defaults 1 1 In my case hda1 had the linux root partition '/' bash# cd / bash# tar cvf /b1/linux-root-partition-hda1.tar a1 bash# tar cvf /b1/linux-boot-partition-hda1.tar a1/boot
You can replace the boot sector with the DOS boot loader by issuing the DOS command at MS DOS prompt:
FDISK /MBR
where MBR stands for "Master Boot Record".
See also LILO documentation on linux at /usr/doc/lilo* for other methods of uninstalling the LILO. And see also 'man lilo'.
If the disk partition is corrupted then use these techniques given below:
The following tools are available:
rescuept by Andries Brouwer which is included in the non-installed part of util-linux recuept-download
Gordon Chaffees fixdisktable "http://bmrc.berkeley.edu/people/chaffee/fat32.html" and mirror site
Christophe Greniers Testdisk "http://www.cgsecurity.org/index.html?testdisk.html" and mirror site
GPart from "http://www.stud.uni-hannover.de/user/76201/gpart" .
See also "http://www.tldp.org/HOWTO/mini/Partition-Rescue/index.html"
If you're trying to rescue a system with a corrupted partition table on the main (boot) disk which is unable to boot you have two options:
the easiest way is to find a working system where you can add your disk. In case the other system cannot report the disk's correct geometry, note down its geometry prior to moving to the working system and tell gpart about it (use the "-C c,h,s" option).
download the gpart binary above, rename it to "gpart", store it on a floppy disk, print out the manual page above, and start your system by using your prefered boot disk.
After booting, look if your hard disk has been detected by your system by entering a shell and typing "dmesg". Under e.g. Linux you should look out for lines like "hdc: ST320430A, 19569MB w/512kB Cache, CHS=39761/16/63". If you have booted with a rescue disk mount the floppy disk with gpart on, and cd to the mount point.
# To read the online manual page on gpart do : bash$ man gpart bash$ gpart --help gpart: invalid option -- - Usage: gpart [options] device Options: [-b <backup MBR>][-C c,h,s][-c][-d][-E][-e][-f][-g][-h][-i] [-K <last sector>][-k <# of sectors>][-L][-l <log file>] [-n <increment>][-q][-s <sector-size>][-t <module-name>] [-V][-v][-W <device>][-w <module-name,weight>] gpart v0.1h (c) 1999-2001 Michail Brzitwa <michail@brzitwa.de>. Guess PC-type hard disk partitions.
Now run "gpart /dev/[lt ]your disk[gt ]", e.g. "gpart /dev/hdc". Without any options, gpart performs a standard scan, and merely looks if it can guess a consistent primary partition table. A typical positive output looks like:
Begin scan... Possible partition(DOS FAT), size(3999mb), offset(0mb) Possible extended partition at offset(4000mb) Possible partition(Windows NTFS), size(3999mb), offset(4000mb) Possible partition(Linux ext2), size(3072mb), offset(8000mb) Possible partition(Linux ext2), size(3072mb), offset(11072mb) Possible partition(Linux ext2), size(3072mb), offset(14144mb) Possible partition(Linux ext2), size(2353mb), offset(17216mb) End scan. Checking partitions... Partition(DOS or Windows 95 with 32 bit FAT): primary Partition(OS/2 HPFS, NTFS, QNX or Advanced UNIX): logical Partition(Linux ext2 filesystem): logical Partition(Linux ext2 filesystem): logical Partition(Linux ext2 filesystem): primary Partition(Linux ext2 filesystem): primary Ok. Guessed primary partition table: Primary partition(1) type: 011(0x0B)(DOS or Windows 95 with 32 bit FAT) size: 3999mb #s(8191953) s(63-8192015) chs: (0/1/1)-(1023/15/63)d (0/1/1)-(8126/15/63)r Primary partition(2) type: 005(0x05)(Extended DOS) size: 10144mb #s(20775888) s(8192016-28967903) chs: (1023/15/63)-(1023/15/63)d (8127/0/1)-(28737/15/63)r Primary partition(3) type: 131(0x83)(Linux ext2 filesystem) size: 3072mb #s(6291456) s(28967904-35259359) chs: (1023/15/63)-(1023/15/63)d (28738/0/1)-(34979/8/24)r Primary partition(4) type: 131(0x83)(Linux ext2 filesystem) size: 2353mb #s(4819248) s(35259840-40079087) chs: (1023/15/63)-(1023/15/63)d (34980/0/1)-(39760/15/63)r
Now if after the check-phase it says Ok, you should check the proposed partition table very carefully. After that you may write back the guessed table by calling "gpart -W /dev/hdc /dev/hdc" (exchange /dev/hdc with your disk device). When gpart has successfully written the new primary partition table, cross your fingers and reboot.
If gpart says it found inconsistencies, you are a bit on your own. What you can do is to fiddle with gparts numerous options. For example, to scan on sector boundaries instead of head boundaries, give it the "-n s" option. Normally gpart skips the disk space a possible partition seems to occupy, to really scan the whole disk, add the "-f" option. Read the man page and improvise.
The gpart tries to guess which partitions are on a hard disk. If the primary partition table has been lost, overwritten or destroyed the partitions still exist on the disk but the operating system cannot access them.
gpart ignores the primary partition table and scans the disk (or disk image, file) sector after sector for several filesystem/partition types. It does so by "asking" filesystem recognition modules if they think a given sequence of sectors resembles the beginning of a filesystem or partition type. Currently the following filesystem types are known to gpart (listed by module names) : beos, bsddl, ext2, fat, hmlvm, lswap, minix, ntfs, qnx4, rfs, s86dl, xfs.
Visit following locators which are related to LILO, Rescue Linux, crash recovery -
Mini Lilo HOWTO at "http://www.tldp.org/HOWTO/mini/LILO.html"
Bootdisk-HOWTO at "http://www.tld.org/LDP/HOWTO/Bootdisk-HOWTO/index.html"
Pre-made boot disks at "http://www.tldp.org/HOWTO/Bootdisk-HOWTO"
Boot disks for Linux/NT/Windows/DOS/Mac at "http://www.bootdisk.com"
Tomsrtbt boot floppy disk "http://www.toms.net/rb" and (see also Section 1.1, “ Tiny Floppy Linux ” )
Yard rescue and boot floppy disk "http://www.linuxlots.com/~fawcett/yard"
BootPrompt-HOWTO at "http://www.tldp.org/HOWTO/BootPrompt-HOWTO.html"
Multiboot with LILO mini HOWTO at "http://www.tldp.org/HOWTO/mini/Multiboot-with-LILO.html"
Linux+WinNT mini HOWTO at "http://www.tldp.org/HOWTO/mini/Linux+WinNT.html"
Linux goodies main site "http://www.milkywaygalaxy.freeservers.com" Mirror sites are at - "http://aldev0.webjump.com" , angelfire , geocities , virtualave , 50megs , theglobe , NBCi , Terrashare , Fortunecity , Freewebsites , Tripod , Spree , Escalix , Httpcity , Freeservers .
Vim color text editor for C++, C "http://www.tldp.org/LDP/HOWTO/Vim-HOWTO.html"
This document is published in 14 different formats namely - DVI, Postscript, Latex, Adobe Acrobat PDF, LyX, GNU-info, HTML, RTF(Rich Text Format), Plain-text, Unix man pages, single HTML file, SGML (Linuxdoc format), SGML (Docbook format), MS WinHelp format.
This howto document is located at -
"http://www.tldp.org" and click on HOWTOs and search for howto document name using CTRL+f or ALT+f within the web-browser.
You can also find this document at the following mirrors sites -
Other mirror sites near you (network-address-wise) can be found at "http://www.tldp.org/mirrors.html" select a site and go to directory /LDP/HOWTO/xxxxx-HOWTO.html
You can get this HOWTO document as a single file tar ball in HTML, DVI, Postscript or SGML formats from - "ftp://www.tldp.org/pub/Linux/docs/HOWTO/other-formats/" and "http://www.tldp.org/docs.html#howto"
Plain text format is in: "ftp://www.tldp.org/pub/Linux/docs/HOWTO" and "http://www.tldp.org/docs.html#howto"
Single HTML file format is in: "http://www.tldp.org/docs.html#howto"
Single HTML file can be created with command (see man sgml2html) - sgml2html -split 0 xxxxhowto.sgml
Translations to other languages like French, German, Spanish, Chinese, Japanese are in "ftp://www.tldp.org/pub/Linux/docs/HOWTO" and "http://www.tldp.org/docs.html#howto" Any help from you to translate to other languages is welcome.
The document is written using a tool called "SGML-Tools" which can be got from - "http://www.sgmltools.org" Compiling the source you will get the following commands like
sgml2html xxxxhowto.sgml (to generate html file)
sgml2html -split 0 xxxxhowto.sgml (to generate a single page html file)
sgml2rtf xxxxhowto.sgml (to generate RTF file)
sgml2latex xxxxhowto.sgml (to generate latex file)
PDF file can be generated from postscript file using either acrobat distill or Ghostscript . And postscript file is generated from DVI which in turn is generated from LaTex file. You can download distill software from "http://www.adobe.com" . Given below is a sample session:
bash$ man sgml2latex bash$ sgml2latex filename.sgml bash$ man dvips bash$ dvips -o filename.ps filename.dvi bash$ distill filename.ps bash$ man ghostscript bash$ man ps2pdf bash$ ps2pdf input.ps output.pdf bash$ acroread output.pdf &
Or you can use Ghostscript command ps2pdf . ps2pdf is a work-alike for nearly all the functionality of Adobe's Acrobat Distiller product: it converts PostScript files to Portable Document Format (PDF) files. ps2pdf is implemented as a very small command script (batch file) that invokes Ghostscript, selecting a special "output device" called pdfwrite . In order to use ps2pdf, the pdfwrite device must be included in the makefile when Ghostscript was compiled; see the documentation on building Ghostscript for details.
This document is written in linuxdoc SGML format. The Docbook SGML format supercedes the linuxdoc format and has lot more features than linuxdoc. The linuxdoc is very simple and is easy to use. To convert linuxdoc SGML file to Docbook SGML use the program ld2db.sh and some perl scripts. The ld2db output is not 100[percnt] clean and you need to use the clean[lowbar]ld2db.pl perl script. You may need to manually correct few lines in the document.
Download ld2db program from "http://www.dcs.gla.ac.uk/~rrt/docbook.html" or from Milkyway Galaxy site
Download the cleanup[lowbar]ld2db.pl perl script from Milkyway Galaxy site
The ld2db.sh is not 100[percnt] clean, you will get lots of errors when you run
bash$ ld2db.sh file-linuxdoc.sgml db.sgml bash$ cleanup.pl db.sgml > db_clean.sgml bash$ gvim db_clean.sgml bash$ docbook2html db.sgml
And you may have to manually edit some of the minor errors after running the perl script. For e.g. you may need to put closing tag < /Para> for each < Listitem>
You can convert the SGML howto document to Microsoft Windows Help file, first convert the sgml to html using:
bash$ sgml2html xxxxhowto.sgml (to generate html file) bash$ sgml2html -split 0 xxxxhowto.sgml (to generate a single page html file)
Then use the tool HtmlToHlp . You can also use sgml2rtf and then use the RTF files for generating winhelp files.
In order to view the document in dvi format, use the xdvi program. The xdvi program is located in tetex-xdvi*.rpm package in Redhat Linux which can be located through ControlPanel [verbar] Applications [verbar] Publishing [verbar] TeX menu buttons. To read dvi document give the command -
xdvi -geometry 80x90 howto.dvi man xdvi
And resize the window with mouse. To navigate use Arrow keys, Page Up, Page Down keys, also you can use 'f', 'd', 'u', 'c', 'l', 'r', 'p', 'n' letter keys to move up, down, center, next page, previous page etc. To turn off expert menu press 'x'.
You can read postscript file using the program 'gv' (ghostview) or 'ghostscript'. The ghostscript program is in ghostscript*.rpm package and gv program is in gv*.rpm package in Redhat Linux which can be located through ControlPanel [verbar] Applications [verbar] Graphics menu buttons. The gv program is much more user friendly than ghostscript. Also ghostscript and gv are available on other platforms like OS/2, Windows 95 and NT, you view this document even on those platforms.
Get ghostscript for Windows 95, OS/2, and for all OSes from "http://www.cs.wisc.edu/~ghost"
To read postscript document give the command -
gv howto.ps ghostscript howto.ps
You can read HTML format document using Netscape Navigator, Microsoft Internet explorer, Redhat Baron Web browser or any of the 10 other web browsers.
You can read the latex, LyX output using LyX a X-Windows front end to latex.
Copyright policy is GNU/GPL as per LDP (Linux Documentation project). LDP is a GNU/GPL project. Additional requests are that you retain the author's name, email address and this copyright notice on all the copies. If you make any changes or additions to this document then you please intimate all the authors of this document. Brand names mentioned in this document are property of their respective owners.