[Fwd: Re: [Nottingham] Deciphering a kernel Ooops]
Martin
martin at ml1.co.uk
Tue Oct 5 16:15:32 BST 2004
Jon Masters wrote:
>
> This one disappeared into the ether (or mailman.lug.org.uk is uppity).
That's well confused my threads... (:-|)
> ------------------------------------------------------------------------
[...]
> Martin wrote:
>
>> Just had this system slowly die just now. Checking the logs, the only
>> anomalies are a few fd0 errors for a dodgy floppy disk from yesterday,
>> and a kernel Ooops from earlier this morning.
>
>
> It oopsed as a result of a call to init_dev within tty_open and I'm
> trying to figure out what could have done this, but I believe there have
> been some bugs in the tty driver recently. What we need to do next is
> the following...
OK. That's answered my first question and the reason for my checking.
Thanks.
> 1). Recompile the offending Mandrake kernel on said box with
> debugging symbols and send me a copy of the generated tty_io.o - that way
> I don't have to guess what the assembly output by ksymoops relates to in
> the original file and can find the exact line number of the fault.
I happen to be playing with two versions of gcc at the moment. Load is
at 4+ (erk) with various stuff... I'll give it a try later this week.
(On overload for now (:-((
> 2). Tell me whether you are using a serial console, or whether there is
> anything else weird about the tty that you are on here.
That depends on what you call a 'serial terminal'...
The original job was run by me many days earlier locally on "Konsole" as
root via the script:
#!/bin/bash
# fah4 background task
# Folding at Home4 protein folding simulations
fah4='/home/bgusr/.fah4/fahcli'
cd "${fah4%/*}" # dirname
fah4="./${fah4##*/}" # basename
# Ensure that only one instance is running
killall "$fah4" >/dev/null 2>&1
su --command="$fah4" bgusr >>/var/log/fah4/fah4.log 2>&1 &
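(For anyone reading the script: the two parameter expansions marked
"dirname" and "basename" split the path without calling the external
dirname and basename tools. A minimal sketch using the same path as the
script; the variable names dir and base are only for illustration and
are not part of the original script.)

```shell
#!/bin/bash
# Same path as in the fah4 script above.
fah4='/home/bgusr/.fah4/fahcli'

# ${var%/*} removes the shortest trailing match of "/*",
# leaving the directory part of the path.
dir="${fah4%/*}"

# ${var##*/} removes the longest leading match of "*/",
# leaving only the file name.
base="${fah4##*/}"

echo "$dir"    # /home/bgusr/.fah4
echo "$base"   # fahcli
```

So the script changes into /home/bgusr/.fah4 and then runs the client as
./fahcli from there.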
That Konsole usually stays up. On this occasion it may have timed out &
closed down (no active job or the active job errored out).
(I have an old graphics tablet active on the com2 serial port, and a PS2
mouse also on its PS2 port.)
>> So who can give a more human translation of the Ooops message?
>
> I'm working on it. It's unlikely to be your disk dying here - the vfs
> checking was just the kernel confirming that you're allowed to open the
> tty that the current process is on, but I'm not sure exactly where it's
> getting a NULL pointer. I'm hoping you're on a serial console.
Thanks for the checks.
Wow! Does this mean I've found a real bug in the kernel???
(:-O)
Cheers,
Martin
--
----------------
Martin Lomas
martin at ml1.co.uk
----------------