[Fwd: Re: [Nottingham] Deciphering a kernel Ooops]

Martin martin at ml1.co.uk
Tue Oct 5 16:15:32 BST 2004


Jon Masters wrote:
> 
> This one disappeared into the ether (or mailman.lug.org.uk is uppity).

That's well confused my threads... (:-|)


> ------------------------------------------------------------------------
[...]
> Martin wrote:
> 
>> Just had this system slowly die just now. Checking the logs, the only 
>> anomalies are a few fd0 errors for a dodgy floppy disk from yesterday, 
>> and a kernel Ooops from earlier this morning.
> 
> 
> It oopsed as a result of a call to init_dev within tty_open. I'm 
> trying to figure out what could have caused this; I believe there have 
> been some bugs in the tty driver recently. What we need to do next is 
> the following...

OK. That's answered my first question and the reason for my checking. 
Thanks.


>     1). Recompile the offending Mandrake kernel on said box with 
> debugging symbols and send me a copy of the generated tty_io.o - that 
> way I don't have to guess what the assembly output by ksymoops relates 
> to in the original file and can find the exact line number of the fault.

I happen to be playing with two versions of gcc at the moment, and the 
load is at 4+ (erk) with various stuff, so I'm on overload for now (:-( 
I'll give it a try later this week.


>     2). Tell me whether you are using a serial console or have anything 
> else weird about the tty that you are on here (serial console?).

That depends on what you call a 'serial console'...

The original job was run by me many days earlier locally on "Konsole" as 
root via the script:

#!/bin/bash

# fah4 background task
# Folding at Home4 protein folding simulations
fah4='/home/bgusr/.fah4/fahcli'

cd "${fah4%/*}"         # dirname
fah4="./${fah4##*/}"    # basename

# Ensure that only one instance is running
killall "$fah4" >/dev/null 2>&1

# Run the client as the unprivileged user bgusr, in the background
su --command="$fah4" bgusr >>/var/log/fah4/fah4.log 2>&1 &
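
For anyone puzzling over the two parameter expansions in that script, here 
is a minimal sketch of what they evaluate to. The variable names path, dir, 
base and rel are made up for this illustration and are not in the script 
itself; nothing is killed or started here.

```shell
#!/bin/bash
# Illustration only: same path as in the script, but nothing is run,
# changed into, or killed.
path='/home/bgusr/.fah4/fahcli'

dir="${path%/*}"      # drop the shortest suffix matching "/*": the directory
base="${path##*/}"    # drop the longest prefix matching "*/": the file name
rel="./${base}"       # the form the script hands to killall and su

printf 'dir=%s\nbase=%s\nrel=%s\n' "$dir" "$base" "$rel"
```

So after the cd, the script refers to the client as ./fahcli relative to 
/home/bgusr/.fah4.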


That Konsole usually stays up. On this occasion it may have timed out & 
closed down (no active job or the active job errored out).

(I have an old graphics tablet active on the com2 serial port, and a PS/2 
mouse on the PS/2 port.)


>> So who can give a more human translation of the Ooops message?
> 
> I'm working on it. It's unlikely to be your disk dying here - the vfs 
> checking was just the kernel confirming that you're allowed to open the 
> tty that the current process is on, but I'm not sure exactly where it's 
> getting a NULL pointer. I'm hoping you're on a serial console.

Thanks for the checks.

Wow! Does this mean I've found a real bug in the kernel???
(:-O)


Cheers,
Martin

-- 
----------------
Martin Lomas
martin at ml1.co.uk
----------------


