[SWLUG] postfix errors, help please

bascule asura at theexcession.co.uk
Fri Jul 21 22:24:29 UTC 2006


On Friday 21 Jul 2006 12:28, Dave Cridland wrote:
> To get a write error, I *think* it has to get an error from write().
> This can be disk full, or it can be a disk error - like a dodgy
> sector as the disk dies. If that's the case, you should find the
> kernel crying into its logs.
>
> Given that the machine later died, and that subsequent mail activity
> happened, and the size of the email wasn't huge, my guess would be
> that the disk needs a careful checking.
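(To check for that kind of thing I assume something like the commands below is
what's meant - the log file names and the 'hda' device name are guesses on my
part:)

  # look for typical IDE disk error messages in the kernel logs
  # (log paths and 'hda' are assumptions, adjust for the machine)
  grep -iE 'hda|I/O error|DriveStatusError|UncorrectableError|SectorIdNotFound' \
      /var/log/messages* /var/log/syslog* 2>/dev/null

  # same filter over the current boot's kernel ring buffer
  dmesg | grep -iE 'hda|error'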
Well, a sample of syslog from last night is:
Jul 20 08:10:36 watson kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000020
Jul 20 08:10:36 watson kernel:  printing eip:
Jul 20 08:10:36 watson kernel: c01fae5a
Jul 20 08:10:36 watson kernel: *pde = 00000000
Jul 20 08:10:36 watson kernel: Oops: 0000 [#1]
Jul 20 08:10:36 watson kernel: Modules linked in: aes floppy nfsd exportfs lockd nfs_acl sunrpc md5 ipv6 tulip af_packet ide_cd cryptoloop loop via_agp agpgart evdev ext3 jbd
Jul 20 08:10:36 watson kernel: CPU:    0
Jul 20 08:10:36 watson kernel: EIP:    0060:[get_index+26/80]    Not tainted VLI
Jul 20 08:10:36 watson kernel: EIP:    0060:[<c01fae5a>]    Not tainted VLI
Jul 20 08:10:36 watson kernel: EFLAGS: 00010297   (2.6.12-22mdk-i586-up-1GB)
Jul 20 08:10:36 watson kernel: EIP is at get_index+0x1a/0x50
Jul 20 08:10:36 watson kernel: eax: c9509580   ebx: cae67ee8   ecx: cae67eec   edx: ffffffd8
Jul 20 08:10:36 watson kernel: esi: c9509580   edi: cae67eec   ebp: cae67ecc   esp: cae67ec8
Jul 20 08:10:36 watson kernel: ds: 007b   es: 007b   ss: 0068
Jul 20 08:10:36 watson kernel: Process curl (pid: 3466, threadinfo=cae66000 task=c0e105c0)
Jul 20 08:10:36 watson kernel: Stack: cb3f0b44 cae67efc c01fb1cb c9509580 00000000 cae67eec cae67ee8 2005f00e
Jul 20 08:10:36 watson kernel:        c4edadc0 20000060 c83d6e20 ca3f0b1c c4edadc0 cae67f18 c019230d c9509580
Jul 20 08:10:36 watson kernel:        ca3f0b44 c950956c c3730074 ca3f0b1c cae67f38 c0194309 ca3f0b1c 0000006f
Jul 20 08:10:36 watson kernel: Call Trace:
Jul 20 08:10:36 watson kernel:  [show_stack+134/208] show_stack+0x86/0xd0
Jul 20 08:10:36 watson kernel:  [<c01042a6>] show_stack+0x86/0xd0
Jul 20 08:10:36 watson kernel:  [show_registers+306/464] show_registers+0x132/0x1d0
Jul 20 08:10:36 watson kernel:  [<c0104442>] show_registers+0x132/0x1d0
Jul 20 08:10:36 watson kernel:  [die+170/304] die+0xaa/0x130
Jul 20 08:10:36 watson kernel:  [<c010461a>] die+0xaa/0x130
Jul 20 08:10:36 watson kernel:  [do_page_fault+538/1781] do_page_fault+0x21a/0x6f5
Jul 20 08:10:36 watson kernel:  [<c011714a>] do_page_fault+0x21a/0x6f5
Jul 20 08:10:36 watson kernel:  [error_code+79/96] error_code+0x4f/0x60
Jul 20 08:10:36 watson kernel:  [<c0103ebf>] error_code+0x4f/0x60
Jul 20 08:10:36 watson kernel:  [prio_tree_remove+59/176] prio_tree_remove+0x3b/0xb0
Jul 20 08:10:36 watson kernel:  [<c01fb1cb>] prio_tree_remove+0x3b/0xb0
Jul 20 08:10:36 watson kernel:  [remove_vm_struct+29/96] remove_vm_struct+0x1d/0x60
Jul 20 08:10:36 watson kernel:  [<c019230d>] remove_vm_struct+0x1d/0x60
Jul 20 08:10:36 watson kernel:  [exit_mmap+233/272] exit_mmap+0xe9/0x110
Jul 20 08:10:36 watson kernel:  [<c0194309>] exit_mmap+0xe9/0x110
Jul 20 08:10:36 watson kernel:  [mmput+35/112] mmput+0x23/0x70
Jul 20 08:10:36 watson kernel:  [<c015fc03>] mmput+0x23/0x70
Jul 20 08:10:36 watson kernel:  [do_exit+201/880] do_exit+0xc9/0x370
Jul 20 08:10:36 watson kernel:  [<c0164099>] do_exit+0xc9/0x370
Jul 20 08:10:36 watson kernel:  [do_group_exit+45/112] do_group_exit+0x2d/0x70
Jul 20 08:10:36 watson kernel:  [<c01643ad>] do_group_exit+0x2d/0x70
Jul 20 08:10:36 watson kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
Jul 20 08:10:36 watson kernel:  [<c0102e29>] syscall_call+0x7/0xb
Jul 20 08:10:36 watson kernel: Code: 83 c4 10 eb c3 6a 35 eb de 90 90 90 90 90 90 90 90 55 89 e5 53 8b 45 08 8b 55 0c 8b 4d 10 8b 5d 14 66 83 78 06 00 74 1e 83 ea 28 <8b> 42 48 89 01 8b 4a 04 8b 42 08 29 c8 8b 4a 48 c1 e8 0c 01 c8

All Greek to me. This continues until about 12:15 am, interspersed with 
apparently successful disk writes by cyrus-imapd. df showed plenty of space, 
though that was only after I rebooted the machine. I notice that every 
instance of these messages has a 'Process curl (pid: 3466, 
threadinfo=cae66000 task=c0e105c0)' entry, where the actual numbers differ 
each time. I have a cron job that runs gotmail regularly, and gotmail uses 
curl, so I'm thinking that's a good candidate; but whether this means a 
faulty disk I can't tell. How would I 'carefully' check the disk, as opposed 
to the automatic check after rebooting with a soft reset?
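I'm guessing 'careful' means something along these lines, but correct me if 
I'm wrong - /dev/hda and /dev/hda1 are guesses at the device names, and 
smartctl needs the smartmontools package:

  # SMART health, attributes and error log for the drive
  smartctl -a /dev/hda
  # start the drive's own long self-test; read the result later with -l selftest
  smartctl -t long /dev/hda

  # forced full filesystem check with a read-only badblocks scan (-c);
  # run from single-user mode or a rescue CD so the filesystem is not mounted
  e2fsck -f -c /dev/hda1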

bascule
-- 
"janet!, dr. scott!, janet!, brad!, rocky!"


