[Gllug] SIGSEGV error

Nix nix at esperi.org.uk
Tue Mar 11 22:16:42 UTC 2008

On 9 Mar 2008, salsaman at xs4all.nl told this:

> Say what ? 512 is the maximum path length on Linux, I don't see where the
> bugs were or how I misunderstood dynamic memory allocation (I presume you
> mean the sprintf). The memset is needed because readlink() does not append
> a  null byte - check the manpage.

OK, review follows, accompanied with links to the relevant standards
where possible (basically that means POSIX; I don't have a non-draft C
Standard because I'm a lazy bastard and never bothered to get the real
thing: POSIX is on the web, so linking to it is trivial).

(I'm giving this far more effort than it arguably deserves because I'm
feeling bored and pedantic. A lot of my more obscure divagations are
intentionally not described completely so that you can go off and try
to hack them up if you want to. That's the way to learn: try! :) )

Your first bug is before you do anything else. What standard are you
conforming to? If you don't say, what you get depends on compiler
arguments (with -ansi, you don't even get POSIX). You're using
readlink(), so presumably you want to conform to POSIX. (There's no
excuse for using Linux-specific references or code in something this

Hence, perhaps,

#define _POSIX_C_SOURCE 200112L

(The man-pages package has a lot of POSIX in man form in the 3p section,
which is generally more convenient than the official web pages.)

| #include <stdio.h>

You would need <string.h> for memset(), were using that sensible (which
it is not; see below).

You need <stdlib.h> for free(), and more to the point for malloc() too.
(From here on I will assume that the appropriate URL for POSIX
references to a given standard library function is obvious.)

strcmp() is in <string.h>, but you probably don't want to do the job
this way anyway, you want to fstat() file descriptors 0, 1, and 2
search for `STDIN'), and compare their st_dev and st_ino fields, which
are guaranteed to be unique to that file, even if they've been deleted

| #include <sys/types.h>

You don't need this; it is included by reference in <unistd.h>.

| #include <unistd.h>
| int main(void) {

Good, you're not ruined by impact with Herb Schildt, `int' it is :)

|  char buf1[512],buf2[512];

That should at the very least use the PATH_MAX definition from
<limits.h> to size its arrays. It's preferable to dynamically allocate
your arrays (using getconf() to determine the size limits), as some
platforms (e.g. the GNU HURD) have no static limit on pathname length
and thus don't define PATH_MAX at all.

Rather than messing around with counting the size of the symlink
returned from readlink() and null-terminating it, it's easier (although
less efficient: who cares in this case?) to zero-fill the memory. You
don't need to call memset() or calloc() to do that:

  char buf1[512] = {0};
  char buf2[512] = {0};

(The C Standard guarantees that subsequent bytes are initialized as they
would be had they been declared `static', i.e., for aggregates like
strings, arrays, and structures, filled with zeroes.)

|  int mypid=(int)getpid();

getpid() is very fast, so this is probably premature optimization.
(Also, C90, but not C99, forbids the use of non-literals in

|  ssize_t size;

(unnecessary, see below)

|  char *string;
|  sprintf(string,"/proc/%d/fd/1",mypid);

Oops. Crash.

`string' is a pointer. Where does it point? Oops, nowhere, it's
uninitialized garbage. Instant coredump. (sprintf() isn't initializing
it, it's *dereferencing* it and writing into the place it points to.
Arrays decay immediately into pointers and the same thing happens to

(One note: `string' is a really bad name for a variable.)

This idiom is the generally-accepted way to allocate memory in C:

if ((string = malloc (/*some number of bytes or a call to sizeof())) == NULL)
    /* OOM error handling */

... however, this doesn't help you in this case, because sprintf() has a
lethal flaw: you can't tell how much space it needs until *after* it's
done its job, which is too late, because it's written to all that
memory. (This is a lie. You can do it either by implementing your own
sprintf() format-string parser, making sure that it works *just* the
same way as the sprintf() parser in your OS, which is of course
impossible in the presence of OS extensions, so you end up having to
reimplement sprintf() as well, or you can do it by anonymously mmap()ing
the area into which sprintf() prints, to ensure that it is surrounded by
space which will trip a SIGSEGV if you write into it, and then catching
the SIGSEGV if it overflows, expanding the buffer, and retrying. This is
really quite annoying but is occasionally worthwhile.)

A slightly less kludgy thing to do is to use snprintf(), which at least
lets you truncate the output:

snprintf (string, PATH_MAX, "/proc/%d/fd/1", mypid);

Unfortunately snprintf() isn't available on some older platforms (I
don't have easy access to any nowadays but I think Solaris 2.6 was one).

btw, you don't need the /proc/self/fd stuff: /dev/stdin, /dev/stdout,
and /dev/stderr are more portable. However, they're not guaranteed to be
symlinks so you can't readlink() them. (Not that you need them in this
case, as I mentioned above.)

|  size=readlink(string,buf1,511);

If you've zero-filled the buffer,

readlink (string, buf1, PATH_MAX);

would be sufficient.

|  memset(buf1+size,0,1);

Unneccesary, pointlessly opaque, and likely inefficient (although recent
GCCs could convert that to a single assignment at compile-time). Try

buf1[size] = 0;


buf1[size] = '\0';

if you want to emphasise that this is a string dammit and contains
characters. C doesn't care which you use).

|  free(string);

A good rule of thumb for avoiding problems with memory allocation in C
is `don't free what you didn't allocate; free that only once'. This is
relevant to things like abstract datatypes implemented as a set of
cooperating functions: if such a set allocated memory, *it* should free
it, not one of its callers or someone else. However, in this case it's
even more relevant as you freed something you never allocated at all!

The best possible result from this is a crash.

[same comments about the second set]
|  if (!strcmp(buf1,buf2)) {
|    printf("The same !\n");
|  } else {
|    printf("Not the same !\n");
|  }

One problem is that this doesn't actually work.

e.g. this script:

| #!/bin/ksh
| touch /tmp/foo
| exec 3</tmp/foo
| rm /tmp/foo
| touch /tmp/foo
| exec 4</tmp/foo
| rm /tmp/foo
| ls -l /proc/$$/fd
| stat -L /proc/$$/fd/3
| stat -L /proc/$$/fd/4

yields this output (blanks inserted for readability):

| total 0
| lrwx------ 1 nix users 64 2008-03-11 22:14 0 -> /dev/pts/17
| lrwx------ 1 nix users 64 2008-03-11 22:14 1 -> /dev/pts/17
| lr-x------ 1 nix users 64 2008-03-11 22:14 10 -> /tmp/blah
| lrwx------ 1 nix users 64 2008-03-11 22:14 2 -> /dev/pts/17
| lr-x------ 1 nix users 64 2008-03-11 22:14 3 -> /tmp/foo (deleted)
| lr-x------ 1 nix users 64 2008-03-11 22:14 4 -> /tmp/foo (deleted)
|   File: `/proc/9970/fd/3'
|   Size: 0               Blocks: 0          IO Block: 4096   regular empty file
| Device: fh/15d  Inode: 4330121     Links: 0
| Access: (0644/-rw-r--r--)  Uid: ( 1000/     nix)   Gid: (  100/   users)
| Access: 2008-03-11 22:14:26.876326543 +0000
| Modify: 2008-03-11 22:14:26.876326543 +0000
| Change: 2008-03-11 22:14:26.878326742 +0000
|   File: `/proc/9970/fd/4'
|   Size: 0               Blocks: 0          IO Block: 4096   regular empty file
| Device: fh/15d  Inode: 4330122     Links: 0
| Access: (0644/-rw-r--r--)  Uid: ( 1000/     nix)   Gid: (  100/   users)
| Access: 2008-03-11 22:14:26.893322348 +0000
| Modify: 2008-03-11 22:14:26.893322348 +0000
| Change: 2008-03-11 22:14:26.895322295 +0000

Oops. They point at the same thing, only they don't.

But the inode number is different; fstat() can distinguish between them.

|  return 0;
| }

`The rest is a tale of post and counter-post.' --- Ian Rawlings
                                                   describes USENET
Gllug mailing list  -  Gllug at gllug.org.uk

More information about the GLLUG mailing list