[Gllug] Pointer arithmetic with void *

Nix nix at esperi.demon.co.uk
Sun Aug 18 19:06:16 UTC 2002


On Fri, 16 Aug 2002, Tethys yowled:
[Alain said:]
>>	memcpy((void*)&f, p, sizeof(f));
>>	i = f.some_member;
> 
> That's not wise at all. Assume p points to a data stream consisting
> of an 8-bit char followed by two 16-bit shorts and a 32-bit int.
> That's 9 bytes. You'd define struct foo as:
> 
> 	struct foo
> 	{
> 		char c;
> 		short s1;
> 		short s2;
> 		int i;
> 	};
> 
> Yet sizeof(f) is almost guaranteed to be larger than 9 on any modern
> CPU, as the compiler will add in padding to get natural alignment.
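
A rough sketch of how you might check that on your own compiler. The
helper names foo_payload_size() and foo_padding() are made up for
illustration; only the struct comes from the example above.

```c
#include <stddef.h>  /* size_t, offsetof */

/* The struct from the example above: 9 bytes of payload on a machine
   with 8-bit char, 16-bit short and 32-bit int. */
struct foo
{
	char c;
	short s1;
	short s2;
	int i;
};

/* Sum of the member sizes, with no padding counted. */
static size_t foo_payload_size(void)
{
	return sizeof(char) + 2 * sizeof(short) + sizeof(int);
}

/* Bytes of padding the compiler added: sizeof minus the payload. */
static size_t foo_padding(void)
{
	return sizeof(struct foo) - foo_payload_size();
}
```

On a common ABI with 2-byte short and 4-byte int, sizeof(struct foo)
comes out as 12, so foo_padding() is 3: one byte after c and two before
i. Other ABIs are free to differ, which is rather the point.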

Further, what about big-endianism versus little-endianism? What about
machines with different integral or float representations?

Either write out textual data or use htonl() and friends to write out
something consistently ordered. (In each case, you must do it field-by-
field, of course.)
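
Something like the following is what I mean by field-by-field. It is
only a sketch: struct foo_rec and foo_pack() are hypothetical names, the
fixed-width types are my assumption about what the stream should hold,
and htons()/htonl() are POSIX (<arpa/inet.h>), not ISO C.

```c
#include <arpa/inet.h>  /* htons(), htonl(): POSIX, not ISO C */
#include <stddef.h>     /* size_t */
#include <stdint.h>     /* fixed-width integer types */
#include <string.h>     /* memcpy */

/* Hypothetical in-memory record; on the wire it is 1 + 2 + 2 + 4 = 9 bytes. */
struct foo_rec
{
	uint8_t  c;
	uint16_t s1;
	uint16_t s2;
	uint32_t i;
};

/* Write *f into buf (at least 9 bytes) one field at a time, in network
   (big-endian) byte order.  Returns the number of bytes written. */
static size_t foo_pack(const struct foo_rec *f, unsigned char *buf)
{
	uint16_t s;
	uint32_t l;
	size_t off = 0;

	buf[off++] = f->c;

	s = htons(f->s1);
	memcpy(buf + off, &s, sizeof s);  /* memcpy: buf + off may be unaligned */
	off += sizeof s;

	s = htons(f->s2);
	memcpy(buf + off, &s, sizeof s);
	off += sizeof s;

	l = htonl(f->i);
	memcpy(buf + off, &l, sizeof l);
	off += sizeof l;

	return off;
}
```

Because every field is converted and copied separately, neither the
padding inside the struct nor the host's byte order leaks into the
output.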

C++ has the advantage here, without a doubt ;}

> In fact, I'm pretty sure the compiler is even free to reorder the
> structure contents as it sees fit to best arrange the data given

That isn't permitted. From the C99 draft that's the best I've got on
this machine (so the numbering may have changed in the final standard):

,----[ 6.5.2.1.12 ]
|        [#12] Within a structure object, the  non-bit-field  members
|        and the units in which bit-fields reside have addresses that
|        increase in the order in which they are declared.  A pointer
|        to  a  structure  object,  suitably converted, points to its
|        initial member (or if that member is a  bit-field,  then  to
|        the unit in which it resides), and vice versa.  There may be
|        unnamed padding within a structure object, but  not  at  its
|        beginning.
`----

(Note that this also states that if a structure is of the form

struct foo { double a; ... };

it is a requirement that, for an object f of that type,
(void *)&f.a == (void *)&f.)
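
If you want to see both guarantees in action, a small sketch follows.
struct bar and the two helper functions are invented for the purpose;
nothing here depends on a particular compiler.

```c
#include <stddef.h>  /* offsetof */

/* Hypothetical struct whose first member is a double. */
struct bar
{
	double a;
	int b;
	char c;
};

/* Nonzero if a pointer to the struct, suitably converted, equals a
   pointer to its first member. */
static int first_member_at_start(const struct bar *p)
{
	return (const void *)p == (const void *)&p->a;
}

/* Nonzero if the members sit at increasing offsets in declaration order,
   with no padding before the first one. */
static int members_in_declared_order(void)
{
	return offsetof(struct bar, a) == 0
	    && offsetof(struct bar, a) < offsetof(struct bar, b)
	    && offsetof(struct bar, b) < offsetof(struct bar, c);
}
```

Both functions should return nonzero on any conforming implementation;
what the standard leaves open is how much padding sits between and
after the members.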

> the alignment requirements (note that if the compiler does this,
> then it's possible to get the desired sizeof(f), but the values
> of the members won't be what you expect...)

Almost every compiler pads unless forced not to for a given object (or
type); indeed, on some CPUs (SPARC, perhaps MIPS), if you don't pad,
you'll take a trap from unaligned accesses unless you jump through
horrible hoops to force the accesses to be aligned despite the
nonalignment of the data.

So not padding is *slow* at best.
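
One of the less horrible hoops, as a sketch: copy the bytes into a
properly aligned local with memcpy() rather than dereferencing a
misaligned pointer. load_u32() is a made-up helper name, and this reads
the value in host byte order, so it does nothing about endianness.

```c
#include <stdint.h>  /* uint32_t */
#include <string.h>  /* memcpy */

/* Read a 32-bit value from any address, aligned or not.  The copy goes
   through unsigned char, so no misaligned uint32_t access ever happens. */
static uint32_t load_u32(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof v);
	return v;
}
```

Compilers for CPUs that tolerate unaligned loads will often turn this
into a single load; on the stricter ones you get byte accesses rather
than a trap.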

> The only safe way to get data into a struct is either one member at
> a time, or by a structure copy (i.e., assignment from another structure
> of the same type).

Agreed.
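
For the 9-byte stream described earlier, one-member-at-a-time might look
roughly like this. I am assuming the stream is big-endian, which the
original example does not say; struct foo_msg, get_be16(), get_be32()
and foo_unpack() are all hypothetical names.

```c
#include <stdint.h>  /* fixed-width integer types */

/* Hypothetical in-memory form of the 9-byte record. */
struct foo_msg
{
	uint8_t  c;
	uint16_t s1;
	uint16_t s2;
	uint32_t i;
};

/* Assemble a big-endian 16-bit value from two bytes. */
static uint16_t get_be16(const unsigned char *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Assemble a big-endian 32-bit value from four bytes. */
static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
	     | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Fill *f from 9 bytes at p, one member at a time.  The offsets are
   those of the stream, not of the struct, so padding is irrelevant. */
static void foo_unpack(struct foo_msg *f, const unsigned char *p)
{
	f->c  = p[0];
	f->s1 = get_be16(p + 1);
	f->s2 = get_be16(p + 3);
	f->i  = get_be32(p + 5);
}
```

Once the struct is filled in, plain assignment (struct foo_msg g = f;)
is the safe way to copy it around.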

>>Some chips are not so forgiving - don't write CPU specific code
> 
> Agreed. Also, s/CPU/compiler/

Unless you have a damn good reason to (e.g. the Linux kernel's
GCC-specificity, and perhaps glibc's too).

(In a sense it's not *so* bad to make code GCC-specific, because at
least GCC is itself portable... but it's still not a good idea unless
you must, nice though some of its extensions[1] are.)


[1] I'm a fan of statement expressions and nested functions personally,
    but alas both play hob with C++ in particular so one isn't implemented
    for C++ and the other might not always work well :( I also liked
    __iterator__, but it was never documented and was eventually removed.
    A shame, it was a nice idea :( but without documentation it had zero
    chance of catching on and eventually landing in an official standards
    document, the way some GCC extensions have in the past.

-- 
`Mips are real and bitrate earnest, shifting spam is not our goal;
 silicon to sand returnest, was not spoken of the soul.'
   --- _Eventful History: Version 1.x_, John M. Ford

-- 
Gllug mailing list  -  Gllug at linux.co.uk
http://list.ftech.net/mailman/listinfo/gllug



