[Wolves] Text size

Sun May 2 01:14:54 BST 2004

On Friday 30 April 2004 17:52, sparkes wrote:
> On Fri, 2004-04-30 at 14:42, C J Coleman wrote:
> > ASCII is actually 7-bits, but I think it may be stored as one byte.

Correct. Character values 0-127 are ASCII and supposedly standardised, while 
128-255 ("top-bit-set" characters) have basically been a free-for-all into 
which different vendors/programmers have shoved all kinds of weird and 
wonderful symbols. Two common uses are accented characters for non-English 
languages, and line-drawing characters for dividing the screen into boxes, 
etc. in character-mode applications (e.g. DOS applications).
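
To make the 7-bit point concrete, here is a minimal sketch of testing whether 
a buffer contains only ASCII. The function name is_all_ascii is made up for 
illustration; the only real assumption is that a set top bit (0x80) means 
"not ASCII", as described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if every byte in buf lies in the
 * 7-bit ASCII range 0-127, i.e. none of them has its top bit set. */
bool is_all_ascii(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] & 0x80) {
            /* Top bit set: the meaning depends on the vendor/code page. */
            return false;
        }
    }
    return true;
}
```

Taking unsigned char here sidesteps the question of whether plain char is 
signed, which would otherwise make top-bit-set values compare as negative.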

> > The C 'char' primitive is one byte, with the positive values of a
> > 'signed char' being ASCII.  Hope that helps,
>
> I hate to be pedantic but a C char shouldn't be considered to be any fixed
> bit size all things being relative char is only normally a short int.
> The only thing you can be sure of is that sizeof(char) will return 1
> whatever *real* size it has because the specification says it should.
> Short will be at least 16 bits and sizeof(short) might be 2, long will
> be at least 32 bits and int will be no shorter than short and no longer
> than long ;-)
>
> So on your 32 bit intel machine sizeof(int) and sizeof(long) will
> probably both return 4.

Unless you're using 16-bit DOS/Windows compilers, where an int is 16 bits and 
a long is 32 bits, betraying that platform's archaic origins. :)
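
Rather than guessing, you can ask the compiler. The following is only a 
sketch, with function names invented for this post; it relies on nothing 
beyond <limits.h> and the minimum ranges the C standard promises.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if the compiler provides the minimum
 * ranges the C standard requires for short, int and long. */
int meets_c_minimums(void)
{
    return sizeof(char) == 1          /* true by definition */
        && SHRT_MAX >= 32767          /* short: at least 16 bits */
        && INT_MAX >= SHRT_MAX        /* int: no narrower than short */
        && LONG_MAX >= INT_MAX        /* long: no narrower than int */
        && LONG_MAX >= 2147483647L;   /* long: at least 32 bits */
}

/* Hypothetical helper: returns 1 where int and long have the same size
 * (typical of 32-bit x86 compilers), 0 where they differ (e.g. a
 * 16-bit DOS compiler with 16-bit int and 32-bit long). */
int int_and_long_same_size(void)
{
    return sizeof(int) == sizeof(long);
}
```

The first function should return 1 on any conforming compiler; the second is 
the one whose answer actually varies between platforms.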

> Because int is normally the natural word size of the machine, and your
> intel hardware is 32 bit which if you don't look at the odd examples
> should be right but probably isn't. 16, 32 and 64 bits are common with
> 128 bit specialist processors becoming common (your graphics card might
> use 128bit processors and local buses), 24 and 36 very, very rare but if
> they can exist then anything can in the future ;-)

Such unusual combinations do exist in antiquated machines, as do bytes larger 
or smaller than 8 bits. These days, the 8-bit byte (or "octet") is more or 
less standard.

See definition #1 at http://en.wikipedia.org/wiki/Byte
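
In C, the number of bits in a byte is available as CHAR_BIT from <limits.h>, 
and the standard requires it to be at least 8, so sizeof() results can be 
turned into bit counts without assuming octets. A small sketch follows; the 
function names are just ones I picked for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: converts a size in bytes (as returned by sizeof)
 * into a size in bits, without assuming that a byte is 8 bits. */
size_t bits_in_bytes(size_t nbytes)
{
    return nbytes * CHAR_BIT;
}

/* Hypothetical helper: returns 1 if a byte on this platform is an octet. */
int byte_is_octet(void)
{
    return CHAR_BIT == 8;
}
```

For example, bits_in_bytes(sizeof(int)) gives the storage width of an int in 
bits on whatever machine the code is compiled for.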

> Some machines also need their data lined up in the memory so char,
> short, int and long might all be the natural wordsize (and multiples
> thereof) on such hardware.

I'm not aware offhand of a specific example of them all being the natural word 
size, but it is common for extra "padding" bytes to be inserted into C/C++ 
structs so that their members line up with alignment boundaries. As a result, 
sizeof() the struct can be larger than the sizeof() of all its members added 
up. Under GCC you can disable this behaviour using something like:

struct whatever {
	// Stuff
} __attribute__((packed));

though I've never been in a situation where I've used it myself.
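
For what it's worth, the effect is easy to measure with sizeof() and 
offsetof(). This is an untested-in-anger sketch assuming GCC (or a compiler 
that understands the same attribute); the struct and function names are made 
up, and the exact amount of padding depends on the platform.

```c
#include <stddef.h>

/* Ordinary struct: the compiler may insert padding after 'tag' so that
 * 'value' starts on a suitably aligned address. */
struct padded {
    char tag;
    int  value;
};

/* Same members, but GCC is told not to insert any padding. */
struct packed_version {
    char tag;
    int  value;
} __attribute__((packed));

/* Hypothetical helper: number of padding bytes in struct padded. */
size_t padding_in_padded(void)
{
    return sizeof(struct padded) - (sizeof(char) + sizeof(int));
}
```

On a typical 32-bit x86 build I would expect padding_in_padded() to be 3 and 
sizeof(struct packed_version) to be 5, but only the packed size is something 
you can rely on. The price of packing is that 'value' may be misaligned, which 
is slower on some processors and not permitted at all on others unless the 
compiler generates byte-by-byte accesses.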

James