[Wolves] Text size

Chris Procter Chris at foxonline.co.uk
Sat May 1 23:12:59 BST 2004


IIRC, in Java a char is always 16 bits (Unicode and all that).

All other things being equal a char is 8 bits. Standard ASCII uses 7 bits,
giving 128 letters, numbers, punctuation marks and control characters from
0 (NUL) through to 127 (DEL), with the 8th bit either spare or used as a
parity bit to check integrity. In practice a character always takes 8 bits,
i.e. one byte, which is the smallest unit of memory most modern computers
can address (the word size is usually larger, typically 32 or 64 bits). So
while you could compress plain text documents to 7/8ths of their size by
only using standard ASCII, the mechanics of doing it would be horrendous,
far harder than implementing zip.
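
To put a number on the 7/8ths figure, here is a rough Python sketch of
packing 7-bit characters into a continuous bit stream. The names pack7 and
unpack7 are made up for this example, and it assumes the input is pure
7-bit ASCII. The packing itself is short; the awkward part is that
characters no longer start on byte boundaries, so you cannot seek to or
edit a character without shifting bits around.

```python
def pack7(text: str) -> bytes:
    """Pack 7-bit ASCII characters into a contiguous bit stream."""
    data = text.encode("ascii")  # raises UnicodeEncodeError on non-ASCII input
    acc = 0     # bit accumulator
    nbits = 0   # number of valid bits currently held in acc
    out = bytearray()
    for b in data:
        acc = (acc << 7) | b
        nbits += 7
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1  # keep only the bits not yet written
    if nbits:
        # Pad the final partial byte with zero bits on the right.
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack7(packed: bytes, count: int) -> str:
    """Reverse pack7; count is the original number of characters."""
    acc = 0
    nbits = 0
    chars = []
    for b in packed:
        acc = (acc << 8) | b
        nbits += 8
        while nbits >= 7 and len(chars) < count:
            nbits -= 7
            chars.append(chr((acc >> nbits) & 0x7F))
        acc &= (1 << nbits) - 1
    return "".join(chars)


if __name__ == "__main__":
    line = "x" * 80
    packed = pack7(line)
    print(len(line), "characters ->", len(packed), "bytes")
```

The character count has to be stored separately, because the zero padding
in the last byte can otherwise look like an extra NUL character.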

Extended ASCII uses 8 bits (giving a potential 256 characters) to add some
interesting symbols such as £ (the A in ASCII does stand for American, after
all). Sometimes different variations of extended ASCII are used, and you end
up with accented characters appearing in odd places in text, normally
replacing punctuation marks.
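
As an illustration of how the variations clash, this small Python sketch
decodes the same bytes under different 8-bit code pages. The three code
pages (Latin-1, Windows-1252 and the old IBM PC code page 437) are just
examples picked for the demonstration, not necessarily the ones involved
in any particular garbled email.

```python
# The same byte value means different things in different "extended ASCII" sets.
pound_byte = bytes([0xA3])
as_latin1 = pound_byte.decode("latin-1")  # pound sign
as_cp437 = pound_byte.decode("cp437")     # u with acute accent

# A Windows-1252 opening curly quote, read back as code page 437.
quote_byte = "\u201c".encode("cp1252")    # single byte 0x93
misread = quote_byte.decode("cp437")      # o with circumflex

if __name__ == "__main__":
    print(repr(as_latin1), repr(as_cp437), repr(misread))
```

The second case is the "accents replacing punctuation" effect: the byte is
unchanged, only the table used to interpret it differs.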

A standard keyboard will deal in 7-bit standard ASCII and uses the ALT or
Meta keys to access the extended characters.

Basically, though, if you have an 80-byte plain text file it will contain 80
characters. A 2TB file will contain 2 million million characters, or 2e12.
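
A quick Python check of that arithmetic, as a sketch only. It assumes one
byte per character (plain ASCII) and a decimal terabyte of 10^12 bytes; a
16-bit encoding like the one Java uses for char would halve the character
count for the same file size.

```python
text = "x" * 80

ascii_size = len(text.encode("ascii"))      # one byte per character
utf16_size = len(text.encode("utf-16-be"))  # two bytes per character, no BOM

TB = 10 ** 12                  # decimal terabyte, an assumption
chars_in_2tb_ascii = 2 * TB    # one character per byte
chars_in_2tb_utf16 = 2 * TB // 2

if __name__ == "__main__":
    print(ascii_size, utf16_size)
    print(chars_in_2tb_ascii, chars_in_2tb_utf16)
```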


chris (who's beginning to regret saying yes when they asked him to be
sys-admin)
 

