Representing Characters - Deyes High School (Page 3 of 3) in pdf


GCSE Computing A451

Unit 4.3 – Representing Characters

Unicode

With extended ASCII character sets we use only 1 byte per character, so we can represent a maximum of 256 different characters (standard ASCII uses just 7 bits, giving 128). How many different characters could we represent with 2 bytes per character?

1 byte per character = 256 different characters...

00000000 to 11111111

2 bytes per character = 65,536 different characters!

0000000000000000 to 1111111111111111
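
The arithmetic above can be checked with a short Python sketch. The function names `distinct_values` and `bit_range` are made up for this illustration and are not part of the course material.

```python
def distinct_values(num_bytes):
    """Return how many different bit patterns fit in num_bytes bytes."""
    bits = num_bytes * 8          # 8 bits in every byte
    return 2 ** bits              # each extra bit doubles the number of patterns


def bit_range(num_bytes):
    """Return the lowest and highest bit patterns as strings of 0s and 1s."""
    bits = num_bytes * 8
    lowest = "0" * bits
    highest = format(2 ** bits - 1, "b")   # the largest value is all 1s
    return lowest, highest


print(distinct_values(1))   # 256
print(distinct_values(2))   # 65536
print(bit_range(1))         # ('00000000', '11111111')
```

Each extra bit doubles the number of patterns, which is why going from 8 bits to 16 bits multiplies the total by 256 rather than by 2.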

The Unicode character coding system was originally designed to use 2 bytes (16 bits) per character. This means that it could represent 65,536 different characters, which was thought to be enough to represent every language in current use in the world!

Unicode is actually a lot more complex than ASCII, but the main thing to remember is that using 16 bits or more per character means that the characters and scripts of many languages can be represented in one character set, rather than having to use lots of different extended character sets. Text stored in Unicode is still stored as a series of bytes, but in the 16-bit form it takes 2 bytes per character.
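
To see text stored as 2 bytes per character, here is a minimal sketch using Python's built-in `utf-16-be` codec. Choosing this codec is an assumption made for the illustration: it is one of several ways Unicode text can be stored, and the helper name `to_16_bit_bytes` is made up.

```python
def to_16_bit_bytes(text):
    """Encode text as 16-bit units, most significant byte first."""
    # "utf-16-be" is a standard Python codec; it adds no byte order mark
    return text.encode("utf-16-be")


word = "Hi"
one_byte_each = word.encode("ascii")    # 1 byte per character
two_bytes_each = to_16_bit_bytes(word)  # 2 bytes per character

print(len(one_byte_each))    # 2
print(len(two_bytes_each))   # 4
print(two_bytes_each.hex())  # 00480069
```

The letter H is stored as 48 (hex) in ASCII and as 00 48 in the 16-bit form, so the same code is kept and simply padded out to 2 bytes.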

Unicode can represent any language without the user having to select a code page, as was needed with extended ASCII character sets. Instead, each script is given its own range of codes (called a block), which is one portion of the total Unicode space.
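
As a rough illustration, characters from different scripts sit at different positions in the same Unicode space. The sketch below uses Python's standard `unicodedata` module; the helper name `describe` is made up for this example.

```python
import unicodedata


def describe(character):
    """Return the character's code in U+XXXX form and its official name."""
    code = "U+{:04X}".format(ord(character))   # ord() gives the character's number
    return code, unicodedata.name(character)


# Latin and Greek letters can be looked up in exactly the same way
print(describe("A"))   # ('U+0041', 'LATIN CAPITAL LETTER A')
print(describe("α"))   # ('U+03B1', 'GREEK SMALL LETTER ALPHA')
```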

A further extension of Unicode expanded the scheme to 21 bits, so offering over 1 million symbols. (2 to the power of 21 = 2,097,152 possible values, of which Unicode uses 1,114,112, numbered 0 to 1,114,111.) This is enough space for even the scripts of dead languages, such as Egyptian hieroglyphs, to be represented.
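
The figures for the extended scheme can be checked with a short sketch. The constant `HIGHEST_CODE_POINT` and the function `needs_more_than_16_bits` are names invented for this illustration; the value 13000 (hex) is the first character of the Egyptian Hieroglyphs block.

```python
HIGHEST_CODE_POINT = 0x10FFFF   # the largest number Unicode assigns to a character


def needs_more_than_16_bits(character):
    """Return True if the character's number is too big for 16 bits."""
    return ord(character) > 0xFFFF   # 0xFFFF is the largest 16-bit value (65,535)


hieroglyph = chr(0x13000)   # first character in the Egyptian Hieroglyphs block

print(2 ** 21)                              # 2097152
print(HIGHEST_CODE_POINT + 1)               # 1114112 (counting from 0)
print(needs_more_than_16_bits("A"))         # False
print(needs_more_than_16_bits(hieroglyph))  # True
```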
