
Python is the most popular language

Name: Anonymous 2015-03-09 15:50

Name: Anonymous 2015-03-10 19:18

>>29
Abstraction is abstraction.
ASCII is an abstraction of the Latin character range; it's just a really thin one that maps each character to a single byte. UTF-8 (and its 7-bit alternative, UTF-7) represents abstract Unicode code point ranges as a variable-width byte stream with complex rules.
http://en.wikipedia.org/wiki/UTF-8#Examples
Consider the encoding of the Euro sign, €.

The Unicode code point for "€" is U+20AC.
According to the scheme table in the linked article, this will take three bytes to encode, since it is between U+0800 and U+FFFF.
Hexadecimal 20AC is binary 0010 0000 1010 1100. The two leading zeros are added because, as the scheme table shows, a three-byte encoding needs exactly sixteen bits from the code point.
Because the encoding will be three bytes long, its leading byte starts with three 1s, then a 0 (1110...)
The remaining 4 bits of this byte are taken from the start of the code point (1110 0010), leaving 12 bits of the code point yet to be encoded (...0000 1010 1100).
The remaining 12 bits are cut in half, and the bits 10 are prepended to each of the 6-bit blocks to make two 8-bit bytes (so 1000 0010, then 1010 1100). The full encoding is therefore the three bytes E2 82 AC.
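
Since this is a Python thread, here is a rough sketch of the same bit-shuffling in Python. The function name encode_three_byte is made up for illustration, and it only covers the three-byte range (U+0800 to U+FFFF, minus surrogates); a real encoder handles the other lengths too.

```python
def encode_three_byte(code_point: int) -> bytes:
    """Encode one code point from the three-byte UTF-8 range by hand.

    Hypothetical helper for illustration only; in practice you would
    just call str.encode("utf-8").
    """
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("code point is outside the three-byte range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates are not valid in UTF-8")

    # Leading byte: marker 1110, then the top 4 of the 16 bits.
    lead = 0b11100000 | (code_point >> 12)
    # Continuation bytes: marker 10, then 6 bits each.
    middle = 0b10000000 | ((code_point >> 6) & 0b111111)
    last = 0b10000000 | (code_point & 0b111111)
    return bytes([lead, middle, last])


if __name__ == "__main__":
    euro = encode_three_byte(0x20AC)
    print(euro.hex())                      # e282ac
    print(euro == "€".encode("utf-8"))     # True
```

The comparison against str.encode is just a sanity check that the hand-rolled bytes match what the built-in codec produces for the Euro sign.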
