Unicode: Bigger and Better

The Unicode Consortium has released Version 4.0. For those not familiar with it, Unicode is “the fundamental specification for the representation of text, at the core of all modern software, programming languages, and standards, including Windows, Java, C#, Perl, XML, HTML, DB2, Oracle, and many others.”

How is Version 4.0 better than previous versions? Here’s what the press release says:

Version 4.0 encodes over 96,000 characters, twice as many as Version 3.0, and includes two record-breaking collections of encoded characters. The largest encoded character collection for Chinese characters in the history of computing has doubled in size yet again to encompass over 2000 years of Chinese, Japanese, Korean, and Vietnamese literary usage, including all the main classical dictionaries of these languages. Version 4.0 also encodes the largest set of characters for mathematical and technical publishing in existence.

Unicode is a remarkable achievement. I highly recommend taking a few moments to visit the Web site: www.unicode.org.

John Yunker

John is co-founder of Byte Level Research and author of Think Outside the Country as well as 19 editions of The Web Globalization Report Card.
