Going Global with JavaScript: Coming this Fall

JavaScript enables everything from simple online sign-up forms to complex web-based applications.

But there is not much information out there on how to effectively internationalize and localize JavaScript code.

Which is why I’m pleased to announce that Byte Level Books is publishing Going Global with JavaScript and Globalize.js.

The book is authored by globalization expert Jukka Korpela, who wrote my favorite book on Unicode: Unicode Explained.

Readers of this book will learn:

  • How to ensure an application is “world ready” — removing unnecessary language and culture dependencies
  • How to adapt a JavaScript app to local conventions, such as date formats, systems of measurement, time zones, and more
  • How to leverage the Common Locale Data Repository (CLDR) to support global applications
  • How to localize the user interface to address different cultural requirements and expectations
  • How to handle text input that falls well outside traditional “A-Z” characters
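
To give a feel for the second bullet, here is a minimal sketch of locale-aware formatting. It uses the `Intl` API built into modern JavaScript engines rather than Globalize.js, so it is not taken from the book; the helper names `formatDate` and `formatNumber` and the sample values are my own assumptions for illustration.

```typescript
// Hypothetical helpers for illustration; only the Intl calls are standard.

// Format a date using the conventions of the given locale.
// The time zone is pinned to UTC so the result does not depend on the machine.
function formatDate(locale: string, date: Date): string {
  return new Intl.DateTimeFormat(locale, { timeZone: "UTC" }).format(date);
}

// Format a number using the grouping and decimal separators of the given locale.
function formatNumber(locale: string, value: number): string {
  return new Intl.NumberFormat(locale).format(value);
}

// Sample values (assumed): 6 October 2012 and an arbitrary number.
const sampleDate = new Date(Date.UTC(2012, 9, 6)); // months are zero-based
const sampleNumber = 1234567.891;

const usDate = formatDate("en-US", sampleDate);     // month comes first
const deDate = formatDate("de-DE", sampleDate);     // day comes first, dot-separated
const usNumber = formatNumber("en-US", sampleNumber); // "1,234,567.891"
const deNumber = formatNumber("de-DE", sampleNumber); // "1.234.567,891"
```

The same date and number come out differently for each locale, and the locale data behind these calls is derived from CLDR in most engines. The exact strings depend on the locale data that ships with the runtime.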

I’ll have more to share on the book as we get closer to publication. If you’d like to be notified when the book is published, be sure to sign up for the Global by Design newsletter or the Byte Level Books Twitter feed.


Visualizing Unicode

Unicode is one of the great achievements of our era. It’s also incredibly intimidating.

So I love to come across videos and web sites that help demystify Unicode.

A week ago I came across a video created by jörg piringer that displays, in fast motion, nearly 50,000 Unicode characters. I’ve embedded it below:

The video lasts 33 minutes, and it still only displays about half of all Unicode characters. But even so, the video is a great tool to help people who have never heard of Unicode get a feel for how massive this encoding truly is.

But let’s say you want to see the ENTIRE Unicode set.

Fortunately, Andrew West has created a nifty web page that allows you to view all Unicode characters (fonts permitting) — and at your own leisurely pace. I highly recommend checking it out.

Here is a screen shot of one character:

Source: Michael Kaplan

The next Internet revolution will not be in English

This visual depicts about half of the currently approved internationalized domain names (IDNs), positioned over their respective regions.

Notice the wide range of scripts over India and the wide range of Arabic domains. I left off the Latin country code equivalents (in, cn, th, sa, etc.) to illustrate what the Internet is going to look like (at a very high level) in the years ahead.

This next revolution is a linguistically local revolution. In terms of local content, it is already happening. Right now, more than half of the content on the Internet is not in English. Ten years from now, the percentage of English content could easily drop below 25%.

But there are a few technical obstacles that have so far made the Internet not as user friendly as it should be for people in the regions highlighted above. They’ve been forced to enter Latin-based URLs to get to where they want to go. Their email addresses are also Latin-based. This will all change over the next two decades.

For those of us who are fluent only in Latin-based languages, this next wave of growth is going to be interesting, if not a bit challenging. In a Latin-based URL environment, you can still easily navigate to and around non-Latin web sites and brands. For example, if I want to find Baidu in China, I can enter www.baidu.cn. For Yandex in Russia, it’s yandex.ru.

But flash forward a few years and these Latin URLs (though they’ll still exist) may no longer function as the front doors into these markets.

Try Яндекс.рф. It currently redirects to Yandex.ru.

In a few years, I doubt this redirection will exist.
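
For the curious, here is a rough sketch of what a name like Яндекс.рф looks like once it reaches DNS. It assumes a runtime with the standard WHATWG `URL` class (a current browser or Node.js), which converts internationalized host names to their ASCII "xn--" form; the helper name `toAsciiHost` is hypothetical.

```typescript
// Hypothetical helper: return the ASCII (Punycode) form of a domain name
// by letting the standard URL parser do the IDNA conversion.
function toAsciiHost(domain: string): string {
  return new URL(`http://${domain}`).hostname;
}

// A non-Latin domain: each label is converted to an "xn--" ASCII label.
const cyrillicHost = toAsciiHost("Яндекс.рф");

// A Latin domain passes through unchanged.
const latinHost = toAsciiHost("yandex.ru"); // "yandex.ru"

// Host names are case-insensitive, so both spellings map to the same ASCII form.
const sameHost = toAsciiHost("яндекс.рф") === cyrillicHost; // true
```

The user sees Cyrillic while the network sees ASCII, which is why software at every layer in between has to agree on the conversion.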

We’re getting close to a linguistically local Internet — from URL to email address. There are still significant technical obstacles to overcome. It will be exciting to see which companies take the lead in overcoming them — as these companies will be well positioned to be leaders in these emerging markets.

UPDATE: I’ve expanded on this topic in a recent article on IP Watch.

Gruber gives up on his ✪ IDN

Tech pundit John Gruber threw in the towel on his domain ✪df.ws.

He writes:

What I didn’t foresee was the tremendous amount of software out there that does not properly parse non-ASCII characters in URLs, particularly IDN domain names. Twitter clients (including, seemingly, every app written using Adobe AIR, which includes some very popular Twitter clients), web browsers (including Firefox), and, for a few months, even the Twitter.com website wasn’t properly identifying DF’s short URLs as links.

Worse, some — but, oddly, not all — of AT&T’s DNS servers for 3G wireless clients choke on IDN domains. This meant that even if you were using a Twitter client that properly supports IDN domains, these links still wouldn’t work if your 3G connection was routing through one of AT&T’s buggy DNS servers.

There is still a lot of heavy lifting left to do among many software and hardware vendors before IDNs can go mainstream. Unless, of course, a country — say Russia or China — mandates their support and pushes the vendors along.
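
To illustrate the kind of parsing failure Gruber describes, here is a small sketch. It is not the code of any actual Twitter client; both patterns, the helper `findLink`, and the sample message (including the path `/e7a`) are assumptions made up for the example.

```typescript
// Naive pattern: assumes host names contain only ASCII letters, digits,
// dots, and hyphens. This is the kind of assumption that breaks on IDNs.
const asciiOnlyLink = /https?:\/\/[A-Za-z0-9.-]+[^\s]*/;

// More permissive pattern: accepts any characters in the host
// other than whitespace and slashes.
const permissiveLink = /https?:\/\/[^\s\/]+[^\s]*/;

// Hypothetical helper: return the first link found in the text, or null.
function findLink(text: string, pattern: RegExp): string | null {
  const match = pattern.exec(text);
  return match ? match[0] : null;
}

// Made-up message containing a short URL on an IDN domain.
const message = "New post: http://✪df.ws/e7a via DF";

const naiveResult = findLink(message, asciiOnlyLink);       // null: the link is missed
const permissiveResult = findLink(message, permissiveLink); // "http://✪df.ws/e7a"
```

The naive pattern gives up at the first non-ASCII character, so the URL is never turned into a link. The permissive pattern is only a sketch; a production link detector would need stricter rules about which characters a host name may contain.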

PS: I’ve updated my top-level IDN tracker.


Upcoming: Speaking at LocWorld and Unicode Conference

I’m happy to be not only attending but also speaking at Localization World and the Unicode Conference in October.

Here are the details on my sessions:

Localization World
Seattle, WA

  • October 6: International Search Summit
  • October 7: The Next Ten Years of Web Globalization
  • October 7: Making Your Website Truly Global — and No, We’re Not Talking About Language

Unicode Conference
Santa Clara, CA

  • October 19: Improving the Global Gateway: Established and Emerging Trends in Multilingual Navigation

If you’re planning to attend either event, please let me know. I’d love to meet.