Q&A with Jukka Korpela, author of Going Global with JavaScript and Globalize.js

What’s the most important thing you want JavaScript developers to learn from this book?
By making use of free tools such as Globalize.js, developers can easily adapt their applications for new markets with a minimal amount of work. For example, adapting the format of a date or number for a different country requires a single library function call.

This book also goes into more complex operations and functions, but it’s important that developers first get a feel for simple data format localization.

What is Globalize.js and why is it so valuable for developing global software?
Globalize is a standalone, open source JavaScript library that helps you globalize your JavaScript code. Globalize lets you adapt your code to work with a multitude of human languages. You need not know the languages or their conventions, and you do not need to manually code the notations.

Globalize includes locale data for more than 300 locales, including presentation of numbers, date notations, calendars, and time zones. It is easily modifiable and extensible to cover new locales.

You devote a chapter to the finer points of Unicode. Why is it so important for developers to understand Unicode?
Unicode has become widely used on web pages, in applications, and in databases, but most IT professionals still have a rather limited understanding of it. The generality of Unicode—covering more than 100,000 characters from all kinds of writing systems—has its price: complexities and practical issues. These issues are often encountered in common operations such as string comparison and case conversions.
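As a small illustration of the comparison and case conversion issues mentioned above, the following sketch uses only built-in JavaScript string methods. The sample strings are chosen for illustration and are not taken from the book.

```javascript
// Two ways to write "é": one precomposed code point, or "e" plus a combining accent.
const precomposed = '\u00E9';
const decomposed = 'e\u0301';

// They look identical on screen but are different sequences of code units.
const equalRaw = precomposed === decomposed;

// Normalizing both sides to the same form (NFC here) makes the comparison succeed.
const equalNormalized =
  precomposed.normalize('NFC') === decomposed.normalize('NFC');

// Case conversion can change the length of a string: "ß" uppercases to "SS".
const street = 'straße';
const upper = street.toUpperCase();

console.log(equalRaw, equalNormalized);          // false true
console.log(upper, street.length, upper.length); // STRASSE 6 7
```

Code that assumes a string keeps its length after case conversion, or that visually identical strings compare as equal, can fail on such input.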

You’re based in Finland. What common mistakes do you see made by developers who have localized software for your locale?
The most common mistake is partial localization: a page or application appears to be in Finnish or Swedish, but on closer examination, you’ll see English notations for data items. Even the most current software may use a date notation like 11/6/2012, which is not only incorrect by our language rules, but also ambiguous.

Often, menus contain a mix of Finnish and English items. You might also see a dropdown list of countries of the world, with names in Finnish but in an odd order, usually based on English-language alphabetization rules.
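The odd ordering of country lists can be sketched as follows. The three-country table is hypothetical sample data, and the code assumes a JavaScript engine that provides the built-in Intl.Collator; it does not use Globalize.

```javascript
// Hypothetical sample data: English and Finnish names for three countries.
const countries = [
  { en: 'France', fi: 'Ranska' },
  { en: 'Germany', fi: 'Saksa' },
  { en: 'Spain', fi: 'Espanja' }
];

// Returns the names in the given language, sorted by that language's collation.
function sortedNames(list, lang) {
  const collator = new Intl.Collator(lang);
  return list.map(function (c) { return c[lang]; }).sort(collator.compare);
}

// The mistake: Finnish names shown in the order of their English names.
const oddOrder = countries
  .slice()
  .sort(function (a, b) { return a.en.localeCompare(b.en, 'en'); })
  .map(function (c) { return c.fi; });

console.log(oddOrder.join(', '));                     // Ranska, Saksa, Espanja
console.log(sortedNames(countries, 'fi').join(', ')); // Espanja, Ranska, Saksa
```

The point of the sketch is that the sort key has to be the name the user actually sees, compared by the rules of the user's language.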

Mistranslations are not rare and may cause real harm, particularly in menus, buttons, and labels for form fields. An expert may understand the cause of the problem (someone has translated a short fragment of text with no idea of the context), but average users are simply confused and may revert to using the English-language site as the lesser of two evils.

HTML5 proposes new input types, such as date and number. But these elements pose challenges that many developers might not be aware of. Can you explain why?
Browser support is still limited, inconsistent, and partly experimental. But in addition to that, these elements have not yet been defined and implemented with globalization in mind. They may be implemented using browser-driven localization, using the browser’s locale. Adequate localization would reflect the locale of the content, that is, of the web page.

These issues can be partly addressed using code that avoids improper localization. But although the new elements are promising in the long run, for now they should be regarded as interesting features to be tested and used in controlled situations rather than in normal production.

Going Global with JavaScript and Globalize.js

NOTE: We also offer an enterprise price for a PDF copy of the book to be shared across your company.

How to localize date formats using Globalize.js

Dates are often used as case studies to illustrate the risks of ignoring cultural differences. For example, the date 4/7/2011 could be taken to mean July 4th by some and April 7th by others.
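The ambiguity is easy to reproduce in code. In this minimal sketch, the function name and the 'MDY' and 'DMY' labels are invented for illustration; the same string yields two different dates depending on which convention the reader assumes.

```javascript
// Interprets a numeric date string such as "4/7/2011" under a stated field order.
// order is 'MDY' (month first) or 'DMY' (day first); both labels are illustrative.
function parseNumericDate(text, order) {
  const parts = text.split('/').map(Number);
  const month = order === 'MDY' ? parts[0] : parts[1];
  const day = order === 'MDY' ? parts[1] : parts[0];
  const year = parts[2];
  // Date months are zero-based, so subtract 1.
  return new Date(year, month - 1, day);
}

const asMonthFirst = parseNumericDate('4/7/2011', 'MDY'); // April 7, 2011
const asDayFirst = parseNumericDate('4/7/2011', 'DMY');   // July 4, 2011

console.log(asMonthFirst.getMonth() + 1, asMonthFirst.getDate()); // 4 7
console.log(asDayFirst.getMonth() + 1, asDayFirst.getDate());     // 7 4
```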

Fortunately, the open source JavaScript library Globalize provides a relatively easy way of delivering properly formatted dates (and other culture-specific data types) to users anywhere around the world.

In this article, I will show how JavaScript’s built-in functions can be used to display dates, and point out their inherent limitations and inconsistencies. I will then turn to Globalize to avoid these issues.

The not-so-simple approach to dates

Using toString, toDateString, or toTimeString

The simplest way to write a Date value would be to use the toString method, such as today.toString(). It produces, by definition, a system-defined presentation of the date.

In practice, you get some English-language notation, such as “Sat Jul 04 2015 13:30:50 GMT+0300,” independently of the language of the page, or the browser, or anything. However, browsers may try to localize the time zone denotation in their own ways. The code document.write(new Date(2115,6,11)) gives different results based on different browsers. The following examples are from browsers on a Finnish version of Windows 7 Pro:

Internet Explorer: Thu Jul 11 00:00:00 UTC+0300 2115
Firefox: Thu Jul 11 2115 00:00:00 GMT+0300 (Suomen kesaaika)
Opera: Thu Jul 11 2115 00:00:00 GMT+0300
Chrome: Thu Jul 11 2115 00:00:00 GMT+0300 (Suomen kes�aika)
Safari: Thu Jul 11 2115 00:00:00 GMT+0300 (Suomen kesäaika)
Android: Thu Jul 11 2115 00:00:00 GMT+0300 (EST)

So three of the six browsers write the time zone using a name in the language of the underlying operating system. Only Safari gets it right; Firefox and Chrome mess up the letter “ä” in two different ways.

The best we can say about the toString method for Date is that it produces some human-readable presentation of the moment of time. The presentation is widely understood, but far from universally. It would not look good on a page otherwise in Greek, Chinese, or Thai.

Similar challenges apply to using the methods toDateString and toTimeString.

Beware of implicit Date toString conversions

In JavaScript, toString() often gets applied implicitly. For example, if the value of foo is a Date object, then any of the following statements causes a call to toString:

alert(foo);
document.write(foo);
document.getElementById('x').innerHTML = foo;
foo = foo + '';

Automatic conversions to strings are often a convenience, and many authors routinely make use of them, perhaps without ever thinking about it. Thus, it is not always obvious from the code where data gets written in a manner that should be modified when localizing software. The convenience of automatic operations in JavaScript has drawbacks, too. The implicit conversion (or coercion) means that general, non-localized toString() methods are used.
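One way to see these implicit calls is to give a single Date object its own toString method and count how often it runs. This is only a sketch for experimentation; the counter and the replacement text are made up for the demonstration.

```javascript
// Give one Date instance its own toString so we can observe when it is called.
const foo = new Date(2115, 6, 11);
let calls = 0;
foo.toString = function () {
  calls += 1;
  return 'custom text';
};

const viaConcat = foo + '';     // string concatenation
const viaTemplate = `${foo}`;   // template literal
const viaString = String(foo);  // explicit conversion uses the same method

console.log(viaConcat, viaTemplate, viaString); // custom text custom text custom text
console.log(calls);                             // 3
```

None of the three expressions mentions toString, yet each one calls it.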

The deceptive toLocaleDateString

It would be natural to expect that the toLocaleDateString method produces a localized presentation of the date, and it does. But the locale it uses is beyond your control as a developer. Little does it help to have the date localized in Swahili when it should be in Arabic.

You may also get essentially different results on different browsers even with the same system. For example, writing a date on a Finnish Windows operating system, with user interface language set to French, I get:

Internet Explorer: dimanche 1 juillet 2012
Firefox: 11. heinäkuuta 2012
Opera: 11/07/2012

Globalize to the rescue

To localize the display of Date values, you could override the built-in toString method. For that, you would need code that converts a Date value to a localized string in a format that is suitable for the target locale.

Using the Globalize library, this would be easy:

Date.prototype.toString = function() {
  return Globalize.format(this, 'F');
};

This example uses a full-length (‘F’) format, which contains all of the date information. In many contexts, it would be better to display only subsets of this information, such as a short date. To get more granular control, we need specific presentation formats, which we cover in more detail in Part II. But this code is an effective and lightweight way to protect against accidental non-localized writing of Date values.
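A slightly more defensive variant of the override is sketched below. It keeps a reference to the built-in method and falls back to it when Globalize is not available. The fallback logic and the name builtInToString are additions for illustration, and the check assumes that Globalize, when loaded, is exposed as a global object with the format function used above.

```javascript
// Keep a reference to the built-in method so it can serve as a fallback.
const builtInToString = Date.prototype.toString;

Date.prototype.toString = function () {
  // Assumption: Globalize, when loaded, is a global object that provides
  // the format(value, format) function shown in the article.
  if (typeof Globalize !== 'undefined' &&
      typeof Globalize.format === 'function') {
    return Globalize.format(this, 'F');
  }
  // Library missing: behave exactly like the original method.
  return builtInToString.call(this);
};

// Without Globalize loaded, this prints the built-in presentation.
console.log(String(new Date(2115, 6, 11)));
```

To undo the override, assign builtInToString back to Date.prototype.toString.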

Now suppose you have written, say, var today = new Date() in JavaScript. How would you display the date in a format that is understandable and unambiguous to the user? Let us first assume, for simplicity, that we know that the user is German-speaking and that a “long” or “full” format is to be used for the date.

Using the Globalize library, you could write the following:

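<!doctype html>
<title>Globalize demo</title>
<meta charset=utf-8>
<body>
<p id=date></p>
<script type="text/javascript" src="globalize.js">
</script>
<!-- German culture data; the file name below is an assumption -->
<script type="text/javascript" src="globalize.culture.de.js">
</script>
<script type="text/javascript">
var today = new Date();
document.getElementById('date').innerHTML = Globalize.format(today,'D','de');
</script>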

In the example above, the first two script elements are needed to refer to external library files. The files are assumed to reside in the same folder as the page, so that you can use just filenames as URLs. The files referred to can be downloaded via the Globalize page at GitHub. The page address still reflects the old name “jQuery Global” of the library, but the Globalize library is now totally independent of jQuery, though it reflects the same general idea: “write less, do more.” Using Globalize by no means excludes using jQuery, but neither does it require it.

The code uses the “plain” way of accessing an element with getElementById() and setting its innerHTML property. If you are used to jQuery, you would probably want to replace the assignment with a shorter construct:

$('#date').html(Globalize.format(today,'D','de'));

Depending on the actual date, the web page would display:

Mittwoch, 25. Mai 2011

And you didn’t need to know a single word of German to produce this text string.
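For comparison, here is roughly what you would have to write and maintain by hand for just this one language and one format. This is a sketch of the manual approach, not of how Globalize works internally; the name tables and the function name are invented for the example.

```javascript
// Hand-maintained German names; a library's culture data replaces tables like these.
const WEEKDAYS_DE = ['Sonntag', 'Montag', 'Dienstag', 'Mittwoch',
  'Donnerstag', 'Freitag', 'Samstag'];
const MONTHS_DE = ['Januar', 'Februar', 'März', 'April', 'Mai', 'Juni',
  'Juli', 'August', 'September', 'Oktober', 'November', 'Dezember'];

// Formats a Date as "Weekday, D. Month YYYY", matching the German output above.
function formatGermanLongDate(date) {
  return WEEKDAYS_DE[date.getDay()] + ', ' + date.getDate() + '. ' +
    MONTHS_DE[date.getMonth()] + ' ' + date.getFullYear();
}

console.log(formatGermanLongDate(new Date(2011, 4, 25))); // Mittwoch, 25. Mai 2011
```

Every additional language or format would need its own tables and pattern, which is the work the library's locale data takes off your hands.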

Article excerpted from Going Global with JavaScript and Globalize.js by Jukka Korpela.

Now Available: Going Global with JavaScript and Globalize.js

I’m happy to announce the publication of Going Global with JavaScript and Globalize.js.

If you develop websites or applications for more than one language or country, this book will help you improve your JavaScript code. And if you haven’t yet made use of Globalize.js, this book could save you many hours, as well as accelerate your globalization efforts.

Author Jukka Korpela has included plenty of hands-on code samples. Jukka also happens to be the author of my favorite book on Unicode.

Readers of this book will learn:

  • How to adapt a JavaScript app to local conventions, such as date formats, systems of measurement, time zones, and more
  • How to leverage Globalize.js and the Common Locale Data Repository (CLDR) to support global applications
  • How to handle text input that falls outside traditional “A-Z” characters

The book is available in print on Amazon and Barnes & Noble. It will be available on international Amazon sites shortly.

And we also offer a PDF version that carries an enterprise license. This is ideal for companies with large development teams.

Finally, we are also planning Kindle and iBook versions. But I don’t expect them to be available until later this month or early January.

To learn more about the book and download a PDF excerpt, click here.