Archived Posts from this Category

I adopted Unicode character U+5B57 「字」!

Posted by on 28 Feb 2022 | Tagged as: Japan, language, Unicode, web technology

The Gold Sponsor of U+5B57 「字」

One fun thing I did, late in 2021, was to donate a bit of money to the Unicode Consortium to sponsor U+5B57 「字」, my favourite of their more than 144,000 characters. It is a silly thing, but also a bit noble, and a bit useful, and a bit interesting if one peels back the cover and looks at the mechanisms to which it connects. In other words, it is the sort of thing I like to do.

Continue Reading »

‘’tain’t right’, says he: storm in apostrophe

Posted by on 14 Jun 2015 | Tagged as: culture, language, Unicode

A friend pointed me to an interesting blog post, Which Unicode character should represent the English apostrophe? (And why the Unicode committee is very wrong.) by Ted Clancy, 3 June 2015. The argument: “The Unicode committee is very clear that U+2019 (RIGHT SINGLE QUOTATION MARK) should represent the English apostrophe…. This is very, very wrong. The character you should use to represent the English apostrophe is U+02BC (MODIFIER LETTER APOSTROPHE). I’m here to tell you why why….” [Emphasis in the original.]

I understand that there might be many people on this planet who actually don’t care about English language orthography concerning the apostrophe, contractions, and Unicode plain text representations thereof. Go ahead, skip this post and go on with your day. I am completely captivated by such questions. I started writing a quick reply, which grew to the point where it seemed better to host it on my blog than on Clancy’s comments page. Continue Reading »
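For readers who want to see the characters in question side by side, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `describe` is my own invention for illustration; it simply reports each character's code point, official name, and general category.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str]:
    """Return (code point label, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


if __name__ == "__main__":
    # The ASCII apostrophe is included only for comparison.
    for ch in ("\u0027", "\u2019", "\u02BC"):
        print(*describe(ch), sep="  ")
```

U+2019 is classed as punctuation (category Pf) while U+02BC is classed as a letter (category Lm), which appears to be the distinction the quoted argument rests on.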

A Technology Globalization meetup for the Vancouver Area: (3) Where, When, and How

Posted by on 28 Feb 2015 | Tagged as: culture, i18n, language, meetings and conferences, multilingual, software engineering, Unicode, Vancouver

Our little meetup now has a name: Vancouver Globalization and Localization Users Group, or VanGLUG for short. Follow us as @VanGLUG on Twitter.  We had an outreach meeting in late January. So it’s long past time to conclude this series of thoughts about VanGLUG. Part 3 discusses “Where, When, and How”. Earlier in the series were A Technology Globalization meetup for the Vancouver Area: (1) What, Who (Oct 31, 2014), and A Technology Globalization meetup for the Vancouver Area: (2) Why, Naming (Dec 31, 2014).


One challenge of an in-person meeting is where to hold it. The usual habit for such events is to meet in downtown Vancouver. This can be inconvenient, not to mention tedious, for those of us in Surrey or Burnaby. But I expect this is how we will start.

I would, however, be delighted if there was enough interest in other parts of the Lower Mainland to start up satellite groups in other locations as well.

Could we meet virtually?  In this day and age, it should be cheap and practical to do a simple webcast of meetings. Some may want to participate remotely. An IRC channel or Twitter “second screen” may emerge. But in my experience, the networking which I suspect will be our biggest contribution will come from in-person attendance.


In an era of busy schedules, finding a time to meet is likely an overconstrained problem. Our technology industry tends to hold meetings like this on weekday evenings, sometimes over beer, and I suspect that is how we will start. But it is interesting to consider breakfast or lunch meetings.

When to get started? The arrival of Localization World 2014 in Vancouver got a dozen local localization people to attend, and provided the impetus to turn interest into concrete plans. After Localization World, we started communicating and planning. The net result was a first meeting at mid-day on Monday, December 8, 2014. Despite the holiday distraction, we were able to land a spot guest-presenting to VanDev on 6 essentials every developer should know about international. Our next opportunity to meet will likely be April 2015, perhaps March.


The Twitter feed @VanGLUG was our first communications channel. I encourage any Twitter user interested in monitoring this effort to follow @VanGLUG. We have 37 followers at the moment. We were using the Twitter handle @IMLIG1604 before, and changed that name while keeping our followers. The present @IMLIG1604 handle is a mop-up account, to point stragglers to @VanGLUG.

We created a group on LinkedIn to use as a discussion forum. This has the snappy and memorable URL If you use LinkedIn, are in the Lower Mainland or nearby, and are interested in localization and related disciplines, we welcome you joining the LinkedIn Group. We are also accepting members from out of area (for instance, Washington and Oregon) in the interests of cross-group coordination. But for location-independent localization or globalization discussion, there are more appropriate groups already on LinkedIn.

Subsequent communications channels might perhaps include a Meetup group (if we want to put up the money), an email list, an outpost on a Facebook page, and other channels as there is interest.

GALA (the Globalization and Language Association) is one of our industry organisations. It has a membership and affiliate list that includes people from the Vancouver region. I spoke with one of their staff at Localization World. They are interested in encouraging local community groups. I believe this initiative is directly in line with their interest: we can be the local GALA community here. They have included us in a list of regional Localization User Groups. We are also on IMUG’s list of “IMUG-style” groups.

Do you want to see this meetup grow? If so, I welcome your input and participation. You can tweet to @VanGLUG, post comments on this blog, or send me email at jdlh “at” Call me at +1-604-376-8953.

See you at the meetings!

A Technology Globalization meetup for the Vancouver Area: (2) Why, Naming

Posted by on 31 Dec 2014 | Tagged as: culture, i18n, language, meetings and conferences, multilingual, software engineering, Unicode, Vancouver

I am helping to start a regular face-to-face event series which will bring together the people in the Vancouver area who work in technology globalization, internationalization, localization, and translation (GILT) for networking and learning. This post is the second in a series where I put into words my percolating thoughts about this group.  See also, A Technology Globalization meetup for the Vancouver Area: (1) What, Who (Oct 31, 2014).

Happily, this group has already started. We held our first meeting on Monday, Dec 8, 2014. Our placeholder Twitter feed is @imlig1604; follow that and you’ll stay connected when we pick our final name. And we have a group on LinkedIn for sharing ideas. The link isn’t very memorable, but go to LinkedIn Groups and search for “Vancouver localization”; you will find us. (We don’t yet have an account on the service.)  If you are in the Lower Mainland and are interested, I would welcome your participation.

Continuing with my reflections about this group, here are thoughts on why this group should exist, and what it might be named.

Continue Reading »

A Technology Globalization meetup for the Vancouver Area: (1) What, Who

Posted by on 31 Oct 2014 | Tagged as: i18n, language, meetings and conferences, multilingual, software engineering, Unicode, Vancouver

The time has come, I believe, for a regular face-to-face event series which will bring together the people in the Vancouver area who work in technology globalization, internationalization, localization, and translation (GILT) for networking and learning.  The Vancouver tech community is large enough that we have a substantial GILT population. In the last few weeks, I’ve heard from several of us who are interested in making something happen. My ambition is to start this series off by mid-December 2014.

Continue Reading »

Choosing between UTF-8 and UTF-16: which has the better bytes-per-character ratio?

Posted by on 31 Dec 2010 | Tagged as: i18n, language, software engineering, Unicode

Software engineers are sometimes called on to specify which encoding a text file format should use. These days, the top contenders for encoding are UTF-8 and UTF-16, both based on the Unicode Standard. One factor (amongst several, and perhaps not the most compelling) in choosing between them is storage efficiency: the number of bytes per character, or amount of storage per unit of text. If a given text takes a kilobyte of storage in UTF-8 and twice that in UTF-16, that twofold difference may be meaningful.

I recently looked for quantitative data about space efficiency of UTF-8 and UTF-16, and couldn’t find very much. Engineering discussions about storage efficiency are better informed by quantitative data than by opinion and supposition. I want to give one morsel of quantitative data more visibility, and clarify this issue. Continue Reading »
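To make "bytes per character" concrete, here is a minimal sketch in Python. It is only an illustration: the function name `bytes_per_character` and the sample strings are my own assumptions, not the data discussed in the full post. It counts characters as Unicode code points, and uses the `utf-16-le` codec so that no byte order mark is added to the count.

```python
def bytes_per_character(text: str) -> dict[str, float]:
    """Return average encoded bytes per code point for UTF-8 and UTF-16."""
    if not text:
        raise ValueError("text must contain at least one character")
    count = len(text)  # Python strings are sequences of code points
    return {
        "utf-8": len(text.encode("utf-8")) / count,
        # "utf-16-le" encodes without a byte order mark
        "utf-16": len(text.encode("utf-16-le")) / count,
    }


if __name__ == "__main__":
    # Hypothetical samples, chosen only to show how the ratio shifts by script.
    for label, sample in (("English", "Hello, world"), ("Japanese", "文字コード")):
        print(label, bytes_per_character(sample))
```

For pure ASCII text UTF-8 needs one byte per character against two for UTF-16, while for the Japanese sample the balance tips the other way (three bytes against two). Real documents mix scripts with markup and whitespace, which is why measured data matters more than toy strings like these.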

How about an IMLIG (Internationalisation, Multilingual, Localisation Interest Group) for Vancouver?

Posted by on 27 Jun 2010 | Tagged as: i18n, language, meetings and conferences, multilingual, Unicode, Vancouver, web technology

There is a lot of international, multilingual, and multicultural activity in Vancouver. Also, there’s a thriving tech scene. But there’s no place for the people in the intersection of those two circles — those interested in and working on the internationalisation, localisation, and multilingual aspects of technology projects — to get together and share ideas. I think there ought to be.

And I’ll even propose a name: IMLIG1604, the I18n L10n M11l I6t G3p (Internationalisation, Localisation, and Multilingual Interest Group) for North America’s 604 area code. If you can decipher the title, you’re in the club!

Continue Reading »

Twanguages: a Language Census of Twitter

Posted by on 30 Jul 2009 | Tagged as: language, meetings and conferences, multilingual, Unicode, web technology

What “twanguage” do you “tweet”? Twitter, the buzzing conversation of brief web and SMS messages, exploded into wide use in 2009. But just how wide? To how many countries has it spread? And into which languages? I’m aiming to find out.

I’ve started a project named “Twanguages”, a language census of a sample of Twitter’s global traffic. I’m curious: which are the top languages? Are #hashtags localised? How does language correlate with location?  And which Unicode character is the most rarely used?

I’ll be presenting our results at the 33rd Internationalization and Unicode Conference (IUC33), held in San Jose, California, on October 14-16, 2009. I have a place cleared for a Twanguages project page, and I’ll post interim results there as they become available (right now it’s only a placeholder). Stay tuned!

Continue Reading »

Will Machine-only Translation Always Fall Short?

Posted by on 22 Jun 2009 | Tagged as: culture, i18n, language

I encountered a new blog from my i18n tribe today, Localization Best Practices. Their post, “Pidgins and Creoles” or “Why Machine-only Translation Will Always Fall Short”, caught my eye.  It is interesting, even if I don’t fully agree with them.

Jonathan writes that, at a recent conference on localisation:

…an audience member asked me about machine translation, and if it would ever completely take the place of human linguists in the industry. I answered “No,” although I did concede that machine translation is consistently making strides and does have a place in the localization community. He then mentioned that a scientific group in Europe recently had success with a robot performing a live human appendectomy. He believed that if something that delicate could be automated, what made something as “simple” as language beyond the scope of machines and artificial intelligence? I thought about his question and then simply said, “Because there are no pidgins or creoles for appendectomies.” Continue Reading »

Seeking listings for the BC Polyglot Blog Directory

Posted by on 19 Feb 2009 | Tagged as: British Columbia, language, multilingual

Do you know a blog which is by or for people in BC, and is in some language other than English?  If so, submit it for the BC Polyglot Blog Directory!

I created this directory in honour of the 2009 Northern Voice conference, which starts tomorrow at UBC. I wanted to highlight all those minority-language bloggers in BC. In a little bit of searching I already have blogs in French, Traditional Chinese, and Japanese. I fully expect to find blogs in Simplified Chinese and Punjabi as well. After all, 18% of people in BC use a language other than English at home, according to Statistics Canada and the 2006 census.

I’ve created the directory on my site, at See the Rules and Q&A there for more information. You can submit listings for the directory by leaving a comment on this post, or by sending a message using that website’s Contact form for Jim DeLaHunt. Please supply the name of the blog, the URL, the language(s) in which it publishes, where the blog is located, and what geography it addresses.

I look forward to seeing this baby grow!
