multilingual

Archived Posts from this Category

Adventures with the Solar Hijri calendar

Posted by on 31 Jan 2019 | Tagged as: culture, i18n, multilingual, time

Recently, an innocent attempt to correct an error in a birth date cited in a Wikipedia article led me to a lesson in the Solar Hijri calendar, used in Iran. It was another wonderful reminder of how interesting and subtle calendars and clocks are across cultures. Cultures can approach the task of keeping track of days and years so differently, despite all of us living on the same planet, orbiting the same star and watching the same moon. Continue Reading »

Top Posts: Why Unicode has separate codepoints for “characters with identical glyphs”

Posted by on 31 May 2018 | Tagged as: i18n, multilingual, robobait, software engineering, Unicode

I post on various forums around the net. Sometimes I am able to tap into such inspiration that I want to add that essay to my portfolio. Such was the case here. The question: Why does Unicode have separate codepoints for characters with identical glyphs? My response begins: The short answer to this question is, “Unicode encodes characters, not glyphs”. But like many questions about Unicode, a related answer is “plain text may be plain, but it’s not simple”.… Continue Reading »
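A small illustration of the “characters, not glyphs” point, offered as a sketch rather than as part of the original answer. The three capital letters below are my own choice of example: most fonts draw them with the same shape, yet Python's standard `unicodedata` module reports three distinct characters, each with its own lowercase form.

```python
import unicodedata

# Three capital letters that most fonts draw with the same glyph.
LOOKALIKES = ["\u0041", "\u0391", "\u0410"]  # Latin A, Greek Alpha, Cyrillic A


def describe(ch: str) -> tuple:
    """Return (code point, Unicode character name, lowercase form) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.lower())


for ch in LOOKALIKES:
    print(*describe(ch))
```

The lowercase forms differ (a, α, а), which is one reason the three letters cannot share a single codepoint even though their capitals look alike.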

Email addresses and domain names are NON-latin! Now what? (IUC41 tutorial)

Posted by on 28 Feb 2018 | Tagged as: i18n, meetings and conferences, multilingual, Unicode, web technology

Last fall I attended the Internationalization and Unicode Conference. That year was the 41st conference, or IUC41. In addition to a presentation (described in a blog post last October), I delivered a tutorial: Email addresses and domain names are NON-latin! Now what? I should have blogged about my slides last October, but better late than never. Here are my slides. Continue Reading »

Universal Acceptance of non-Latin email addresses and domain names: how does your framework rate? (IUC41 presentation)

Posted by on 31 Oct 2017 | Tagged as: i18n, meetings and conferences, multilingual, Unicode, web technology

One of my treats each year is to attend the Internationalization and Unicode Conference. This year was the 41st conference, or IUC41.  As I often do, I made a presentation. This year, the title was, Universal Acceptance of non-Latin email addresses and domain names: how does your framework rate? I’d like to share my slides. Continue Reading »

A Technology Globalization meetup for the Vancouver Area: (3) Where, When, and How

Posted by on 28 Feb 2015 | Tagged as: culture, i18n, language, meetings and conferences, multilingual, software engineering, Unicode, Vancouver

Our little meetup now has a name: Vancouver Globalization and Localization Users Group, or VanGLUG for short. Follow us as @VanGLUG on Twitter.  We had an outreach meeting in late January. So it’s long past time to conclude this series of thoughts about VanGLUG. Part 3 discusses “Where, When, and How”. Earlier in the series were A Technology Globalization meetup for the Vancouver Area: (1) What, Who (Oct 31, 2014), and A Technology Globalization meetup for the Vancouver Area: (2) Why, Naming (Dec 31, 2014).

Where

One challenge of an in-person meeting is where to hold it. The usual habit for such events is to meet in downtown Vancouver. This can be inconvenient, not to mention tedious, for those of us in Surrey or Burnaby. But I expect this is how we will start.

I would, however, be delighted if there was enough interest in other parts of the Lower Mainland to start up satellite groups in other locations as well.

Could we meet virtually?  In this day and age, it should be cheap and practical to do a simple webcast of meetings. Some may want to participate remotely. An IRC channel or Twitter “second screen” may emerge. But in my experience, the networking, which I suspect will be our biggest contribution, will come from in-person attendance.

When

In an era of busy schedules, finding a time to meet is likely an overconstrained problem. Our technology industry tends to hold meetings like this on weekday evenings, sometimes over beer, and I suspect that is how we will start. But it is interesting to consider breakfast or lunch meetings.

When to get started?  The arrival of Localization World 2014 in Vancouver got a dozen local localization people to attend, and provided the impetus to turn interest into concrete plans. After Localization World, we started communicating and planning. The net result was a first meeting at midday on Monday, December 8, 2014. Despite the holiday distraction, we were able to land a spot guest-presenting to VanDev on 6 essentials every developer should know about international. Our next opportunity to meet will likely be April 2015, perhaps March.

How

The Twitter feed @VanGLUG was our first communications channel. I encourage any Twitter user interested in monitoring this effort to follow @VanGLUG. We have 37 followers at the moment. We were using the Twitter handle @IMLIG1604 before, and changed that name while keeping our followers. The present @IMLIG1604 handle is a mop-up account, to point stragglers to @VanGLUG.

We created a group on LinkedIn to use as a discussion forum. This has the snappy and memorable URL https://www.linkedin.com/groups?home=&gid=6805530. If you use LinkedIn, are in the Lower Mainland or nearby, and are interested in localization and related disciplines, we welcome you joining the LinkedIn Group. We are also accepting members from out of area (for instance, Washington and Oregon) in the interests of cross-group coordination. But for location-independent localization or globalization discussion, there are more appropriate groups already on LinkedIn.

Subsequent communications channels might include a Meetup group (if we want to put up the money), an email list, an outpost on a Facebook page, and other channels as there is interest.

GALA (the Globalization and Language Association) is one of our industry organisations. It has a membership and affiliate list that includes people from the Vancouver region. I spoke with one of their staff at Localization World. They are interested in encouraging local community groups. I believe this initiative is directly in line with their interest: we can be the local GALA community here.  They have included us in a list of regional Localization User Groups. We are also on IMUG’s list of “IMUG-style” groups.

Do you want to see this meetup grow? If so, I welcome your input and participation. You can tweet to @VanGLUG, post comments on this blog, or send me email at jdlh “at” jdlh.com. Call me at +1-604-376-8953.

See you at the meetings!

A Technology Globalization meetup for the Vancouver Area: (2) Why, Naming

Posted by on 31 Dec 2014 | Tagged as: culture, i18n, language, meetings and conferences, multilingual, software engineering, Unicode, Vancouver

I am helping to start a regular face-to-face event series which will bring together the people in the Vancouver area who work in technology globalization, internationalization, localization, and translation (GILT) for networking and learning. This post is the second in a series where I put into words my percolating thoughts about this group.  See also, A Technology Globalization meetup for the Vancouver Area: (1) What, Who (Oct 31, 2014).

Happily, this group has already started. We held our first meeting on Monday, Dec 8, 2014. Our placeholder Twitter feed is @imlig1604; follow that and you’ll stay connected when we pick our final name. And we have a group on LinkedIn for sharing ideas. The link isn’t very memorable, but go to LinkedIn Groups and search for “Vancouver localization”; you will find us. (We don’t yet have an account on the Meetup.com service.)  If you are in the Lower Mainland and are interested, I would welcome your participation.

Continuing with my reflections about this group, here are thoughts on why this group should exist, and what it might be named.

Continue Reading »

A Technology Globalization meetup for the Vancouver Area: (1) What, Who

Posted by on 31 Oct 2014 | Tagged as: i18n, language, meetings and conferences, multilingual, software engineering, Unicode, Vancouver

The time has come, I believe, for a regular face-to-face event series which will bring together the people in the Vancouver area who work in technology globalization, internationalization, localization, and translation (GILT) for networking and learning.  The Vancouver tech community is large enough that we have a substantial GILT population. In the last few weeks, I’ve heard from several of us who are interested in making something happen. My ambition is to start this series off by mid-December 2014.

Continue Reading »

IUC38 tutorial, “Building Multilingual Websites with Drupal 7 and Joomla 3”

Posted by on 31 May 2014 | Tagged as: CMS, drupal, Joomla, meetings and conferences, multilingual, Unicode

I’m delighted and proud to have been invited back to give my tutorial to the 38th Internationalization and Unicode Conference (IUC38) this November in Santa Clara, California.

Title: Building multilingual websites in Drupal 7 and Joomla 3
Date: Monday, November 3, 2014, 10:30-12:00. Track 3, tutorial morning session 2.
Here’s my abstract:

A practical look at the language and locale capabilities of Joomla! 3 and Drupal 7, two leading free software content management systems (CMSs). They let you build more powerful, more international websites faster. We look at: their core internationalisation and locale services, and localisation of UI and content. Each platform just had a major release, with advances in internationalisation. You will leave with specific tips for building your own site. We don’t assume Joomla or Drupal experience, but do include material for advanced practitioners. A good tutorial for web site product managers, web designers, developers, and managers of international web teams.

Continue Reading »

A good-practice list of i18n API functionality

Posted by on 30 Nov 2013 | Tagged as: culture, i18n, meetings and conferences, multilingual, software engineering, web technology

Think of the applications programming interface (API) for an application environment: an operating system, a markup language, a language’s standard library. What internationalisation (i18n) functionality would you expect to see in such an API? There are some obvious candidates: a text string substitution-from-resources capability like gettext(). A mechanism for formatting dates, numbers, and currencies in culturally appropriate ways. Data formats for text that can handle text in a variety of languages. Some way to determine what cultural conventions and language the user prefers. There is clearly a whole list one could make.
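To make the first candidate concrete, here is a minimal sketch of gettext-style substitution using Python's standard `gettext` module. The message domain `myapp`, the catalogue directory `locale`, and the sample strings are hypothetical. No translation files are assumed to exist, so the fallback translator simply returns the source strings.

```python
import gettext


def load_translator(language: str) -> gettext.NullTranslations:
    """Load a message catalogue for one language, if one is installed."""
    # "myapp" and "locale" are hypothetical: the message domain and the
    # directory where compiled .mo catalogues would live.
    # fallback=True returns a pass-through translator when no catalogue
    # is found, so this sketch runs without any translation files.
    return gettext.translation(
        "myapp", localedir="locale", languages=[language], fallback=True
    )


translator = load_translator("fr")
_ = translator.gettext

# Simple string lookup, keyed by the source-language text.
greeting = _("Welcome")

# Plural-aware lookup: the catalogue decides which form fits the count.
count = 3
files = translator.ngettext("%d file", "%d files", count) % count

print(greeting, "/", files)
```

With a French catalogue installed under `locale/fr/LC_MESSAGES/myapp.mo`, the same calls would return the translated strings, and the calling code would not change.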

Wouldn’t it be interesting, and useful, to have such a list?  Probably many organisations have made such lists in the past. Who has made such a list? Are they willing to share it with the internationalisation and localisation community? Is there value in developing a “good practices” statement with such a list?  And, most importantly, who would like to read such a list? How would it help them? In what way would such a list add value? Continue Reading »

For OpenDataDay 2013, a language census of Vancouver’s datasets

Posted by on 28 Feb 2013 | Tagged as: culture, meetings and conferences, multilingual, Vancouver

OpenDataDay 2013 was celebrated last Saturday, February 23rd, 2013, at over 100 hackathons and work days in 38 countries around the world. The City of Vancouver hosted a hackathon at Vancouver City Hall, and I joined in. My project was a language census of Vancouver’s open data datasets. Here’s what I set out to do.

Open Data is the idea that governments (and other bodies) publish data about their activity and holdings in machine-readable form, with loose terms of use, for citizens and other parties to use, and build upon, and add value to. Open Data Day rallies citizens and governments around the world “to write applications, liberate data, create visualizations and publish analyses using open public data to show support for and encourage the adoption of open data policies by the world’s local, regional and national governments”.  I’m proud that local Vancouver open data leader David Eaves was one of the founders of Open Data Day. The UK-based Open Knowledge Foundation is part of the organisational foundation for OpenDataDay, but much of the energy is from local groups and volunteers (for example, the OKF in Japan).

Vancouver’s Open Data Day was a full house of some 80 grassroots activists, with attendance throughout the day by city staff, including Linda, the caretaker of the Vancouver Open Data portal and the voice of @VanOpenData on Twitter.  I missed the “Speed Data-ing” session in the morning, where participants could circulate among city providers of datasets to talk directly about what was available and what each side wanted. I’m told that national minister the Honourable Tony Clement was also there (who now is responsible for the Government of Canada’s Open Data portal data.gc.ca, but who also in 2010 helped turn off the spigot of open data at its source by killing the long form census). I saw Councilmember Andrea Reimer there for the afternoon working session, and listening to the day-end wrap-ups, tweeting summaries of each project. I won’t try to describe all the projects. Take a look at the Vancouver Open Data Day 2013 wiki page, or the tweets tagged #vodhd13 (for Vancouver), and tagged #OpenData (worldwide).

I gave myself two goals for the hackathon. First, provide expertise and increased visibility for internationalisation and multi-lingual issues among the participants. Second, work on a modest project which would move internationalisation of local data forward.

My vision is that apps based on Vancouver open data should be localised into all the languages in which Vancouver residents want them. Over 30% of the people in the Vancouver region speak a language other than English at home, says Stats Canada. That is over 700,000 people of the 2.9m people in the area. Now of course localising those apps and web sites is a task for the developer. My discipline, internationalisation (i18n), is a set of design and implementation techniques to make it cheaper and easier to localise an app or web site. At some point, an app or web site presents data sourced from an open data dataset. In order for the complete user experience to be localised, the dataset also needs to be localised. A challenge of enabling localisation of open data-sourced apps is to set up formats, social structures, and incentive structures which make it easier for datasets to get localised into the languages which matter to the end users.

To that end, I picked a modest project for the day. It was to make a language census of the city of Vancouver’s Open Data datasets. The link is to a project page I started on the Open Data Day wiki. I intended it to be a simple table describing the Vancouver datasets, but it ended up with a good deal of explanation in the front matter.  I won’t repeat all that, but just give a couple of examples.

The 3-1-1 Contact Centre Interactions dataset (CSV format) has rows like (I’ve simplified):

Category1     , Category2     , Category3          , Mode    , 2012-11, 2012-12, 2013-1
CSG - Licenses, Animal Control, Dead Animals Pickup, Voice In,      22,      13,     13

While the Animal Control Inventory Deceased Animals dataset (CSV format) has rows like (again, simplified):

ID,  Date      ,CatOther   , Description              ,Sex,ACO            , Bag
7126,2013-02-23,SDC        , Tan/black medium hair cat,   ,Duty driver- JT, 13-00033
7127,2013-02-23,Dead Budgie,                          ,   ,Duty driver-JT , 13-00034
7128,2013-02-26,Cat        , Black and White          ,F  ,               , 13-00035

Note that most of the fields are simply data: dates, numbers, codes. These do not need to be localised. Some of the fields, like the Category fields in the 311 Interactions, are English-language phrases. But they are pulled from a controlled vocabulary, and so could be translated once into the target language, and would not usually need to be updated when new data is released. In contrast, a few fields in the Animal Control Inventory dataset, e.g. CatOther, Description, and ACO, seem to contain free text in English. Potentially, every new record in the dataset represents a new translation task.
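I made this classification by reading the data, not by running code. Still, the three categories can be roughly guessed from the values themselves. The sketch below is a hypothetical first-pass heuristic in Python, not what I did on the day. The patterns and the `vocab_limit` threshold are assumptions, and the output would need human review. It uses the simplified Animal Control rows shown above.

```python
import csv
import io
import re

# Simplified rows from the Animal Control Inventory example above.
SAMPLE = """ID,Date,CatOther,Description,Sex,ACO,Bag
7126,2013-02-23,SDC,Tan/black medium hair cat,,Duty driver- JT,13-00033
7127,2013-02-23,Dead Budgie,,,Duty driver-JT,13-00034
7128,2013-02-26,Cat,Black and White,F,,13-00035
"""

# Assumption: values made only of digits and date/number punctuation are data.
DATA_PATTERN = re.compile(r"[\d\-/:.]+")


def looks_like_data(value):
    # Also treat very short upper-case codes (such as "F") as data.
    return bool(DATA_PATTERN.fullmatch(value)) or (len(value) <= 2 and value.isupper())


def classify_columns(csv_text, vocab_limit=20):
    """Guess 'data', 'controlled vocabulary', or 'free text' for each column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    result = {}
    for name in rows[0].keys():
        values = [row[name].strip() for row in rows if row[name].strip()]
        distinct = set(values)
        if all(looks_like_data(v) for v in values):
            result[name] = "data"
        elif len(distinct) < len(values) and len(distinct) <= vocab_limit:
            # Repeated values drawn from a small set hint at a fixed vocabulary.
            result[name] = "controlled vocabulary"
        else:
            result[name] = "free text"
    return result


for column, kind in classify_columns(SAMPLE).items():
    print(f"{column}: {kind}")
```

On three rows the repeat-based test for controlled vocabulary never fires. It only becomes useful on a full dataset, where category phrases recur many times.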

The purpose of the language census is to go through the datasets in the Vancouver Open Data catalogue, and the fields for each dataset, and simply identify which fields are data, which are controlled vocabulary, and which are free text.  It’s not a major exercise. It doesn’t involve programming. Yet I believe it’s an important building block towards the vision of localised apps driven by open data.

Incidentally, this exercise inspired me to propose another dataset for the Vancouver catalogue: a dataset listing the datasets. There are 130 datasets in the Vancouver Open Data catalogue, and more are on the way. The only listing of them is an HTML page intended for human consumption. It would be nice to have a machine-readable table in CSV or XML format, describing the names, URLs, and formats of the datasets in some structured way.
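As a rough idea of what such a “dataset of datasets” might look like in CSV form, here is a small Python sketch. The column names are my assumption. The two dataset names come from the examples above, but the URLs are placeholders, not real links.

```python
import csv
import io

# Assumption: one row per dataset, with these three columns.
FIELDS = ["name", "url", "formats"]

# Dataset names are from this post; the URLs are placeholders only.
datasets = [
    {
        "name": "3-1-1 Contact Centre Interactions",
        "url": "https://example.invalid/311-interactions",
        "formats": "CSV",
    },
    {
        "name": "Animal Control Inventory Deceased Animals",
        "url": "https://example.invalid/animal-control-deceased",
        "formats": "CSV",
    },
]


def write_catalogue(rows):
    """Serialise the dataset descriptions as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


catalogue_csv = write_catalogue(datasets)
print(catalogue_csv)
```

A table like this would let an app discover new datasets by reading one file, instead of scraping the HTML listing.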

I’m happy to report success at my first goal, also. Several participants stopped by to talk with me about language support and internationalisation. I’m hopeful that it will help the non-English localisation of the apps, and city datasets, happen a little bit sooner.

If you would like to help in the language census, the project page is a wiki, and you are welcome to make constructive edits. See you there! Or, add a comment below.
