multilingual

Archived Posts from this Category

Twanguages: a Language Census of Twitter

Posted by Jim DeLaHunt on 30 Jul 2009 | Tagged as: Unicode, language, meetings and conferences, multilingual, web technology

What “twanguage” do you “tweet”?  Twitter, the buzzing conversation of brief web and SMS messages, exploded into wide use in 2009. But just how wide?  To how many countries has it spread?  And into which languages?  I’m aiming to find out.

I’ve started a project named “Twanguages”, a language census of a sample of Twitter’s global traffic. I’m curious: which are the top languages? Are #hashtags localised? How does language correlate with location?  And which Unicode character is the most rarely used?

I’ll be presenting our results at the 33rd Internationalization and Unicode Conference (IUC33), held in San Jose, California, on October 14-16, 2009. I have a place cleared for a Twanguages project page, and I’ll post interim results there as they become available (right now it’s only a placeholder). Stay tuned!

“International and multilingual Drupal and Joomla! sites” at LinuxFest Northwest

Posted by Jim DeLaHunt on 29 Apr 2009 | Tagged as: CMS, Joomla, drupal, i18n, meetings and conferences, multilingual, web technology

Last week I gave a presentation, “International and multilingual Drupal and Joomla! sites”. I’ve posted my slides and handouts at that link for anyone who wants to catch up on them.

The occasion was LinuxFest Northwest 2009, held at Bellingham Technical College in Bellingham, WA, USA. It was a delightful event. It’s thoroughly grassroots and volunteer, it has a friendly and accessible vibe, yet it attracts very knowledgeable people.

Seeking listings for the BC Polyglot Blog Directory

Posted by Jim DeLaHunt on 19 Feb 2009 | Tagged as: British Columbia, language, multilingual

Do you know a blog which is by or for people in BC, and is in some language other than English?  If so, submit it for the BC Polyglot Blog Directory!

I created this directory in honour of the 2009 Northern Voice conference, which starts tomorrow at UBC. I wanted to highlight all those minority-language bloggers in BC.  In a little bit of searching I already have blogs in French, Traditional Chinese, and Japanese. I fully expect to find blogs in Simplified Chinese and Punjabi as well. After all, 18% of people in BC use a language other than English at home, according to Statistics Canada and the 2006 census.

I’ve created the directory on my site, at http://jdlh.com/en/pr/bc_polyglot_blogs.html. See the Rules and Q&A there for more information. You can submit listings for the directory by leaving a comment on this post, or by sending a message using that website’s Contact form for Jim DeLaHunt. Please supply the name of the blog, the URL, the language(s) in which it publishes, where the blog is located, and what geography it addresses.

I look forward to seeing this baby grow!

IIMA talk: “…successful multilingual web strategy”

Posted by Jim DeLaHunt on 31 Dec 2008 | Tagged as: Vancouver, meetings and conferences, multilingual

Right!  I was supposed to announce this three weeks ago!

I’ve posted the slides from my Dec 10 presentation, “Expand your reach with a successful multilingual web strategy”.  I gave this talk at the monthly meeting of the Vancouver chapter of the International Internet Marketing Association (IIMA).

Simple script-detection algorithm for font switching?

Posted by Jim DeLaHunt on 26 Aug 2008 | Tagged as: Unicode, i18n, language, multilingual, software engineering

Does anybody know of a simple script-detection algorithm (or heuristic) for font switching?

This came up with one of my clients. Suppose you have a guest book on your web site, and seven visitors left you the following inspiring messages:

  1. すべての人間は、生まれながらにして自由であり、かつ、尊厳と権利とについて平等である。
  2. 人人生而自由,在尊严和权利上一律平等。
  3. Semua orang dilahirkan merdeka dan mempunyai martabat dan hak-hak yang sama.
  4. 人人生而自由,在尊嚴和權利上一律平等。
  5. Alle Menschen sind frei und gleich an Würde und Rechten geboren.
  6. Όλοι οι άνθρωποι γεννιούνται ελεύθεροι και ίσοι στην αξιοπρέπεια και τα δικαιώματα.
  7. 모든 인간은 태어날 때부터 자유로우며 그 존엄과 권리에 있어 동등하다.

(It looks like your visitors all read the Universal Declaration of Human Rights courtesy of the UDHR in Unicode project).

Now suppose you are so touched that you want to lay out all seven messages in a PDF file, and print it out as a booklet.  You have a beautiful layout template, and various complementary fonts: Latin script, Japanese, Korean, Simplified Chinese, Traditional Chinese, and Greek script.

Which font do you apply to each message?  More importantly, is there a simple heuristic by which software can make the choice? (More after the jump.)
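To make the question concrete, here is a minimal sketch of one possible heuristic, not a definitive answer. It classifies a message by the Unicode ranges of its characters: kana means Japanese, Hangul means Korean, Greek letters mean Greek, and anything else falls through to Latin. For messages that contain only Han characters, it assumes that a message which encodes in GB2312 but not Big5 is Simplified Chinese, and vice versa. The function names and the returned labels are my own inventions for illustration, and the codec trick is an assumption that will misjudge short messages made only of characters shared by both encodings.

```python
def _encodable(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given legacy codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def _classify_han(han_chars: str) -> str:
    """Guess Simplified vs Traditional from legacy-encoding coverage (an assumption)."""
    simplified = _encodable(han_chars, "gb2312")
    traditional = _encodable(han_chars, "big5")
    if simplified and not traditional:
        return "simplified_chinese"
    if traditional and not simplified:
        return "traditional_chinese"
    return "chinese_ambiguous"  # shared characters only, or in neither codec


def classify_script(text: str) -> str:
    """Pick a font bucket for text using Unicode code-point ranges."""
    has_kana = has_hangul = has_greek = False
    han_chars = []
    for ch in text:
        cp = ord(ch)
        if 0x3040 <= cp <= 0x30FF:            # Hiragana and Katakana
            has_kana = True
        elif 0xAC00 <= cp <= 0xD7A3 or 0x1100 <= cp <= 0x11FF:  # Hangul
            has_hangul = True
        elif 0x0370 <= cp <= 0x03FF or 0x1F00 <= cp <= 0x1FFF:  # Greek
            has_greek = True
        elif 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF:  # Han ideographs
            han_chars.append(ch)
    # Kana wins over Han: Japanese text mixes both, Chinese text has no kana.
    if has_kana:
        return "japanese"
    if has_hangul:
        return "korean"
    if han_chars:
        return _classify_han("".join(han_chars))
    if has_greek:
        return "greek"
    return "latin"
```

Calling `classify_script()` on each guest-book message would yield one label per message, which the layout code could map to a font. The order of the checks matters: kana is tested before Han because Japanese text normally contains both.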

Continue Reading »

Jim presents to Joomla Day Vancouver this Saturday, June 14, 2008

Posted by Jim DeLaHunt on 11 Jun 2008 | Tagged as: CMS, Joomla, Vancouver, i18n, meetings and conferences, multilingual

There is a Joomla! Day in Vancouver this Saturday. I’ll be giving a brief presentation, on jdlh.com as an example of a multilingual Joomla! website, with human-friendly URLs.

“Web 2.0 goes to Babel: Multilingual websites and user-supplied content” at IUC32

Posted by Jim DeLaHunt on 31 May 2008 | Tagged as: CMS, Joomla, Unicode, i18n, meetings and conferences, multilingual

Oh right, I forgot to mention: I’ve been accepted to present to the 32nd Internationalization & Unicode Conference this September! I’m presenting on a topic which I’ve been working on lately: multilingual websites. The title is: Web 2.0 goes to Babel: Multilingual websites and user-supplied content.

Human-friendly URLs for a multilingual Joomla! site (jdlh.com)

Posted by Jim DeLaHunt on 05 Mar 2008 | Tagged as: CMS, Joomla, multilingual

I want my site, jdlh.com, to be a multilingual site that communicates the business I want to do and lets me explore the tools for being world-ready. For nearly two years, I’ve worked to get a combination of tools that would do the job. I’m happy to say that this week I finally assembled a plausible solution. The final piece was sh404SEF, after some patching, with Joomla! 1.0.x and Joom!Fish.

Language support on jdlh.com

jdlh.com supports content in multiple languages (English, Japanese, and German so far), and also a user interface in multiple languages (the same three now, but could differ). Each URL can include a language code between the domain name (“jdlh.com”) and the path to the content. The language codes look like “/en/” for English, “/de/” for German, and “/ja/” for Japanese. The codes are based on RFC 3066. Where there is a language code in a URL, the site presents content localised for that language, to the extent possible. The content may not always be available in that language, so the site may present the content in a fall-back language.
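As an illustration only, the URL scheme described above could be sketched like this in Python. This is not the code that runs jdlh.com (that is Joomla! and its extensions); the function names, the `SUPPORTED` tuple, and the choice of English as the fall-back language are assumptions made for the example.

```python
from urllib.parse import urlsplit

SUPPORTED = ("en", "ja", "de")   # the three languages mentioned in the post
FALLBACK = "en"                  # assumption: English is the fall-back language


def split_language(url: str):
    """Split a URL into (language code or None, remaining content path)."""
    path = urlsplit(url).path
    segments = path.split("/")   # "/en/pr/x.html" -> ["", "en", "pr", "x.html"]
    if len(segments) > 1 and segments[1].lower() in SUPPORTED:
        return segments[1].lower(), "/" + "/".join(segments[2:])
    return None, path or "/"


def pick_content_language(requested, available, fallback=FALLBACK):
    """Serve the requested language if a translation exists, else fall back."""
    return requested if requested in available else fallback
```

For example, `split_language("http://jdlh.com/en/pr/bc_polyglot_blogs.html")` separates the code `"en"` from the path `"/pr/bc_polyglot_blogs.html"`, and `pick_content_language("ja", {"en", "de"})` falls back to `"en"` when no Japanese version of a page exists.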

Where there is no language code in a URL, especially in the basic domain name http://jdlh.com/, the site looks at the HTTP Accept-Language header to determine which language the user prefers, and redirects the browser to content with that language code.
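Here is a rough sketch of that negotiation step, assuming a simplified reading of the Accept-Language header: a comma-separated list of language tags, each with an optional `q=` weight. The function names, the supported-language tuple, and the English default are hypothetical, chosen for illustration; the real site relies on Joomla! and its extensions, and a production parser would need to handle more of the header grammar.

```python
SUPPORTED = ("en", "ja", "de")   # languages the site offers
DEFAULT = "en"                   # assumption: used when nothing matches


def parse_accept_language(header: str):
    """Return (tag, q) pairs from an Accept-Language header, best first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0                          # a missing weight means q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0                  # treat a malformed weight as "not wanted"
        prefs.append((tag.strip().lower(), q))
    # sorted() is stable, so equal weights keep the order the browser sent.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def negotiate(header: str, supported=SUPPORTED, default=DEFAULT) -> str:
    """Pick the first supported language the user prefers, else the default."""
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        primary = tag.split("-")[0]      # "en-US" -> "en"
        if primary in supported:
            return primary
    return default


def redirect_target(header: str, path: str = "/") -> str:
    """Build the language-prefixed path the browser would be redirected to."""
    return "/" + negotiate(header) + path
```

With a header such as `fr-CA,de;q=0.7,en;q=0.3`, French is not offered, so the sketch settles on German and `redirect_target()` produces `/de/`.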

It’s important to me that the URLs of content on my site be concise, comprehensible to humans, and stable over time. I like Jakob Nielsen’s “URL as UI” column, and the W3C’s “Cool URIs don’t change”, and try to follow them.

Software used on jdlh.com

jdlh.com is built using Joomla!, a free software content management system (CMS). Version 1.0.x of Joomla!, which I use as of early 2008, can be coaxed into using UTF-8 text encoding and tolerating multilingual content. I add in Joom!Fish, a Joomla component which helps manage content in multiple parallel languages, and provides useful language utilities like the UI widget at the top of the page for selecting between languages.

Joomla has many strengths, but easy-to-read URLs aren’t among them. Left to itself, a Joomla URL is an opaque stream of numbers and codes. Turning those URLs into human-friendly URLs, which are concise, comprehensible to humans, and stable over time, is the work of a “SEF” (Search-Engine-Friendly) component. Joomla has had several, but the first which satisfied us for jdlh.com is one called sh404SEF (see also sh404SEF on Joomla extensions and sh404SEF on siliana.net).

There has been a tough interaction between Joomla, Joom!Fish, and sh404SEF (and its ill-starred predecessors). Since mid-2006, Joomla would work with either of the other two, but not both together. Even as Joomla! moved forward to version 1.5.x, which has a better foundation for multilingual sites, I was held back to Joomla 1.0.x because Joom!Fish didn’t support the new version yet. Finally, in late February 2008, I discovered version 1.3.1 “TEST PR build 255” of sh404SEF, which seemed to work well with Joom!Fish (currently 1.8.2) and Joomla (currently 1.0.15).

I made a patch to sh404SEF, one of the modules that extend the Joomla! content management system that runs this website. The patch ensures that all three of the languages supported on this website are treated equally in the URLs of this site. Without the patch, the “/en/” tag for URLs of English-language content would be missing in some cases. See my article “Default-language patch for sh404SEF published” for a description of the patch, and a link to the code.

Multilingual blogs and websites at Northern Voice 2008

Posted by Jim DeLaHunt on 28 Feb 2008 | Tagged as: Vancouver, meetings and conferences, multilingual

Last week was Northern Voice 2008 (Feb 22-23), a bloggers’ conference here in Vancouver. It was held in UBC’s beautiful Forest Sciences Centre, in and around a pleasant sunny atrium lined with gorgeous wood panelling.

I convened a session on multilingual blogs and websites. I was interested in the issues that arise when we try to do all that cool blog or website activity in a second and third language. The first language is no problem; modern tools can handle almost any single language.

A great group of about 15 people joined in. We put our discussion notes on a page on the Northern Voice wiki (http://wiki.northernvoice.ca/Multilingual+blogs), so check that out to see what we discussed.

I walked in with a categorisation of the issues that arise as a website goes multilingual. This held up well in the discussion. Maybe you’ll find it helpful too.

  • Examples: who’s doing it in multiple languages, and how well?
  • Value: who needs multiple languages and what for?
  • Structure: how to link content in one language to another?
  • Content: is content in various languages the same or different? How/why different?
  • Tools: how to make your blogging system or service or CMS handle the text and connect to your translators?
  • Translation: how to get the content from one language to the other?
  • Process: how to make all the parts move together and on time?

I’m really interested in multilingual websites as a way to structure thought about world-ready technology, and as a focus for my consulting practice. Expect to hear more about it.
