Top issues in Universal Acceptance of non-Latin email addresses and domain names (IUC45 session)

Posted by on 31 Oct 2021 | Tagged as: meetings and conferences, Unicode, Universal Acceptance

Two weeks ago was the Internationalization and Unicode Conference. This year is the 45th conference, or IUC45. I delivered a presentation: Top issues in Universal Acceptance of non-Latin email addresses and domain names. Here are my slides.

Continue Reading »

Canadian election mechanics, an immigrant engineer’s view

Posted by on 30 Sep 2021 | Tagged as: Canada, community, Democratic Reform, politics

Canada held a national election 10 days ago. I have watched and voted in US elections for 40 years — first in California, where I spent my early adulthood, and later Washington state. I have been watching elections in Canada for 15 years, since I immigrated in 2005. I first voted here in 2017, after becoming a citizen. But in this election, on 20. September 2021, I served as a poll worker for the first time. This gave me an insider’s view of how this election was run. As an engineer, I love the process and methods in use around me. I can’t resist writing down some of the differences in election mechanics, between this Canadian election, and the California and Washington election mechanics which I have experienced.

One issue. This election was about one issue: electing members to a national Parliament. There were no other races. Nothing from the province or city. By contrast, the US elections I know usually piled multiple races and initiative questions into a single election and a single ballot.

Figure: Sample Canadian national election ballot, an Elections Canada specimen with fictional candidate names (Source: Elections Canada training manual)

A small, simple ballot. The ballot was a single slip of paper, slightly larger than the palm of my hand. The only issue was the general election to the Parliament. Canada’s current electoral system, the archaic “First-past-the-post” system, meant that voters at my location voted only on candidates for one electoral district. The above sample has four names, but our ballot had five names.

Very manual ballot marking. A voter filled out the ballot with a pencil or pen. They put an “X” or check-mark or solid fill-in in one of the circles. Then the voter folded the ballot back up, and (after tearing off a stub) put the ballot in the ballot box themselves. By contrast, for Washington elections the voter must fill in a space on the ballot in a way that a scanning machine can read. (The same is true for Vancouver municipal elections.) In California, I sometimes filled in scannable marks on a paper ballot, and sometimes tapped in choices on a voting machine’s computer screen.

One elections office. All the voting in this election, nationwide, was operated by a single office, Elections Canada. A separate organisation, Elections B.C., runs provincial elections, and a city department runs Vancouver municipal elections. By contrast, in both California and Washington, election operations are delegated to county-level elections offices. These offices run elections for municipal, county, statewide, and national races. In the US, I currently vote through the services of the Whatcom County Auditor’s Office.

Very specific geographical ballot boxes. Elections Canada divided the electoral districts into small, local “poll divisions”, each with a specific voting desk and ballot box. I was the Deputy Returning Officer for Poll Division 125 of Electoral District 59034 (Vancouver Centre). This corresponded to two condo towers, one on Robson Street and one on Hamilton Street. People at those addresses voted at my desk. If they waited in line, they waited with their neighbours. People at other addresses voted elsewhere. I was located in a room in the Vancouver Public Library’s main branch. Our room had perhaps 10 voting desks and 12 poll divisions. Some desks and ballot boxes embraced two poll divisions. One curious side effect of this is that some voting desks had long lines, and some had none, depending on how many neighbours turned out to vote. There was no taking the next open voting booth of several equivalents, as in California. And of course, Washington has 100% mail-in voting, so it is much different.

Very voter-friendly rules. As right-wing politicians in various US states try to set up voting rules to exclude participation by citizens they don’t want, it was refreshing to see Elections Canada operate by voter-friendly rules. Voters could register on election day. Voters who had moved but not updated themselves on our registry could update their address on the voter rolls. And in particular…

Voter ID was not evil. In the US, requiring voters to show identification is branded as a right-wing tool for voter suppression. This works by limiting the acceptable identification to a short list which the suppressed voters are less likely to have. In contrast, Elections Canada accepted documents from a long and very flexible list as identification. And, for voters who had none of those documents, they could still vote if another voter vouched for them.

Very manual ballot counting. At the end of the voting day, we closed our doors to voters, and then spent an hour counting the votes in our ballot box by hand. As Deputy Returning Officer, I cut open my corrugated cardboard ballot box, and read each ballot myself. Another poll worker, who had other duties during the day, sat beside me and tallied the votes — and provided a check that I was not misreading. We then recounted and double-checked all ballots. We packaged ballots up into a series of envelopes, by hand, and sealed then signed each. We filled out a paper form with the Statement of the Vote for our poll division, by hand, making three carbonless copies.

Very manual results aggregation. How did the results get to Elections Canada, for aggregation into overall riding results? By the supervisor of my location calling the district office of Elections Canada, then coming to my desk, reading the numbers from my Statement of the Vote form to the district office. There was shouting to be heard over background noise. There was a frustrated repeating of misheard numbers. There was nary a web-hosted tally form in sight.

Security through simplicity, wide delegation, and many eyes. Of the 52,039 ballots cast in Vancouver Centre, 120 were cast in my polling division’s ballot box. I know exactly how many votes each candidate got. One of the three copies of the Statement of the Vote form came home with me. And, I was present in the election room all day. All the ballot boxes in the room were sealed and on public display. I have high confidence that there was no gross tampering or ballot-box stuffing at our location. (In contrast to, say, this reporter’s experience at polling places in Tatarstan during the recent Russian election.) I am confident that no voting machine misrecorded votes, because there was no voting machine. I know that voters verified what their ballot said, because the ballot is simple, and each voter controlled the marks on their own ballot. Now there are limits to my confidence. I don’t have visibility into how Elections Canada aggregated my results into the total of 52,039. I wish that I could see a preliminary report of polling division results, to check against what I wrote in my form, before the results are declared final. But overall, I could verify more of the leaf nodes of the election tree in Canada than I could in Washington or California.

A very, very long day. The flip side of simplicity is lots of manual work. The downside (one of many) to holding an election during a pandemic is that many people who would ordinarily take the poll worker job declined. Elections Canada was scrambling for poll workers. My spouse and I signed up in part because we were younger, vaccinated, and thus less at risk; we felt we had a patriotic duty to step in. But they wanted us to work the whole day. We reported at 05:30h, and weren’t released until about 22:00h. We had only one meal break, and a couple of bio breaks. It was an interesting day. It was a fulfilling day. But boy, it was a looooong day.

Texas pro-life whistleblower website

Posted by on 31 Aug 2021 | Tagged as: culture, politics, robobait, USA

Bless their heart, people in Texas have set up a pro-life whistleblower web site to try and persuade Texans to anonymously report each other for personal medical decisions about abortion.

These folks, “Texas Right to Life”, say they want to enforce the Texas Heartbeat Act, which claims to let people sue each other based on reports like this. This is the same faction which claims personal choice over a medical decision like wearing a mask or getting a shot to prevent unnecessary deaths, but then forbids choice when it comes to abortion.

The good news is, someone has set up a similarly-named, but good, web site: . Go to that web site to find out about detectable heartbeats and standard medical practice and why abortions should not be illegal. Maybe, some people looking for the snitch website will find the good website instead. Let’s hope the good website is the first result search engines return for a search like “report abortions in Texas” — and that the bad website is waaaay down in the search results.

But how do internet search engines come up with the order of search results? In part, by looking at which other web pages link to each website. My blog is small, but the links on these pages will help in their small way to push the good result up in the search results. Do you have a web site or blog? You could link to the good web site also.

Now, another thing people are doing is gumming up the bad web site with spurious reports. I won’t link to the bad site here, but it has the same URL as the good web site, except with “.com” instead of “.net”. You currently can’t connect to the bad site except from an internet address inside the USA. You can’t see the anonymous report form except from an internet address within Texas. But there (V) are (P) ways (N) to arrange to have a Texas internet address.

To fill out a report, have the following information: How do you think the law has been violated (500 chars), How did you obtain this evidence (200 chars), Clinic or Doctor this evidence relates to (20 chars), City (30 chars), State (30 chars), Zip (30 chars). You must answer, Are you currently elected to public office? with Yes or No, and check “I am not a robot”. Now, I read that many people are submitting reports with false information. I hope they are being careful. Sites with report forms like this can easily filter out clearly bogus reports (e.g. state is not Texas, or Zip does not match City, or it mentions someone famous who is not an abortionist). It is harder to filter out plausible-sounding reports. Some anti-abortionist will have to spend effort to check them out. The more effort they waste, the less this bad website helps them.

Of course, this being the internet, someone has made another website, , to have “fun” with the bad web site by automatically generating false reports and submitting them via your internet address. I found it interesting and worthwhile.

Search engines, hear my keywords, and raise up my links! Texas Heartbeat Act! Prolife Whistleblower Web site!

I learned about software engineering from that: the exception

Posted by on 31 Jul 2021 | Tagged as: robobait, software engineering

I have been writing software for a long time. But every now and then I get flummoxed when my code misbehaves. It can take me a long time to figure out what is wrong. Often, it turns out to be a combination of basic mistakes which I have made, which remind me of some basic lessons in software engineering, which I didn’t realise that I was not applying. This happened to me recently. It is a story I call, “the exception”. Maybe, if I tell you the story, you can learn from my mistakes, and I can reinforce the lesson to myself.

I had been working on a set of command-line tools for a consulting client in recent years. The purpose of the tools is to take some input files, modify them in related ways, and write out the modified files. (This whole explanation is very simplified, to protect the client’s confidentiality, and to clarify the story.) Let’s say that I started off with a tool that would merge input file A and input file B, and write output file AB. Then as an experiment, I made a tool which would modify the contents of file B during the merge with A, and write output file ABʹ. I put this experiment away in a branch for quite a while. In the meantime, I extended the tool program so that it would take a directory D in place of file B, and merge file A with every file in D. So if D contained files D1, D2, D3, …, the tool merged and wrote AD1, AD2, AD3, etc.

The tools were a set of Python packages and modules. There was a module, merge_command.py, which was a command-line tool. There was also a package, modifiers, with four modules: common.py, to do the basic work common to all tools; and merge.py, to do the specific work of the merge tool. There were also other modules for other modifications. They don’t matter for this story. Finally, like every Python package, modifiers had an __init__.py module to provide an interface that hides the internal structure of the package.

The code structure was something like this:

src/
    merge_command.py    # command-line program
    modifiers/          # package to do file modifications
        __init__.py     # package interface
        common.py       # basic work for all tools
        merge.py        # specific work of merge tool
        (other modules) # specific work of another tool

There were some further details which matter to the story. Each of the modules in modifiers defined a class to do its work: Common to do the work for all tools, Merge to do the work of the merge tool, etc. Class Merge and other classes inherited from class Common. So, merge_command.py gathered and checked command-line arguments, including paths for the input and output files, then passed them to an instance of class Merge to do the work. Merge in turn did some preparation to express the work in terms of what Common could do, then called superclass methods in Common to complete the work.

merge.py also defined some exceptions, represented here by exception InputFileNotFound. If class Merge could not find one of the files given in its path arguments, it raised this exception. (Classes like Merge never interact with the user; their caller has that responsibility.) merge_command.py caught this specific exception, printed out an appropriate error message, and continued with any further work it could do. merge_command.py also caught any other exception. Such an exception would be unexpected, and the tool had to stop: merge_command.py printed a general error message and stopped.
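The two-level handling described above can be sketched in a single file. This is only an illustration under assumptions: the Merge class here is a stand-in that checks paths against a set instead of touching the file system, and the run() method, the out parameter, and the message wording are all hypothetical, not the client's code.

```python
import sys


class InputFileNotFound(Exception):
    """Raised when an input path does not point to an existing file."""


class Merge:
    """Hypothetical stand-in for the real Merge class."""

    def __init__(self, existing_paths):
        # Paths this stand-in treats as existing; no real file access.
        self.existing_paths = set(existing_paths)

    def run(self, path):
        if path not in self.existing_paths:
            raise InputFileNotFound(path)
        return f"merged {path}"


def merge_command(paths, merger, out=sys.stdout):
    """Process each path in turn and return a process exit status."""
    for path in paths:
        try:
            print(merger.run(path), file=out)
        except InputFileNotFound as ex:
            # Expected problem: report it, then carry on with remaining work.
            print(f"input file not found: {ex}", file=out)
        except Exception as ex:
            # Unexpected problem: report it in general terms and stop.
            print(f"unexpected error: {ex!r}", file=out)
            return 1
    return 0
```

The order of the except clauses matters: Python tries them top to bottom, so the specific handler has to come before the catch-all.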

Finally, there was a test module for almost every code module. The only one that matters for this story was the test module for merge_command.py, which had tests of running the entire command-line tool. One test was test_missing_file(), which ran the tool with a path to an intentionally-missing input file, and checked that the tool printed the correct error message. I didn’t notice at the time, but this test contained my first error, which I will discuss below.

Thus, in more detail, the structure of the code was like this:

src/
    merge_command.py    # command-line program
       imports InputFileNotFound from modifiers
       imports Merge from modifiers
       creates instance of class Merge
       calls Merge instance with file paths etc.
       handles InputFileNotFound, prints specific message
       handles any other exception, prints general message
    (test module)       # tests merge_command overall
        test_missing_file()
            calls merge_command with path known to not exist
            confirms that merge_command raised exception
            checks error message against expected text
    modifiers/          # package to do file modifications
        __init__.py     # package interface
            imports Common from common
            imports Merge from merge
            imports InputFileNotFound from merge
            # more, but details don't matter for this story
        common.py       # basic work for all tools
            defines class Common
        merge.py        # specific work of merge tool
            imports Common from modifiers
            defines class Merge, inherits from Common
            if file missing, raise InputFileNotFound
            defines InputFileNotFound
        (other modules) # specific work of another tool
            # details don't matter for this story

At first, only class Merge worked with paths to directories. Other classes in package modifiers worked with paths to individual files. But then I wanted other classes to work with paths to directories also. At that point, I decided to abstract out the parts of Merge which worked with directories and iterated through the files, into a new class Directories, which inherited from Common. I changed Merge to inherit from Directories instead. merge_command.py did not need to change, because package modifiers protected it from the details of the package implementation. This is normal object-oriented design and normal refactoring.

I copied merge.py into a new module, directories.py, and began deleting the parts that were about merging rather than about directory handling. I now defined InputFileNotFound in directories.py. I added new entries to __init__.py to import class Directories and InputFileNotFound. I changed merge.py so that class Merge inherited from class Directories instead of class Common.

But as I created directories.py, I made my second mistake: I failed to delete the existing definition of InputFileNotFound in merge.py, and I failed to delete the import of InputFileNotFound from merge in __init__.py.

However, all my tests passed. Particularly, test_missing_file() passed. Thus I knew that the command still printed the same error message, when a path it was given pointed to a file which did not exist.

At this point, the code looked something like this:

src/
    merge_command.py    # command-line program
       imports InputFileNotFound from modifiers
       imports Merge from modifiers
       creates instance of class Merge
       calls Merge instance with file paths etc.
       catches InputFileNotFound, prints specific message
       catches any other exception, prints general message
    (test module)       # tests merge_command overall
        test_missing_file()
            calls merge_command with path known to not exist
            confirms that merge_command raised exception
            checks error message against expected text
    modifiers/          # package to do file modifications
        __init__.py     # package interface
            imports Common from common
            imports Directories from directories
            imports InputFileNotFound from directories
            imports Merge from merge
            imports InputFileNotFound from merge # mistake!
            # more, but details don't matter for this story
        common.py       # basic work for all tools
            defines class Common
        directories.py  # directory handling for all tools
            imports Common from modifiers
            defines class Directories, inherits from Common
            if file missing, raise InputFileNotFound
            defines InputFileNotFound
        merge.py        # specific work of merge tool
            imports Directories from modifiers
            defines class Merge, inherits from Directories
            # no use of InputFileNotFound any more
            defines InputFileNotFound # mistake!
        (other modules) # specific work of another tool
            # details don't matter for this story

The code stayed in this structure for quite a long time. I had made two mistakes, but I wasn’t aware of them. My tests passed. My client was satisfied.

But recently, my client asked for improvements to the tool. These involved bringing the old experiment, which wrote output file ABʹ, up to date with the current code. Specifically, I merged the current code of the main branch into the old code of the experimental branch. A number of modules conflicted, as I had expected. Many of the conflicts had to do with directory handling, and class Directories. The experimental code predated this new class. I worked to resolve the conflicts. This involved looking at the main branch’s code, and the experimental code, and their differences, and deciding what to take from the main branch, what to take from the experimental branch, and what to rewrite to reconcile the two. I wanted to have one set of code on the experimental branch which preserved all the current behaviour of the tool, and also had the experimental behaviour.

As I was resolving these conflicts, I noticed that __init__.py imported InputFileNotFound twice. The second import, from merge, was no longer correct. I deleted it. I then discovered the left-over definition of InputFileNotFound in merge.py, and deleted it also.

All of a sudden, the test, test_missing_file(), began failing. The output from merge_command.py was different from my reference expected text. I checked, carefully, for the edit when the problem arose. It arose when I merged the main code into the experimental. Perhaps I had made a mistake in merging the conflicts? But none of the code for directories or missing files was in conflict, because it hadn’t existed in the experimental code. I ran merge_command.py in a debugger. The InputFileNotFound was raised, and handled, (apparently) as expected.

This kind of problem is confusing, frustrating, and demoralising for a developer. There is a temptation to run the same tests the same way several times, hoping for a different result. But that is insanity.

The way out of a problem like this is similar to the way out of a frozen lake after you have fallen through the ice: crawl gradually, remaining at a low level, testing assumptions, until on solid ground.

I read the exception handling in merge_command.py, and it was simple and very clear: there was code to handle an InputFileNotFound, and I could not see how it would fail. But then I used a debugger to step through test_missing_file(), and was astonished. The Directories class raised an InputFileNotFound exception, as expected. The handler which should have handled the InputFileNotFound unexpectedly did not handle it. The general handler, to catch any other exception, handled it instead. When I examined the exception’s value, it looked just like an InputFileNotFound. But when I used the Python expression isinstance(ex, InputFileNotFound) to see if it (ex) was what it thought it was, the answer was False.

How could an InputFileNotFound exception not be an InputFileNotFound?

Eventually, the answer came to me. The name InputFileNotFound was defined twice: in directories.py, and an old left-over definition in merge.py. The Directories class used the name in directories.py to create and raise the exception. The modifiers package interface in __init__.py imported the name twice, the first time from directories, and the second time from merge. The second definition was what merge_command.py got when it imported the name from modifiers.

The InputFileNotFound handler in merge_command.py was matching against the exception defined in merge.py, rather than the correct one defined in directories.py. Even though they had the same name, they referred to different classes. The Computer Science name for this is “aliasing”. When used correctly, it is wonderful. When used unintentionally, it has effects that are really, really confusing.
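The effect is easy to reproduce in one file. The following is a minimal sketch, not the real code: it builds two stand-in module objects with types.ModuleType, each carrying its own class named InputFileNotFound, and rebinds the name twice the way the package interface did. The helper names make_module() and classify() are hypothetical.

```python
import types


def make_module(name):
    """Build a stand-in module that defines its own InputFileNotFound."""
    module = types.ModuleType(name)
    module.InputFileNotFound = type(
        "InputFileNotFound", (Exception,), {"__module__": name}
    )
    return module


directories = make_module("directories")
merge = make_module("merge")

# Stand-in for the package interface: the second binding silently wins.
InputFileNotFound = directories.InputFileNotFound
InputFileNotFound = merge.InputFileNotFound  # the left-over import


def classify(ex):
    """Report which handler catches the exception."""
    try:
        raise ex
    except InputFileNotFound:
        return "specific"
    except Exception:
        return "general"


ex = directories.InputFileNotFound("no/such/file")
print(classify(ex))                       # which handler ran
print(isinstance(ex, InputFileNotFound))  # the check from the debugger
print(type(ex).__module__, InputFileNotFound.__module__)
```

Printing the __module__ attribute of both classes is the quick diagnostic here: the names match, but the defining modules do not.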

One of my software engineering mistakes was to not delete the old definition in, and import from, merge.py when I extracted the exception to directories.py. If I had deleted the old definition, then the import in __init__.py would have caused an error, and I would have known to remove it. When I finally did this cleanup, as part of the branch merge, then merge_command.py finally got the correct definition for its InputFileNotFound name.

But that means that merge_command.py had been handling exceptions wrong the whole time. How had the tests passed? Well, that was my other software engineering mistake, the one hiding in the test. I implemented merge_command.py first, then wrote the test_missing_file() code to check that it always handled the exception the same way it did at first. I failed to check the nature of that handling for correctness. I failed to check for the InputFileNotFound handler’s “specific message”. Instead, I took the (incorrect) “general message” as the reference correct output. This is sloppy test writing.
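One way to avoid that trap is to derive the test's expectation from the requirement instead of copying whatever the tool printed on its first run. Here is a small sketch under assumptions: run_tool() is a hypothetical stand-in for running the command, the two message strings are invented, and StaleInputFileNotFound plays the part of the left-over class.

```python
class InputFileNotFound(Exception):
    pass


class StaleInputFileNotFound(Exception):
    """Stands in for the left-over definition with the same purpose."""


GENERAL = "unexpected error"  # hypothetical wording of the general message


def run_tool(path, handled_class):
    """Stand-in tool: never finds `path`; returns the message it would print."""
    try:
        raise InputFileNotFound(path)
    except handled_class as ex:
        return f"input file not found: {ex}"
    except Exception:
        return GENERAL


def check_missing_file(run):
    message = run("no/such/file")
    # Requirement-based checks: the message names the missing file
    # and is not the catch-all message.
    assert "no/such/file" in message
    assert message != GENERAL
```

With these checks, check_missing_file(lambda p: run_tool(p, InputFileNotFound)) passes, while the same check against StaleInputFileNotFound fails. A snapshot copied from the broken tool's first run would have accepted the general message.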

Thus, the failing test test_missing_file() was the sign of a change in the code under test, all right, but it was a change from broken to working!

So, I learned about software engineering from that. My three lessons are:

  1. When refactoring, be careful to clean up lingering old code in the original location.
  2. When writing tests, be careful to check that the code’s behaviour is in fact correct, before setting it into the concrete of the test fixture.
  3. When faced with a confusing, frustrating failure, letting go of beliefs about the code, and proceeding with deliberate debugging, is an effective way to diagnose the problem and get back to solid ground.

IUC45 talk: “Top issues in Universal Acceptance”

Posted by on 30 Jun 2021 | Tagged as: meetings and conferences, Unicode, Universal Acceptance

I’m delighted to be presenting, once again, to the 45th Internationalization and Unicode Conference (IUC45).  The conference is the gathering of my “tribe”, people who are as enthusiastic about language, text, and software as I am. If you like this stuff, it’s the best place in the world to be for those three days. Or, given the pandemic, the conference might be partially or completely virtual, so that webcast is the best UDP session in the world. In either case, please register and join us there.

Continue Reading »

How to update the JRE in LiClipse

Posted by on 28 May 2021 | Tagged as: robobait, software engineering, technical support

I use the LiClipse software development environment (IDE) for most of my software development. It is a distribution of the Eclipse Project IDE, with PyDev, and its own Java runtime environment (JRE), and some extras. I like it because I can, by installing more components, work with more software languages. Most of the time, one can update the various components of an Eclipse IDE using the internal “Check for Updates” function. However, my installation had reached the point where updating the components required upgrading the JRE. The documentation for the latter upgrade was inadequate. Here is what I figured out. It worked for me. I hope these instructions are helpful for others.

Continue Reading »

How to fix table contents turned to “0” in LibreOffice

Posted by on 30 Apr 2021 | Tagged as: robobait, software engineering, technical support, web technology

Well, that was a fright. I was editing a large report with the LibreOffice word processor. I had a table of results. It consisted of a placeholder table (a header and a couple of rows) with Table Styles applied to get the rows formatted right. I pasted in dozens of rows of content from a spreadsheet. I saved the file for the night. The next morning, I opened the document, and to my horror, found that text values in many of the columns had been turned to “0”. This is LibreOffice bug 131025. And here is how I recovered from the error, and got my table contents back. I hope it helps others who encounter this bug.

Continue Reading »

How to change and delete attributes in an XML file using XSLT

Posted by on 31 Mar 2021 | Tagged as: robobait, software engineering

I recently had a need to modify a large XML file by changing attributes of one kind of element — changing the value of a certain attribute, and deleting another attribute — while preserving everything else: element text, surrounding elements, etc. I did not readily find code for modifying an attribute value in my searches. Here is what worked for me. Perhaps it will be helpful for someone else.

Continue Reading »

InDesign 20周年 — 20 years of InDesign-J

Posted by on 28 Feb 2021 | Tagged as: culture, Japan

On 6. February 2021, a day-long series of web presentations in Japan celebrated the 20th anniversary of the Japanese version of Adobe InDesign. I tuned in from British Columbia, late Friday night my time. Mixed in with tips about the latest InDesign features were reminiscences by the key Adobe employees from the time about how InDesign Japanese Version came to be. It turned out to be more than some tales of just another product. It was a warm reunion, particularly on the chat threads. It was the story of a singular gathering of Japanese printing expertise at a singular time of transformation. It was for many of the speakers a capstone event of their careers.

Continue Reading »

How to use XSLT to modify XML files inside .ODT documents

Posted by on 31 Jan 2021 | Tagged as: robobait, technical support, Uncategorized

The .ODT word processor documents produced by LibreOffice and Apache OpenOffice are in fact ZIP archives consisting of XML files, other files, and directories. It is actually straightforward to crack open the archive and get at the files and directories within. To edit the XML files, or just to explore them, XSL Transformations (XSLT) are a useful tool. This is a look at how to use XSLT and the xsltproc tool on XML files extracted from .ODT documents.
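As a taste of how easy the archive is to open, here is a minimal sketch using Python's standard zipfile module rather than the xsltproc tool the post goes on to use. The function names are hypothetical; content.xml is the member where an OpenDocument text file keeps its body text.

```python
import zipfile


def list_members(odt_file):
    """Return the names of the files stored inside the .ODT archive."""
    with zipfile.ZipFile(odt_file) as archive:
        return archive.namelist()


def read_member(odt_file, member="content.xml"):
    """Return one XML member of the archive as text."""
    with zipfile.ZipFile(odt_file) as archive:
        return archive.read(member).decode("utf-8")
```

Both functions accept a path such as "report.odt" (a made-up file name) or an open binary file object. The extracted XML text can then be saved to disk and handed to an XSLT processor.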

Continue Reading »
