I recently started a filing project. It requires labels printed on slips of paper, each with a name and ID number in nice big letters. I authored my labels in the SVG graphics format. But editing SVG files for each label is impractical, so I searched for a way to treat the SVG as a template, and fill it out with a spreadsheet of data. I found a ridiculously easy way to do it in Python — with only 9 lines of clever code.

For my project I am filing music CDs. The filing order is by the sort-name of the CD’s artist, combined with a universally unique identifier number (“MBID”) which I retrieve from the MusicBrainz project. (I am simplifying this description a lot, because the point of the story is SVG as template, not the music CDs.) I designed a label which I can print on my laser printer. It has the name and the ID number, a little like this:

Label reading "Bach, P.D.Q.", and below it a long number beginning "1526".
Example label, showing a simple design.

But I want to be able to generate hundreds of labels like this. It is no trouble for me to make a CSV file, with a row for each label, and a column for sort name and another for MBID. I looked for a way to turn the SVG file with the label into a kind of template, and to generate copies of the template with each successive name and MBID filled in. In word-processor apps, this is called “mail-merge”. In web applications, it is called “templating”. I searched, using those terms and more, for a tool that could do this and yet remain simple.

And then I hit on the ridiculously simple templating tool: the Python language and its string formatting. Look at the contents of the above SVG file. I have elided a lot of detail, but the substance is there:

<?xml version="1.0" encoding="UTF-8"?>
<svg
  version="1.1" id="svg5" xmlns="http://www.w3.org/2000/svg"
  width="100mm" height="40mm">
  <g>
    <text
      xml:space="preserve" x="5" y="10"
      style="font-size:5px;font-weight:bold;font-family:Helvetica;-inkscape-font-specification:'Helvetica Bold';fill:#000000"
    ><tspan>Bach, P. D. Q.</tspan></text>
    <text
      xml:space="preserve" x="5" y="15"
      style="font-size:3px;font-family:Times;-inkscape-font-specification:'Times, Normal';fill:#000000"
      ><tspan>1526a22e-9c12-4f8e-8648-b1ba362c635d</tspan></text>
  </g>
  <rect width="96" height="36" x="2" y="2" />
</svg>

Python is my first choice for simple programs to get simple tasks done. I frequently use Python’s string .format() method. It takes a string, replaces anything within curly braces according to a simple format string syntax, and returns that filled-out string:

sortname = "Bach, P. D. Q."
mbid = "1526a22e-9c12-4f8e-8648-b1ba362c635d"
print( "Name: {name}, ID: {mbid}".format(name=sortname, mbid=mbid))
Name: Bach, P. D. Q., ID: 1526a22e-9c12-4f8e-8648-b1ba362c635d

It is easy to imagine turning this into a template-expanding program. Instead of assigning explicit values to sortname and mbid, read them from a CSV file. And notice that the SVG syntax does not use curly braces. That means I can just edit the SVG file to replace the literal name and MBID strings with Python .format() placeholders. This is what the tspan element texts become:

...<tspan>{name}</tspan>...
...<tspan>{mbid}</tspan>...

None of the rest of the SVG file changes. It remains a valid SVG file, and web browsers are happy to display it as an illustration:

Label template reading "{name}" in dark letters; and below it a text reading "{mbid}", and a text "{name}".
The same label design, converted to a template for Python .format()
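
One caveat applies to the "no curly braces" observation: it holds for this label, but an SVG file with an embedded CSS stylesheet would contain braces. The following is a minimal sketch, assuming a hypothetical template fragment with a style element (the label above has none). Doubling the literal braces keeps .format() from treating them as placeholders.

```python
# Hypothetical template fragment with an embedded stylesheet.
# Literal braces are doubled so that .format() leaves them alone.
template = (
    "<style>.big {{ font-weight: bold; }}</style>"
    '<text class="big"><tspan>{name}</tspan></text>'
)

filled = template.format(name="Bach, P. D. Q.")
print(filled)
```

In the output the doubled braces come back as single braces, so the stylesheet is intact and only {name} has been replaced.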

The Python program becomes:

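import csv
from pathlib import Path

# File names below are placeholders; substitute your own.
template = Path("label.svg").read_text(encoding="utf-8")  # read SVG file into a str
with open("labels.csv", newline="", encoding="utf-8") as f:
    for n, (sortname, mbid) in enumerate(csv.reader(f), start=1):
        label = template.format(name=sortname, mbid=mbid)
        # write label to a file with .svg extension
        Path(f"label-{n:03d}.svg").write_text(label, encoding="utf-8")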
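
To try the idea without any files on disk, here is a minimal sketch that reads CSV rows from an in-memory string. The cut-down TEMPLATE, the fill_labels() helper, and the second sample row are all assumptions made for illustration. The call to escape() is also an addition that the program above does not make: it guards against names containing characters such as "&", which would otherwise produce invalid XML.

```python
import csv
import io
from xml.sax.saxutils import escape

# A cut-down stand-in for the SVG template (not the full label file).
TEMPLATE = "<text><tspan>{name}</tspan></text><text><tspan>{mbid}</tspan></text>"

def fill_labels(template, csv_text):
    """Return one filled-out template string per CSV row."""
    labels = []
    for sortname, mbid in csv.reader(io.StringIO(csv_text)):
        # escape() turns &, < and > into XML entities so the SVG stays valid.
        labels.append(template.format(name=escape(sortname), mbid=escape(mbid)))
    return labels

# Sample rows; the second MBID is a made-up placeholder, not a real identifier.
rows = (
    '"Bach, P. D. Q.",1526a22e-9c12-4f8e-8648-b1ba362c635d\n'
    '"Simon & Garfunkel",00000000-0000-0000-0000-000000000000\n'
)
for label in fill_labels(TEMPLATE, rows):
    print(label)
```

The sort names are quoted in the CSV text because they contain commas; the csv module removes the quotes when it splits each row.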

Now in fact, I have a couple of complications. First, the label is not wide enough to fit some of the longer sortnames I will encounter. I want to save space by eliminating spaces, punctuation, and other non-letters from sortname. For clarity, I will add the full name below the MBID. Here are two examples of what the label now looks like:

Label reading "BachPDQ", and below it a long number beginning "1526", and a name "Bach, P.D.Q.".
Label reading "BachJohannSebastian", and below it a long number beginning "24f1", and a name "Bach, Johann Sebastian".
Examples of a modified label design, with an abbreviated sortname field.

But how to get that compacted name into the template? Python has an easy way to tell which characters are letters and numbers, the str.isalnum() method. I just have to go through the string, keeping the characters which are letters and numbers, dropping the rest, and reassembling the surviving characters into a new string.
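
As a minimal sketch of that filtering step on its own, here is a plain function; the name compact() is hypothetical and used only for illustration. Note that isalnum() is Unicode-aware, so accented letters survive.

```python
def compact(name):
    """Keep only the letters and digits of name, in their original order."""
    return "".join(c for c in name if c.isalnum())

print(compact("Bach, P. D. Q."))          # BachPDQ
print(compact("Bach, Johann Sebastian"))  # BachJohannSebastian
print(compact("Dvořák, Antonín"))         # DvořákAntonín
```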

I could add another column to my CSV file, the compacted name. Then I refer to that third column whenever I want the compacted name. But this means I have two copies of the name data. There is a chance for one to get corrected but not the other.

Another way is to write code to generate the compacted name at the moment it is needed. Here is a simple class, TemplateStr, which extends the string type, and has a property .letters which returns the compacted string:

class TemplateStr(str):
    @property
    def letters(self):
        """TemplateStr.letters : a str with non-letters removed."""
        return "".join([c for c in self if c.isalnum()])

So if we convert a sortname to a TemplateStr object, then sortname.letters is the compacted name.

sortname = "Bach, P. D. Q."
print( "Name: {name.letters} (i.e. {name})".format(
    name=TemplateStr(sortname)) )
Name: BachPDQ (i.e. Bach, P. D. Q.)

So we can incorporate {name.letters} into the SVG template. From a design standpoint, this has the advantage of leaving the template, and template author, in charge of where to use the compacted name.

Label template reading "{name.letters}" in dark letters; and below it a text reading "{mbid}", and a text "{name}".
The label template can call for the .letters variation

The only change to our Python program is that it defines the TemplateStr class, and it fills out the template like this:

class TemplateStr(str):
    ...  # see above for TemplateStr contents

label = template.format(
    name=TemplateStr(sortname), mbid=TemplateStr(mbid))

But dropping non-letters is not enough. We still have the problem that long names will overflow the width of the label. I want to be sure that all the information used for filing is visible on the printed label. So, another approach is to sort only on the first 5 characters of the sortname. Since that won’t be unique (there are several musicians named “Bach”), I will make a rule that we sort by MBID after considering the first 5 characters of the sortname.
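
The filing rule can be written down as a sort key. This is a minimal sketch, assuming the rows are (sortname, MBID) pairs and that a plain character-by-character comparison is acceptable; the helper name filing_key() and the two placeholder MBIDs are made up for illustration.

```python
def filing_key(row):
    """Sort key: first five characters of the sortname, then the MBID."""
    sortname, mbid = row
    return (sortname[:5], mbid)

rows = [
    ("Bacharach, Burt", "ffff-placeholder"),          # placeholder MBID
    ("Bach, Johann Sebastian", "24f1-placeholder"),   # placeholder MBID
    ("Bach, P. D. Q.", "1526a22e-9c12-4f8e-8648-b1ba362c635d"),
]
for sortname, mbid in sorted(rows, key=filing_key):
    print(sortname[:5], mbid[:4])
```

Both Bachs share the prefix "Bach," so the MBID decides their order, and "Bacha" files after them because a comma compares lower than a letter.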

We now need a way to make the first 5 characters of the sortname available to the template. The normal Python syntax for this is a slicing: sortname[:5] . Unfortunately, Python’s simple format string syntax does not include slicing. While it does include integer subscripts, e.g. sortname[5], that normally gives us only the 6th character of a string, not the first five characters.
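
A short demonstration of both behaviours with a plain str, as a sketch: the integer subscript picks out one character, and the slice notation is rejected because .format() passes ":5" to the string as a text key.

```python
sortname = "Bach, P. D. Q."

# An integer subscript is allowed, but it selects a single character.
print(repr("{name[5]}".format(name=sortname)))   # ' ' (the space after the comma)

# A slice is not part of the format syntax: ":5" is treated as a string key,
# which a plain str rejects with a TypeError.
try:
    "{name[:5]}".format(name=sortname)
except TypeError as err:
    print("TypeError:", err)
```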

But TemplateStr can help us again. What are integer subscripts, but a syntax which translates to the standard __getitem__() method? We can redefine that method to return a string sliced to the desired length. We turn the subscript (“key”) into a slicing, via the built-in function slice() . Then we pass that slicing to the __getitem__() method of our superclass, str.

class TemplateStr(str):
    @property
    def letters(self):
        """TemplateStr.letters : a str with non-letters removed."""
        return "".join([c for c in self if c.isalnum()])
    def __getitem__(self, key):
        return super().__getitem__(slice(key))

We can now use the subscript syntax to get length-limited strings in .format() templates:

sortname = "Bach, P. D. Q."
print( "Name: {name[5]} (i.e. {name})".format(
    name=TemplateStr(sortname)) )
Name: Bach, (i.e. Bach, P. D. Q.)
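
One side effect worth knowing: the redefined __getitem__() applies to ordinary Python code as well as to templates, so name[0] no longer means "the first character". The sketch below shows this, next to a hypothetical variant (the class name SliceSafeTemplateStr is an assumption, not part of the program above) which reinterprets only integer keys and passes real slices through unchanged.

```python
class TemplateStr(str):
    def __getitem__(self, key):
        return super().__getitem__(slice(key))

class SliceSafeTemplateStr(str):
    """Hypothetical variant: only integer keys are reinterpreted as lengths."""
    def __getitem__(self, key):
        if isinstance(key, int):
            key = slice(key)          # name[5] behaves like name[:5]
        return super().__getitem__(key)

name = TemplateStr("Bach, P. D. Q.")
print(repr(name[0]))     # '' because name[0] now means name[:0]

safe = SliceSafeTemplateStr("Bach, P. D. Q.")
print(safe[5])                          # Bach,
print(safe[6:8])                        # P.
print("{name[5]}".format(name=safe))    # Bach,
```

For filling out label templates the simpler class is enough, since .format() only ever passes integer subscripts here.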

We incorporate the five-character name length into the template. Also, to make it easier to see the MBID, we add a rotated text field with the first four characters of the MBID, just before the name.

Label template reading "{name[5]}" in dark letters; a short text to the left reading "{mbid[4]}"; and below it a text reading "{mbid}", and a text "{name}".
The label template can use subscript notation to get length limits.

Now the labels have length limits. Unfortunately, they no longer drop the non-alphabetic characters:

Label reading "Bach,", and below it a long number beginning "1526", and a name "Bach, P.D.Q.".
A label for “Bach,-1526” showing the effect of length limits

A shortened name which ends in punctuation is awkward. It would sure be helpful to both drop non-letters and limit the length. But if we use {name.letters[5]} in the template, we get only the sixth character of the compacted name: the “D” of “BachPDQ”. The problem is that the TemplateStr.letters property returns a plain str, whose subscripts behave in the ordinary way. If we have the property turn that string into a TemplateStr object, then the template can use name.letters[5]. TemplateStr now reads:

class TemplateStr(str):
    @property
    def letters(self):
        """TemplateStr.letters : a str with non-letters removed."""
        return self.__class__(
            "".join([c for c in self if c.isalnum()])
        )
    def __getitem__(self, key):
        return super().__getitem__(slice(key))

(Remember that a Python object has a property self.__class__ which returns the class-object of that object. When you call a class-object, you construct a new object of that class. So in the letters property above, calling self.__class__(...) works like calling TemplateStr(...). And, by avoiding the name of the class, we make it easier to rename or extend the class.)
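
To illustrate the point about extending the class, here is a minimal sketch with a hypothetical subclass, UpperTemplateStr, and a made-up .shout property. Because .letters constructs its result through self.__class__, the result is an UpperTemplateStr too, so the properties can be chained.

```python
class TemplateStr(str):
    @property
    def letters(self):
        return self.__class__("".join(c for c in self if c.isalnum()))

class UpperTemplateStr(TemplateStr):
    """Hypothetical subclass, used only to show that the type is preserved."""
    @property
    def shout(self):
        return self.__class__(self.upper())

name = UpperTemplateStr("Bach, P. D. Q.")
print(type(name.letters).__name__)   # UpperTemplateStr
print(name.letters.shout)            # BACHPDQ
```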

Once our Python program uses this latest version of TemplateStr, we can use both the .letters variation and subscript notation:

sortname = "Bach, P. D. Q."
print( "Name: {name.letters[5]} (i.e. {name})".format(
    name=TemplateStr(sortname)) )
Name: BachP (i.e. Bach, P. D. Q.)

Now the template has this:

Label template reading "{name.letters[5]}" in dark letters; a short text to the left reading "{mbid[4]}"; and below it a text reading "{mbid}", and a text "{name}".
The label template can use both the .letters variation and subscript notation
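
Putting the pieces together, here is a minimal end-to-end sketch. The TEMPLATE string is a cut-down stand-in for the real SVG file, and the CSV data is held in memory rather than read from disk; both are assumptions made to keep the example self-contained.

```python
import csv
import io

class TemplateStr(str):
    @property
    def letters(self):
        """TemplateStr.letters : a str with non-letters removed."""
        return self.__class__("".join([c for c in self if c.isalnum()]))
    def __getitem__(self, key):
        return super().__getitem__(slice(key))

# Cut-down stand-in for the SVG template; the real file has more markup.
TEMPLATE = (
    "<tspan>{mbid[4]}</tspan>"
    "<tspan>{name.letters[5]}</tspan>"
    "<tspan>{mbid}</tspan>"
    "<tspan>{name}</tspan>"
)

csv_text = '"Bach, P. D. Q.",1526a22e-9c12-4f8e-8648-b1ba362c635d\n'
labels = []
for sortname, mbid in csv.reader(io.StringIO(csv_text)):
    labels.append(
        TEMPLATE.format(name=TemplateStr(sortname), mbid=TemplateStr(mbid)))
print(labels[0])
```

The four tspan elements come out as the short MBID prefix, the five-letter compacted name, the full MBID, and the full sort name.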

We can add design touches, like putting the full-length name in bold but grey, behind the five-character name. The corresponding labels look like this, with the bold-faced name having letters only and with the correct length.

Label reading "BachP" in dark letters, followed by "DQ" in grey; a short number to the left reading "1526"; and below it a long number beginning "1526", and a name "Bach, P.D.Q.".
Label reading "BachJ" in dark letters, followed by "ohannSebastian" in grey; a short number to the left reading "24f1"; and below it a long number beginning "24f1", and a name "Bach, Johann Sebastian".
Examples of a label design, using length limits and dropping non-letters.

So, by adding one Python class definition, only 9 lines long, we have a ridiculously easy way to fill out SVG templates with standard Python, no aftermarket packages necessary. This combination of simplicity, brevity, flexibility, and power is what we software engineers call “elegant”. In the Python world we might call it “Pythonic”.