Back in May, as part of the Music Encoding 2016 conference in Montréal, we had a discussion about comparing digital scores. Just as you can compare two text files and get a concise statement of their differences, could you do the same with two music scores? We brainstormed requirements for comparing music scores at the notation level. This blog post is a record of that discussion.

Time: 20 May 2016, 11:00–12:00 (approx.)
Location: Music Encoding 2016 conference, unconference day, McGill University, Montréal, Québec, Canada
Participants (alphabetical): Christopher Antila (nCoda), Don Byrd (Indiana University), Tom Collins (interests in music psychology, music computation, music querying), Jim DeLaHunt (Keyboard Philharmonic), Andrew Horowitz (Verovio), Tom Nauman (Musicnotes).
Note-taker: Jim DeLaHunt

Note: as with any brainstorming exercise, I don’t claim our list or our discussion is definitive. This is just a record of one discussion, for others to build on.

Brainstormed user stories

We brainstormed requirements for a score comparison tool. We used the structure of user stories from agile software development: “as a <role>, I want to <task>, so that I can <benefit>”.  I list the user stories here, with some cleanup and editing. They are roughly grouped by user role.

  1. As a composer, I want to compare alternative versions of a section, so that I can choose one.
  2. As a composer, I want to compare variations of a section, so that I can decide in what order to include them in the work.
  3. As an editor of a digital score, I want to compare fragments of the score encoded in parallel, so that I can have confidence that they are accurate.
  4. As an editor of a digital score, I want to compare my draft score to another score of nominally the same work, checking notes but ignoring layout differences, to see if I have errors.
  5. As an editor, I want to compare a version of a score produced by Optical Music Recognition (OMR) to the original, in order to look for mistakes.
  6. As an editor of a particular edition of a score, I want to compare my edition to previous editions, to review decisions made in my own edition and in other editions.
  7. As a digital edition curator, I want to audit changes from my contributors, so that I can guide editorial decisions.
  8. As a publisher, I want to review changes made by an editor to a score, to evaluate that editor’s efficiency and effectiveness.
  9. As a digital publisher and format converter, I want to ensure that our outputs match the submission, so that our product is accurate.
  10. As a publisher, I want to accept submissions from individuals and ensure they aren’t copies of existing work, so I don’t get sued.
  11. As a curator of an online database where users upload content, I want to verify users’ claims that they created, rather than appropriated, that content, so that I don’t get sued.
  12. As a musician looking for works to perform, I want to search by theme in a database of scores, so that I can get a score to perform.
  13. As a musician, I want to compare my edited and annotated score to the original score, so that I can share a patch file containing just my annotations and edits.
  14. As a musician, I want to apply a patch file received from another musician to my edition of the score, so that I can benefit from the other musician’s edits and annotations.
  15. As a musician, I want to pick an alternative set of lyrics and drop them into my vocal score, so that I have a score that is better to learn and perform from. [Comment: this is more of a “patch” usage than a “diff” case.]
  16. As a musician, I want a “GitHub” for musicians, so that I can share and receive edits and annotations from other musicians.
  17. As a performer, I want to update an annotated score by replacing the underlying score with an improved version, preserving my annotations.
  18. As a performer, I want to compare two editions of the same work, in order to learn what each score has to tell me.
  19. As a performer, I want to migrate my performance annotations from one edition of a score to another.
  20. As a music analyst, I want to compare similar but different passages of music, so that I can understand similarities and differences.
  21. As a musicologist, I want to use a search function to discover instances of imitation, so that I can write papers.
  22. As a software developer making a notation format conversion tool, I want to find the smallest difference between an existing conversion and a new conversion, in order to make a minimal statement of what has changed.

Observations and discussion

In the course of the discussion, various people made observations about these user stories and requirements. Here is a heavily edited summary.

We used the terms “diff” and “patch” quite a bit. “Diff” refers to a class of text comparison tools. They compare one text file with another and generate a third text file, which is a concise statement of the differences between the two input files. This statement of differences is often called a “patch file”. “Patch” refers to a class of complementary tools. They take the first text file and the patch file, and generate a second file whose content is identical to the original second file. The combination of diff and patch is very widely used in collaborative software development. A good introduction is the blog post The Ten Minute Guide to diff and patch, by Stephen Jungels.
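To make the diff idea concrete, here is a minimal sketch in Python, using the standard library’s difflib module. The two “scores” are just lists of text lines, and the note encoding is invented for illustration; a music-aware diff would operate on notation elements instead.

    import difflib

    # Two toy "scores", each a list of text lines. The note encoding
    # here is invented; diff treats it like any other line-oriented text.
    edition_a = ["C4 quarter", "D4 quarter", "E4 half"]
    edition_b = ["C4 quarter", "D4 eighth", "D4 eighth", "E4 half"]

    # unified_diff yields a concise, patch-style statement of differences.
    for line in difflib.unified_diff(edition_a, edition_b,
                                     fromfile="edition-a", tofile="edition-b",
                                     lineterm=""):
        print(line)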

We came up with uses for a difference detector, and also for an anti-difference detector, i.e., a tool that reports what was unchanged.
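Sticking with the text analogy, the anti-difference idea can also be sketched with difflib: report the common material instead of the changes. The note lists are again invented for illustration.

    from difflib import SequenceMatcher

    a = ["C4", "D4", "E4", "F4", "G4"]
    b = ["C4", "D4", "F#4", "F4", "G4"]

    # get_matching_blocks() reports runs the two sequences share,
    # i.e. what was unchanged, rather than what differs.
    for block in SequenceMatcher(a=a, b=b).get_matching_blocks():
        if block.size:  # the final block is a zero-length sentinel
            print("unchanged:", a[block.a:block.a + block.size])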

There are facets of music notation: Logical, Analytical, Visual, Gestural, Angle Brackets (XML elements), and Characters. (This model is based on the Standard Music Description Language (SMDL), ISO/IEC DIS 10743.) In terms of these facets, comparison at the Angle Brackets (XML element) level is similar to what several existing tools do, many of which go by names like “xmldiff”. Comparison at the Characters level is similar to the diff utility for text files.
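As a rough illustration of comparison at the Angle Brackets facet, here is a naive element-by-element walk over two XML trees, in the spirit of xmldiff-style tools. The fragments use real MEI element and attribute names, but the comparison logic is an invented sketch that ignores namespaces, reordering, and everything else a serious tool would have to handle.

    import xml.etree.ElementTree as ET

    def compare_elements(e1, e2, path="/"):
        """Yield human-readable differences between two XML elements."""
        if e1.tag != e2.tag:
            yield f"{path}: tag {e1.tag!r} vs {e2.tag!r}"
        elif e1.attrib != e2.attrib:
            yield f"{path}{e1.tag}: attributes {e1.attrib} vs {e2.attrib}"
        if len(e1) != len(e2):
            yield f"{path}{e1.tag}: {len(e1)} children vs {len(e2)}"
        # Compare children pairwise; a real tool would align them smartly.
        for c1, c2 in zip(e1, e2):
            yield from compare_elements(c1, c2, path + e1.tag + "/")

    a = ET.fromstring('<measure n="1"><note pname="c" oct="4" dur="4"/></measure>')
    b = ET.fromstring('<measure n="1"><note pname="d" oct="4" dur="4"/></measure>')
    for difference in compare_elements(a, b):
        print(difference)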

There may be parameters to the comparison, e.g. ignore or consider page breaks, ignore or consider system breaks, etc.
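One way such a parameter might be implemented, sketched here as an assumption rather than a prescription: strip layout-only elements before comparing. MEI encodes page breaks as <pb/> and system breaks as <sb/>; for brevity this sketch ignores XML namespaces, which real MEI files use.

    import xml.etree.ElementTree as ET

    LAYOUT_TAGS = {"pb", "sb"}  # MEI page-break and system-break elements

    def strip_layout(root):
        """Remove layout-only elements in place, keeping musical content,
        so that a subsequent comparison ignores page and system breaks.
        Real MEI files are namespaced; a real tool would handle that."""
        for parent in root.iter():
            for child in list(parent):
                if child.tag in LAYOUT_TAGS:
                    parent.remove(child)
        return root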

A good packaging for code that compares digital scores is as a library with a command-line interface. The library should be easy to incorporate into other projects.
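A sketch of that packaging, with a hypothetical compare_scores() entry point standing in for a real comparison engine:

    import argparse

    def compare_scores(path_a, path_b, ignore_layout=False):
        """Hypothetical library entry point; would return a list of
        differences between the two score files."""
        raise NotImplementedError("comparison engine goes here")

    def main():
        parser = argparse.ArgumentParser(
            description="Compare two digital scores.")
        parser.add_argument("score_a")
        parser.add_argument("score_b")
        parser.add_argument("--ignore-layout", action="store_true",
                            help="ignore page and system breaks")
        args = parser.parse_args()
        for difference in compare_scores(args.score_a, args.score_b,
                                         ignore_layout=args.ignore_layout):
            print(difference)

    if __name__ == "__main__":
        main()

Other projects would import compare_scores() directly; the command line is just a thin wrapper over the same call.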

It seems reasonable to limit the task to comparing like formats, e.g. MEI to MEI, and MusicXML to MusicXML. Maybe there is a separate library instance for comparing each notation format.

Concept: “paving the cow paths”. That is, build tools that reflect what people actually do. This argues for iterative, agile development, with lots of user testing right from the start.

Concept: search as a related operation to comparison. That is, search and comparison both rest on the underlying operation of collation (meaning: given two entities, is the first less than, equal to, or greater than the second?). Maybe there’s a benefit to a collation module that serves both comparison engines and search engines.
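A minimal sketch of that shared primitive, with an invented note representation: one three-way collation function that a diff engine could use for equality tests and a search engine could use for ordering an index.

    def collate(note_a, note_b):
        """Return -1, 0, or 1: is note_a less than, equal to, or greater
        than note_b? Notes here are (midi_pitch, duration) tuples, an
        illustrative ordering rule, not a proposal."""
        if note_a < note_b:
            return -1
        if note_a > note_b:
            return 1
        return 0

    # The same primitive serves comparison ("are these notes equal?")
    # and search ("where does this note sort within an ordered index?").
    print(collate((60, 0.25), (62, 0.25)))  # -1: C4 sorts before D4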

Observation: historically, some editors of print music scores, e.g. in 19th-century editions, felt very empowered to make changes in scores. They might not have been as concerned as we are to ensure that two scores of the same piece were identical in all facets.

Follow-up and next steps

Tom Collins has references to existing ideas and source code, in Matlab and in JavaScript, about what kinds of comparison might be interesting.

The Lychee project (part of nCoda) will publish an initial release in August [2016]. It will be in the Python language.

Jim DeLaHunt will send these notes to MEI-L, the MEI community email list.