Hello again, everyone:

This is a complement to yesterday's question about "using MusicXML as semantic format". As I mentioned, the project I'm starting will, I intend, result in a number of public-domain music works from the symphonic/opera tradition being transcribed into artifacts which import "efficiently" into a wide range of notation applications, now and in the future. The goal is that a corpus of works, in this digital format, in the public domain, will be useful for re-engraving into new formats, for presentation on electronic screens, and for music analysis and other innovative ends.

What do I mean by "efficiently"? That a valid MusicXML notation file, when imported into a notation application, saves the author much of the work required to achieve a high-quality music score, which musicians can read and perform from well.

So, imagine I have a scanned image of a public-domain piano-vocal score of the choral movement of Beethoven's 9th symphony. I take a notation application, say Finale, Noteflight, Mozart, Sibelius, MuseScore, or some other. Starting with an empty score, I enter that work into the notation application, then format it until the resulting score is faithful to the original printed score, is beautiful, and can be read and performed well. Measure how long that takes.

Now, imagine a MusicXML file which contains that same score. Import that file into the same notation application, then tweak until the resulting score reaches the same level of matching the original, being beautiful, and being usable. Measure how long that takes. It should be a shorter time than starting from scratch. The proportion of time saved by starting from the MusicXML file, relative to the time taken starting from scratch, is what I mean by the efficiency of the MusicXML file. It depends on the details of the MusicXML file, the particular work, and the notation application.

A MusicXML file with 100% efficiency for a notation application means you simply import the file and you have a faithful, beautiful, usable score. 0% efficiency means a MusicXML file which is no better than starting from scratch. Negative efficiency is possible: a particular file could cause such a mess that it takes longer to clean it up than to throw it out and start from scratch.
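To make the definition concrete, here is a minimal sketch in Python. The function name `efficiency` and the hour figures are only illustrative assumptions, not measurements of any real application.

```python
def efficiency(scratch_hours: float, import_hours: float) -> float:
    """Fraction of the from-scratch time saved by starting from the MusicXML file.

    scratch_hours: time to enter and format the score starting from an empty score.
    import_hours:  time to import the MusicXML file and tweak it to the same quality.
    """
    if scratch_hours <= 0:
        raise ValueError("scratch_hours must be positive")
    return (scratch_hours - import_hours) / scratch_hours


# Hypothetical timings, in hours, for one work in one notation application.
print(efficiency(40.0, 0.0))   # 1.0   (100%: import and you are done)
print(efficiency(40.0, 40.0))  # 0.0   (0%: no better than starting from scratch)
print(efficiency(40.0, 50.0))  # -0.25 (negative: cleanup took longer than scratch)
```

Under this definition the value can never exceed 100%, but it has no lower bound.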

When I author a MusicXML file of a work for general re-use, I want the file to be highly efficient for a wide variety of notation applications, both now and in the future. Think of the overall efficiency of the file as the weighted average of that file's efficiency with each notation application (for each of its versions), weighted by how important that application (and version) is to my audience.
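The weighted average could be computed along these lines. This is a rough sketch: the application names, efficiencies, and weights are made-up placeholders, and `overall_efficiency` is a hypothetical helper name.

```python
from typing import Mapping, Tuple


def overall_efficiency(per_app: Mapping[str, Tuple[float, float]]) -> float:
    """Weighted average of per-application efficiencies.

    per_app maps an application-and-version label to (efficiency, weight),
    where weight expresses how important that application is to the audience.
    """
    total_weight = sum(weight for _, weight in per_app.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(eff * weight for eff, weight in per_app.values())
    return weighted_sum / total_weight


# Placeholder figures only: (efficiency, weight) per application and version.
measurements = {
    "Application A, version 1": (0.75, 3.0),
    "Application B, version 2": (0.25, 1.0),
}
print(overall_efficiency(measurements))  # 0.625
```

Choosing the weights is the subjective part; they only need to be consistent relative to one another, since the result is normalised by their sum.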

Does anyone have any idea what efficiencies are possible with the current crop of notation applications? Is 99% efficiency too much to ask? Do notation applications typically export MusicXML files which are highly efficient when imported back to that same application?

Does anyone have any idea how efficiency varies over time for a single notation application? If I have a MusicXML file which is highly efficient for version X of an application, what's the likelihood it will be as efficient for version X+1 of the same application?

Does anyone have any idea what overall efficiencies are possible for a single MusicXML file against the broad range of current notation applications? Is 99% efficiency for each of the top 5 notation applications, from a single MusicXML file, too much to ask?

And, how do I author highly efficient MusicXML files? Has anyone come up with editorial guidelines which lead an author to make highly efficient files? Which notation applications are best for exporting highly efficient files? Or is it necessary to resort to directly tweaking the MusicXML file to make it highly efficient?

I realise it's easy to author MusicXML files of uncontrolled fidelity and low efficiency. What I'm after is any experience and know-how of what it takes to do better than that. I think this is perhaps a different way to look at some of the issues from the Library of Congress data challenge thread (2014-03-02 14:02).

Thanks in advance for your insight.
