Simple quality measurement tool #122
This compares the notes in two MusicXML files and calculates a simple score, correlated with recognition quality. Python and music21 are required.
Thanks Alexander. We are more and more aware of the need for such regression management, but have failed to provide it for lack of a high-level measurement tool.
Thanks :)
That's true. As a hobbyist programmer, I'm doing all this work in my spare time.
This regression test looks good to me. Since this project is in Java, it would be more natural to use JUnit. Happy to do it if you are interested.
Of course this would be interesting for the project.
As provided in PR Audiveris#122 by Alexander Myltsev (@avm).
Hello everybody! Thanks @avm for this nice contribution! I think tracking recognition quality over time/commits would indeed be very valuable 😃 I had a few spare hours and felt like I could contribute to this awesome project by providing the equivalent implementation in Java/JUnit. @guillaumerose I hope you hadn't started yet; at least I could not find a branch or commits on your Audiveris fork. For clarity, I created a separate PR: #563. Cheers
In order to make Audiveris better, we need some way to measure how good it already is. This is an attempt to actually measure the recognition quality with the simplest possible metric: number of notes and rests correctly placed in the recognition result.
This requires Python 3 with the music21 library installed. One sample test case is included (actually taken from issue #78), along with its "ideal" recognition result.
The proposed workflow is this: test cases are placed in test/cases/ (source.png or source.pdf, and target.xml); the score is then computed by running ./diffscore.py -c cases from the test directory.
The quality score calculation is quite simplistic right now: it only checks pitch and duration for the notes and rests. Later on, within the same framework and using existing test cases, we can start taking more into account (keys, time signatures, measure durations, dynamics, etc.).
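To make the pitch-and-duration comparison concrete, here is a minimal sketch of how such a score could be computed. It is not the actual diffscore.py: it assumes the notes and rests have already been extracted from both MusicXML files (for instance with music21) into plain (pitch, duration) tuples, and the function name `score` and the use of `difflib.SequenceMatcher` for alignment are choices made for this illustration only.

```python
from difflib import SequenceMatcher


def score(expected, actual):
    """Return (matched, total) for two sequences of (pitch, duration) events.

    Hypothetical helper, not part of diffscore.py. An event counts as
    matched when it is part of a common aligned run in both sequences.
    """
    # autojunk=False keeps long scores from having frequent events ignored.
    matcher = SequenceMatcher(a=expected, b=actual, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched, len(expected)


# Each event is (pitch, duration in quarter notes); a pitch of None is a rest.
target = [("C4", 1.0), ("D4", 1.0), (None, 1.0), ("E4", 2.0)]
result = [("C4", 1.0), ("D4", 0.5), (None, 1.0), ("E4", 2.0)]

matched, total = score(target, result)
print(f"{matched}/{total} = {matched / total:.0%}")  # prints "3/4 = 75%"
```

Aligning the two sequences, rather than comparing them position by position, means that a single missed or extra note lowers the score by one event instead of invalidating everything after it.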