The reply by navjotjsingh is the most appropriate.
OCR works best on 1-bit colour (black & white) images. Similarly, the mp2/wav file should contain only one instrument, not several instruments playing at once as in an orchestral piece.
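To illustrate why a single instrument is so much easier, here is a rough sketch of my own (not what any particular converter actually does). It assumes the audio has already been decoded into a plain list of samples, and it uses a synthetic 440 Hz sine as a stand-in for a real recording. The function names and the 80-1000 Hz search range are just assumptions for the example. With only one note sounding, plain autocorrelation finds the pitch, which then maps to a MIDI note number. With several instruments mixed together, this simple approach breaks down.

```python
import math

def estimate_pitch_hz(samples, sample_rate, min_hz=80.0, max_hz=1000.0):
    """Estimate the fundamental of a monophonic signal by autocorrelation."""
    min_lag = int(sample_rate / max_hz)   # shortest period considered
    max_lag = int(sample_rate / min_hz)   # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

def hz_to_midi_note(freq_hz):
    """Map a frequency to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))

# Synthetic single-instrument "recording": a pure 440 Hz tone (assumption).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440.0 * n / sample_rate) for n in range(1024)]
note = hz_to_midi_note(estimate_pitch_hz(tone, sample_rate))
print(note)
```

A real file would need decoding first, plus note onset and duration detection, so treat this only as a picture of the easy single-instrument case.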
The best method: a human converter. Get a musician to play the song on a keyboard and record it as a MIDI file.