Copying, distributing, and printing of content with the author's permission only!
All rights reserved®. Ben-Gurion University of the Negev.
Abstract.
This project is concerned with the development of an algorithm for processing
data obtained by scanning musical scores. Score sheets, scanned by an optical
scanner, are stored in computer memory as an image. The image data is processed
to distinguish between the various kinds of musical symbols (such as notes of
various durations, sharps, flats, etc.), recognize them, and present them in
MIDI form. This enables the processed score to be played back, giving the user
a feeling for the musical composition. MIDI (Musical Instrument Digital
Interface) enables synthesizers, home computers, and other electronic musical
instruments to be interconnected through a standard interface for arranging or
editing a musical composition. In this way the user receives full assistance in
working through a musical composition and arrangement independently.
Data processing will be carried out on an IBM PC or compatible computer, using
the C++ programming language in the Windows environment. A set of methods for
developing the data-processing algorithm for the scanned picture will be
presented, and the advantages and drawbacks of each will be analysed. Finally,
an algorithm will be developed using the best method, chosen according to its
ability to process the data with the fewest errors. The algorithm will contain
a classification system that removes noise (parts of the image that do not
belong to the musical notation, etc.) introduced during printing or scanning.
The project will be developed in the Windows environment, in the form most
convenient for the user. The program is mainly intended for music lovers who
possess a basic knowledge of musical grammar and can play, but who encounter
difficulties in reading notes in parallel. The program is intended for use in
music schools and at home. It requires a PC, a scanner, and a sound board or an
electronic musical instrument with a MIDI interface.
Steps of the work.
The first step was "rectification" of the image: it was necessary to ensure
that the stave lines were strictly horizontal. It is usually hard to scan a page
so that the scanned picture is oriented exactly horizontally. The lines may be
slightly tilted in a way that is hard to detect with the naked eye. However, a
difference of only +/- 3 pixels between the vertical coordinates of the left and
right ends of a line can lead to serious problems in subsequent processing.
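
The rectification step can be illustrated with a minimal sketch. It assumes the
image is a binary pixel grid and that the rows of the left and right ends of one
stave line have already been measured. The names Image, skewAngle and
deskewByShear are hypothetical, and the vertical shear used here is only one
simple way to correct a small tilt, not necessarily the method used in the
project.

```cpp
#include <cmath>
#include <vector>

// Binary image: img[y][x] is 1 for ink and 0 for background (assumed layout).
using Image = std::vector<std::vector<int>>;

// Skew angle in radians of a stave line whose left end lies on row yLeft
// and whose right end lies on row yRight, measured across `width` pixels.
double skewAngle(int yLeft, int yRight, int width) {
    return std::atan2(static_cast<double>(yRight - yLeft),
                      static_cast<double>(width));
}

// Correct a small skew with a vertical shear: every column is shifted by the
// rounded vertical offset the tilted line has accumulated at that column.
Image deskewByShear(const Image& src, int yLeft, int yRight) {
    const int h = static_cast<int>(src.size());
    const int w = h > 0 ? static_cast<int>(src[0].size()) : 0;
    if (w < 2) return src;  // nothing to correct
    Image dst(h, std::vector<int>(w, 0));
    const double slope = static_cast<double>(yRight - yLeft) / (w - 1);
    for (int x = 0; x < w; ++x) {
        const int shift = static_cast<int>(std::lround(slope * x));
        for (int y = 0; y < h; ++y) {
            const int srcY = y + shift;
            if (srcY >= 0 && srcY < h) dst[y][x] = src[srcY][x];
        }
    }
    return dst;
}
```

A shear keeps every column intact, which is a reasonable approximation only for
tilts of a few pixels; larger angles would call for a true rotation.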
The next step was locating the stave lines and removing them without affecting
any of the other details, which may even be geometrically connected to the
lines. The stave lines serve to identify the pitch of a sound (C, D, E, and so
on) as well as the octave in which it lies. Because our task is to recognize
each note separately, and the overwhelming majority of notes are connected to
the stave lines, it is necessary to erase those lines (while, naturally, storing
their coordinates).
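
One common way to locate and erase stave lines can be sketched as follows. It
assumes an already rectified binary image, finds the lines with a horizontal
projection, and erases a line pixel only where nothing touches the line from
above or below. The function names and the 0.8 fill threshold are assumptions
made for illustration, not details taken from the project.

```cpp
#include <utility>
#include <vector>

// Binary image: img[y][x] is 1 for ink and 0 for background (assumed layout).
using Image = std::vector<std::vector<int>>;

// Horizontal projection: rows whose ink count reaches `fraction` of the row
// width are treated as stave rows. Consecutive stave rows are merged into one
// line and returned as (top, bottom) row pairs, so the coordinates are kept.
std::vector<std::pair<int, int>> findStaveLines(const Image& img,
                                                double fraction = 0.8) {
    std::vector<std::pair<int, int>> lines;
    const int h = static_cast<int>(img.size());
    int start = -1;
    for (int y = 0; y <= h; ++y) {
        bool isLineRow = false;
        if (y < h && !img[y].empty()) {
            int ink = 0;
            for (int v : img[y]) ink += (v != 0);
            isLineRow = ink >= fraction * img[y].size();
        }
        if (isLineRow && start < 0) start = y;
        if (!isLineRow && start >= 0) {
            lines.emplace_back(start, y - 1);
            start = -1;
        }
    }
    return lines;
}

// Erase each line column by column, but keep the column wherever ink sits
// directly above or below the line, because a symbol crosses the line there.
void removeStaveLines(Image& img,
                      const std::vector<std::pair<int, int>>& lines) {
    const int h = static_cast<int>(img.size());
    for (const auto& line : lines) {
        const int top = line.first;
        const int bottom = line.second;
        const int w = static_cast<int>(img[top].size());
        for (int x = 0; x < w; ++x) {
            const bool inkAbove = top > 0 && img[top - 1][x] != 0;
            const bool inkBelow = bottom + 1 < h && img[bottom + 1][x] != 0;
            if (inkAbove || inkBelow) continue;
            for (int y = top; y <= bottom; ++y) img[y][x] = 0;
        }
    }
}
```

The pairs returned by findStaveLines would be stored before calling
removeStaveLines, since the pitch of each note is later read from its position
relative to those rows.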
The third stage of the preliminary processing is so-called component labeling
and the building of a file in which every object is represented as a run-length
code containing information about the object's characteristics: circumscribing
rectangle, center of mass, area, height-to-width ratio, and so forth.
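
The characteristics listed above can be computed during labeling, as in the
sketch below. Note that it uses a plain breadth-first flood fill with
8-connectivity rather than a run-length representation, purely to keep the
example short. The Component structure and labelComponents function are
hypothetical names.

```cpp
#include <algorithm>
#include <queue>
#include <utility>
#include <vector>

// Binary image: img[y][x] is 1 for ink and 0 for background (assumed layout).
using Image = std::vector<std::vector<int>>;

struct Component {
    int left, top, right, bottom;  // circumscribing rectangle, inclusive
    int area;                      // number of ink pixels
    double cx, cy;                 // center of mass
    double heightToWidth() const {
        return static_cast<double>(bottom - top + 1) / (right - left + 1);
    }
};

// Label 8-connected components in raster order and collect their features.
std::vector<Component> labelComponents(const Image& img) {
    const int h = static_cast<int>(img.size());
    const int w = h > 0 ? static_cast<int>(img[0].size()) : 0;
    std::vector<std::vector<bool>> seen(h, std::vector<bool>(w, false));
    std::vector<Component> out;
    for (int y0 = 0; y0 < h; ++y0) {
        for (int x0 = 0; x0 < w; ++x0) {
            if (img[y0][x0] == 0 || seen[y0][x0]) continue;
            Component c{x0, y0, x0, y0, 0, 0.0, 0.0};
            long sumX = 0, sumY = 0;
            std::queue<std::pair<int, int>> q;
            q.push({x0, y0});
            seen[y0][x0] = true;
            while (!q.empty()) {
                const int x = q.front().first;
                const int y = q.front().second;
                q.pop();
                ++c.area;
                sumX += x;
                sumY += y;
                c.left = std::min(c.left, x);
                c.right = std::max(c.right, x);
                c.top = std::min(c.top, y);
                c.bottom = std::max(c.bottom, y);
                for (int dy = -1; dy <= 1; ++dy) {
                    for (int dx = -1; dx <= 1; ++dx) {
                        const int nx = x + dx, ny = y + dy;
                        if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                        if (img[ny][nx] == 0 || seen[ny][nx]) continue;
                        seen[ny][nx] = true;
                        q.push({nx, ny});
                    }
                }
            }
            c.cx = static_cast<double>(sumX) / c.area;
            c.cy = static_cast<double>(sumY) / c.area;
            out.push_back(c);
        }
    }
    return out;
}
```

Features such as the height-to-width ratio give a cheap first cut for the later
classification: a stem-like object is much taller than it is wide, while a note
head is roughly as wide as it is tall.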
It is necessary to point out that objects can be classified only after these
three stages are complete. The reason is that while getting rid of the staves we
are also scaling the picture. The order of coding is also important: from left
to right, going downwards. This allows us to use the rules of musical grammar in
the classification process. For example, guided by the bar length, we can expect
either a rest or an eighth note at the end of a bar (look at the example). After
we examine the statistics for that figure, we will see what actually follows.
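
The reading order and the bar-length rule can be illustrated with a small
sketch. It assumes each symbol already carries a stave index and a horizontal
position, and that durations are counted in sixty-fourth notes. All names here
(Symbol, sortReadingOrder, barLength, remainingInBar) are hypothetical.

```cpp
#include <algorithm>
#include <vector>

struct Symbol {
    int stave;  // index of the stave on the page, counted from the top
    int x;      // left edge of the symbol's circumscribing rectangle
};

// Reading order: staves from top to bottom, symbols from left to right.
void sortReadingOrder(std::vector<Symbol>& symbols) {
    std::sort(symbols.begin(), symbols.end(),
              [](const Symbol& a, const Symbol& b) {
                  return a.stave != b.stave ? a.stave < b.stave : a.x < b.x;
              });
}

// Durations are expressed in sixty-fourth notes, so a whole note is 64 units.
constexpr int kWholeNote = 64;

// Length of one bar for a time signature of beats/beatUnit, e.g. 4/4 or 3/8.
int barLength(int beats, int beatUnit) {
    return beats * (kWholeNote / beatUnit);
}

// Units still missing from the current bar. A candidate symbol whose duration
// exceeds this value cannot be the next object in the bar.
int remainingInBar(int barUnits, const std::vector<int>& durationsSoFar) {
    int used = 0;
    for (int d : durationsSoFar) used += d;
    return barUnits - used;
}
```

In a 4/4 bar that already holds three quarter notes and one eighth note,
remainingInBar returns 8 units, so the classifier can narrow its candidates to
an eighth note or an eighth rest.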
The information is recoded into specific vectors containing everything needed
to form the MIDI code at the same time as the objects are classified. Our final
objective is to play the MIDI file we have created with the help of a MIDI
interface and a Sound Blaster card.
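
As an illustration of the final recoding, the sketch below converts a note's
vertical position on the stave into a MIDI note number and builds the
three-byte Note On message. It assumes a treble clef with no accidentals or key
signature, and the function names are hypothetical. The MIDI facts it relies on
are standard: middle C is note 60 and a Note On status byte is 0x90 plus the
channel number.

```cpp
#include <array>
#include <cstdint>

// Convert a stave position to a MIDI note number, assuming a treble clef.
// step 0 is the bottom stave line (E4); each step up is the next line or
// space, and negative steps lie below the stave. Accidentals are ignored.
int staffStepToMidi(int step) {
    static const int semitones[7] = {0, 2, 4, 5, 7, 9, 11};  // C D E F G A B
    const int diatonic = step + 2;  // E is two scale degrees above C
    int octave = diatonic / 7;
    int degree = diatonic % 7;
    if (degree < 0) {  // make the remainder non-negative for notes below C4
        degree += 7;
        --octave;
    }
    return 60 + 12 * octave + semitones[degree];  // 60 is middle C (C4)
}

// Build a MIDI Note On message: status byte, note number, velocity.
std::array<std::uint8_t, 3> noteOn(int channel, int note, int velocity) {
    return {static_cast<std::uint8_t>(0x90 | (channel & 0x0F)),
            static_cast<std::uint8_t>(note & 0x7F),
            static_cast<std::uint8_t>(velocity & 0x7F)};
}
```

With these assumptions the bottom line maps to note 64 (E4) and the top line to
note 77 (F5). A complete MIDI file would also need Note Off messages and timing
information derived from the recognized durations.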
Now you can look at the examples.