"Human Supervision and Control in Engineering and Music"
Dr Bozena Kostek
Expert System for Musical Style Recognition
Abstract
In this overview, some concepts concerning sound engineering, computer music and human supervision are presented. Multimodal computer interaction consists of, among other things, collecting and intelligently searching music-related information. Some concepts related to the author's experience will be presented. Key findings in sound engineering allow music to be recorded in a natural way. Computers can serve both as Internet sites collecting music-related data and as algorithmic tools that enable musicians to find the information they need: they allow a given melody to be analyzed, modified in musically sensible ways, used to mimic the human way of composing, and so on. Human supervision is needed at both stages. The quality of a recording cannot be assessed other than subjectively, and organizing a computer site containing music-related information also requires supervision by the future user. Developing artificial intelligence algorithms and designing ergonomic user interfaces are likewise tasks for a human supervisor.
Introduction
Due to the rapid development of multimedia technology, the amount of audio data stored on various computer sites has grown significantly. Consequently, the problem is to find methods that allow one to explore such huge collections of data effectively in order to find the needed information. Specifically, the problem is to recognize objects within audio material.
An example of the multimodal approach, consisting in computer-user interaction, will be presented here. It particularly concerns the recognition of a musical phrase or a musical style. The starting point of this process is to build an expert system. It is possible, and even recommended, to use a collection of data stored on an FTP server containing a large database of music encoded in MIDI. Fig. 1 presents an algorithm for creating such an expert system; the procedures marked in the dotted-line block represent learning tasks that are not used in the recognition mode. In the training mode, human supervision of the classification of score patterns into a particular musical style is necessary. The next step is the decoding of the MIDI code. This block provides the core of the feature extraction procedure: in that phase, all attributes available in the MIDI code patterns are decoded.
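As an illustrative sketch (not the author's implementation), the pitch and duration attributes could be recovered from decoded note events along the following lines, assuming the MIDI decoder has already produced a list of timed note-on/note-off events:

```python
# Illustrative sketch (not the author's implementation): turning decoded
# MIDI note events into pitch/duration attributes for feature extraction.
# Events are (time_in_ticks, type, note_number) tuples such as a MIDI
# decoder would produce.

def extract_pitch_duration(events):
    """Pair note_on/note_off events into (pitch, duration) attributes."""
    pending = {}   # note number -> onset time
    features = []
    for time, kind, note in sorted(events):
        if kind == "note_on":
            pending[note] = time
        elif kind == "note_off" and note in pending:
            onset = pending.pop(note)
            features.append({"pitch": note, "duration": time - onset})
    return features

# A three-note phrase: C4, E4, G4 with different durations.
events = [
    (0, "note_on", 60), (480, "note_off", 60),
    (480, "note_on", 64), (720, "note_off", 64),
    (720, "note_on", 67), (1680, "note_off", 67),
]
print(extract_pitch_duration(events))
```

From these raw attributes, higher-level musical parameters (intervals, rhythmic ratios, etc.) can then be computed.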
The next block in the algorithm denotes the extraction of musical parameters from the pitch and note durations decoded from the MIDI code. The quantization block is necessary to build the learning system; for the purposes of this approach, rough set-based classification is used. The quantized values of the musical parameters feed the rough set-based algorithm as condition attributes, and several concepts are derived according to musical style. The musical style class number is chosen as the decision attribute. The rules created by the rough set-based algorithm must be verified by the human supervisor during the learning phase.
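The decision-table structure behind this step can be sketched as follows; the attribute names, quantization levels and style classes here are hypothetical illustrations, not those of the author's system. In rough set terms, a condition pattern yields a certain rule when all objects indiscernible under the condition attributes carry the same decision:

```python
# Minimal rough set sketch (hypothetical data, not the author's system):
# objects are described by quantized condition attributes; the decision
# attribute is the musical style class. A condition pattern produces a
# certain rule only when its indiscernibility class has one decision.

from collections import defaultdict

table = [
    # (pitch_range, rhythm_regularity) -> style class
    (("narrow", "regular"), "baroque"),
    (("narrow", "regular"), "baroque"),
    (("wide",   "irregular"), "romantic"),
    (("wide",   "regular"), "baroque"),
    (("wide",   "irregular"), "impressionist"),  # conflicts with row 3
]

def certain_rules(table):
    groups = defaultdict(set)
    for conditions, decision in table:
        groups[conditions].add(decision)
    # Keep only indiscernibility classes with a unique decision.
    return {c: d.pop() for c, d in groups.items() if len(d) == 1}

print(certain_rules(table))
```

Conflicting patterns, such as ("wide", "irregular") above, produce only possible (uncertain) rules; these are exactly the cases that call for verification by the human supervisor.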
Learning Approach to Musical Style Recognition
Learning approaches to the problem of computer-human interaction can be found in the rich literature on the subject; a few examples are given in the References.
Fig. 1. Layout of the experimental system for the automatic recognition of musical styles (learning tasks)
Searching for a given musical phrase can be based on another approach, presented schematically in Fig. 2. To this end, a prediction module is applied. This approach requires some additional steps, such as a pitch extraction stage. Based on the experiments conducted, it can be said that a learning approach to musical data analysis is generally justified: the number of possible melodies stored on computer sites is practically unlimited, so artificial intelligence should be employed for such tasks.
Fig. 2. Layout of the experimental system for the prediction of musical phrases
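A prediction module of this kind could, for instance, rest on transition statistics over successive pitches. The following sketch is an assumed design for illustration, not the system of Fig. 2: a first-order model learns pitch transitions from a training melody and predicts the most likely continuation.

```python
# Illustrative sketch of a prediction module (assumed design, not the
# system in Fig. 2): a first-order model counts transitions between
# successive pitches and predicts the most frequent continuation.

from collections import Counter, defaultdict

def train_bigram(pitches):
    model = defaultdict(Counter)
    for prev, nxt in zip(pitches, pitches[1:]):
        model[prev][nxt] += 1
    return model

def predict_next(model, pitch):
    """Return the most frequent successor of the given pitch, or None."""
    if pitch not in model:
        return None
    return model[pitch].most_common(1)[0][0]

# Train on a simple melody (MIDI note numbers) and predict.
melody = [60, 62, 64, 62, 64, 65, 64, 62, 64, 62, 60]
model = train_bigram(melody)
print(predict_next(model, 62))  # prints 64 (62 -> 64 occurs three times, 62 -> 60 once)
```

A pitch extraction stage would supply the input sequence; in practice such a predictor is usually extended with longer contexts or with neural networks, as in the connectionist work cited in the References.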
References
Bello, J.P., Monti, G. & Sandler, M. (2000). Techniques for automatic music transcription, Proc. ISMIR.
Coates, D. (1994). Representations of the MONK Harmonisation Systems, Proc. of Workshop held as part of AI-ED 93, M. Smith, A. Smaill & G.A. Wiggins (Eds.), Edinburgh, Scotland, 25 August 1993, pp. 77-91, Springer-Verlag, London.
Desain P. & Honing H. (1991). The Quantization of Musical Time: A Connectionist Approach, Music and Connectionism, P. M. Todd & D. G. Loy (eds.), pp. 150-169, The MIT Press, Cambridge, Massachusetts, London, England.
Holland, S. (1994). Learning About Harmony with Harmony Space: An Overview, Proc. of Workshop held as part of AI-ED 93, M. Smith, A. Smaill & G.A. Wiggins (Eds.), Edinburgh, Scotland, 25 August 1993, pp. 24-40, Springer-Verlag, London.
Hörnel, D. (1997). MELONET I: Neural Nets for Inventing Baroque-Style Chorale Variations, in Advances in Neural Information Processing 10 (NIPS 10), Jordan, M.I., Kearns, M.J., Solla, S.A. (Eds.), MIT Press.
Kostek, B. (1998). Computer Based Recognition of Musical Phrases Using The Rough Set Approach, J. Information Sciences, 104, pp. 15-30.
Kostek, B. (1999). Soft Computing in Acoustics, Applications of Neural Networks, Fuzzy Logic and Rough Sets to Musical Acoustics, Studies in Fuzziness and Soft Computing, Physica-Verlag, Heidelberg, New York.
Mozer, M.C. (1991). Connectionist Music Composition Based on Melodic, Stylistic, and Psychophysical Constraints, Music and Connectionism, P. M. Todd & D. G. Loy (eds.), pp. 195-211, The MIT Press, Cambridge, Massachusetts, London, England.
Papaodysseus, C., Roussopoulos, G., Fragoulis, D., Panagopoulos TH. & Alexiou C. (2001). A New Approach to the Automatic Recognition of Musical Recordings, J. Audio Eng. Soc., 49 (1/2).
Slade, S. (1991). Case-based reasoning: a research paradigm, Artificial Intelligence Magazine.
Smith, M., Smaill, A. & Wiggins, G.A. (Eds.). (1993). Music Education: An Artificial Intelligence Approach, Proc. of the World Conference on Artificial Intelligence in Education, Edinburgh.
RAA (Recognition and Analysis of Audio). (2000). European project. http://www.iua.upf.es/mtg/raa