6.2.2. Classification

The classification error for the 30 input map described above is shown in figure 23b, together with the classification errors for grids of other sizes. The 9 by 9 grid gives the best result (31.60%), although this error is considerably higher than that obtained for the 60:25:5 feed forward network (25.61%). SOMs with 60 inputs were also investigated, and the results are shown in the same graph.
With the 60 input SOM the classification error is better for the larger grids (10*10 and 12*12), but the overall error is still higher than that obtained for the 30 input map. Either way, neither map approaches the classification accuracy of the 60 input feed forward MLP, so extensive investigation was not considered justifiable.
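As an illustration of how a classification error can be obtained from a trained map, the following is a minimal sketch. It assumes that each map node is labelled with the majority class of the training samples it wins, and that a test sample takes the label of its best matching node. This labelling scheme and all function names are assumptions made for illustration, not necessarily the procedure used to produce figure 23b.

```python
from collections import Counter, defaultdict

def best_matching_unit(weights, x):
    """Return the index of the map node whose weight vector is closest to x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))

def label_nodes(weights, train_x, train_y):
    """Assumed scheme: each node takes the majority class of the samples it wins."""
    votes = defaultdict(Counter)
    for x, y in zip(train_x, train_y):
        votes[best_matching_unit(weights, x)][y] += 1
    return {node: counts.most_common(1)[0][0] for node, counts in votes.items()}

def classification_error(weights, node_labels, test_x, test_y):
    """Percentage of test samples whose winning node carries the wrong label.

    Nodes that won no training samples have no label and count as errors.
    """
    wrong = 0
    for x, y in zip(test_x, test_y):
        if node_labels.get(best_matching_unit(weights, x)) != y:
            wrong += 1
    return 100.0 * wrong / len(test_x)
```

Here `weights` is a flat list of node weight vectors (81 vectors of length 30 for the 9 by 9, 30 input map), so the grid layout only matters during training, not when the error is measured.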
Figure 23a shows the effect of changing the initial neighbourhood width. The best value is 7 (compared with a grid side length of 9). When the initial neighbourhood width is the same as the grid side length, the classification error increases to 35%. This can be attributed to the wider neighbourhood function inhibiting clustering.
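The influence of the initial width can be sketched as follows. The example assumes a Gaussian neighbourhood function and an exponentially shrinking width. Both are common choices for SOMs but are assumptions here, as are the function names and the final width of 1.

```python
import math

def neighbourhood(dist, width):
    """Gaussian neighbourhood (assumed form): update strength for a node
    at grid distance `dist` from the winning node."""
    return math.exp(-(dist ** 2) / (2.0 * width ** 2))

def width_at(step, initial_width, total_steps, final_width=1.0):
    """Assumed schedule: the width shrinks exponentially from
    initial_width to final_width over total_steps training steps."""
    return initial_width * (final_width / initial_width) ** (step / total_steps)

# Compare initial widths 7 and 9 for a node 4 grid units from the winner.
for initial_width in (7, 9):
    strength = neighbourhood(4, width_at(0, initial_width, 1000))
    print(initial_width, round(strength, 3))
```

With the larger initial width, distant nodes are pulled towards each input almost as strongly as the winning node during early training, so nodes across the map receive nearly the same updates. This is consistent with the explanation that an overly wide neighbourhood inhibits the formation of distinct clusters.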

6.3. Analysis of individual drum type classification


For the classification error graph in figure 19 (variable number of input nodes, 25 hidden nodes), the following table gives a breakdown of the classification error for individual drum types.
Kicks, exhibiting the largest amount of low frequency spectral energy, are more easily identifiable. The same should hold for hi-hats, which have a similar amount of high frequency energy. The greater error in hat classification is probably due to the lower average amplitude at which they appear, which leaves them more susceptible to noise.


The error rate for congas is relatively low, but this can be attributed to the fact that congas mainly appear in the test and training data within rhythms explicitly using that drum type (a rhythm simply featuring perhaps two conga drums). Hence noise interference from other kit sounds will be low.
Snares have a slightly higher error rate than congas but, conversely, they almost always appear within the noise of a full drum kit.
Lastly, there is the much larger error for the tom drums. Toms span the greatest pitch range of all the drum types here, and very often have a large amount of low frequency spectral energy. Thus in many instances they have probably been classified as kicks. In addition, toms of higher pitch that appear as tom/hat combinations will sound very similar to snare drums (a snare is like a tom with an added amount of higher frequency energy, similar to a tom/hat combination).
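A per-type breakdown of this kind, and a check of the suggestion that toms are mistaken for kicks or snares, can be computed from the true and predicted labels of the test set. The following is a minimal sketch. The function names are hypothetical and the labels in the usage example are made up purely for illustration; they are not results from this project.

```python
from collections import Counter, defaultdict

def per_class_error(true_labels, predicted_labels):
    """Percentage of samples of each true class that were misclassified."""
    totals = Counter(true_labels)
    wrong = Counter(t for t, p in zip(true_labels, predicted_labels) if t != p)
    return {c: 100.0 * wrong[c] / totals[c] for c in totals}

def confusions(true_labels, predicted_labels):
    """For each true class, count how often each class was predicted."""
    table = defaultdict(Counter)
    for t, p in zip(true_labels, predicted_labels):
        table[t][p] += 1
    return table

# Made-up labels, for illustration only.
true_labels = ["kick", "kick", "tom", "tom", "tom", "snare"]
predicted = ["kick", "kick", "kick", "snare", "tom", "snare"]

print(per_class_error(true_labels, predicted))
print(dict(confusions(true_labels, predicted)["tom"]))
```

The row of the confusion table for toms shows directly how many toms were classified as kicks and how many as snares, which would confirm or refute the two explanations offered above.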

6.4. Post classification

Due to the time constraints of the project, a thorough investigation of the post classification algorithms was not possible. This is mainly because of the time it takes to manually listen to each break in the training set and label all the soft hits and pitch groups.