Next: Creating scatter-partitioning fuzzy systems Up: Incremental neuro-fuzzy systems Previous: Equivalence of fuzzy systems

Radial basis function networks according to Moody and Darken

Moody and Darken[3] have proposed a multi-phase approach to RBFNs. First, a pre-defined number of centers is distributed in input space with a clustering method (e.g., LBG[4] or k-means[5]). The width parameters $\sigma$ of the Gaussians are then set by a local heuristic, e.g., by setting the $\sigma$ of each unit equal to the distance to the nearest other unit. Moreover, Moody and Darken propose to use normalized activations according to (3). In terms of fuzzy systems, the steps just described correspond to the identification of the IF-parts of Sugeno fuzzy rules.
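The first phase can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the original implementation: the function names are hypothetical, a plain k-means loop stands in for the clustering step, the width heuristic is the nearest-neighbor variant mentioned above, and the normalization (each Gaussian activation divided by the sum over all units) is assumed to match (3), which is defined earlier in the paper.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain k-means; centers are initialized from randomly chosen data points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign every input vector to its nearest center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its assigned inputs (skip empty clusters).
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers

def nearest_neighbor_widths(centers):
    """Local heuristic: sigma of each unit = distance to the nearest other center."""
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a unit is not its own neighbor
    return dist.min(axis=1)

def normalized_activations(X, centers, sigmas):
    """Gaussian activations divided by their sum over all units (assumed form of (3))."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    g = np.exp(-sq_dist / (2.0 * sigmas[None, :] ** 2))
    return g / g.sum(axis=1, keepdims=True)

# Toy usage on synthetic 2-D inputs (made-up data, for illustration only).
rng = np.random.default_rng(1)
X = rng.random((200, 2))
centers = kmeans(X, k=5)
sigmas = nearest_neighbor_widths(centers)
A = normalized_activations(X, centers, sigmas)  # shape (200, 5)
```

Note that none of these steps looks at the desired outputs; only the input vectors `X` are used.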

The THEN-parts of the fuzzy rules or, alternatively, the output weights of the RBFN, are set by pseudo-inverse computation such that the summed square error (4) for a given training data set is minimized. It is also possible, but usually has no advantage, to compute the output weights iteratively through gradient descent on the error function[6].
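The pseudo-inverse step can be sketched as follows. This is a hedged illustration: the function names and the toy data are hypothetical, and the error is assumed to be the plain sum of squared differences between network outputs and targets, which is how (4) is described above. `A` denotes the matrix of (normalized) unit activations, one row per training pattern.

```python
import numpy as np

def fit_output_weights(A, Y):
    """Output weights W minimizing the summed square error of A @ W against Y.

    A: (n_samples, n_units) activations, Y: (n_samples, n_outputs) targets.
    """
    return np.linalg.pinv(A) @ Y

def summed_square_error(A, W, Y):
    """Assumed form of the error: sum of squared output deviations."""
    return float(((A @ W - Y) ** 2).sum())

# Toy usage: made-up activations whose targets are generated by known weights.
rng = np.random.default_rng(0)
A = rng.random((50, 4))
A /= A.sum(axis=1, keepdims=True)            # rows sum to 1, like normalized activations
W_true = np.array([[1.0], [0.0], [1.0], [0.0]])
Y = A @ W_true
W = fit_output_weights(A, Y)                 # recovers W_true up to rounding error
```

Because the error is quadratic in the weights, the pseudo-inverse yields a minimizer in one step, which is consistent with the remark that iterative gradient descent usually offers no advantage here.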

This multi-phase approach is straightforward and is often reported to be much faster than, e.g., the backpropagation training of multi-layer perceptrons for the same data. A possible problem of the approach, which has for example been noted by Bishop[7], is that the clustering is completely unsupervised and does not take the given desired output information (class labels or continuous output values) into account. Clustering methods usually try to minimize the mean distance between the centers they distribute and the given data (which is only the input part of the training data). This error, however, is of little relevance to many supervised learning problems. The resulting distribution of RBF centers (or rule patches) may, therefore, be poor for the classification or regression problem at hand. Fig. 6 shows an example where this is the case.
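The point that the clustering criterion ignores the desired outputs can be made concrete with a small sketch. The function name and the toy data are hypothetical; the criterion is the mean distance described above (k-means in particular is usually stated in terms of squared distances, which does not change the argument).

```python
import numpy as np

def mean_quantization_error(X, centers):
    """Mean distance from each input vector to its nearest center.

    The signature has no argument for class labels or target values:
    the criterion depends on the input part of the training data only.
    """
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return float(dist.min(axis=1).mean())

# Toy data: the same four inputs with two different (made-up) labelings.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
centers = np.array([[0.0, 0.5], [1.0, 0.5]])
labels_a = np.array([0, 0, 1, 1])  # classes separated left/right
labels_b = np.array([0, 1, 0, 1])  # classes separated bottom/top

# The error is identical for both labelings, because labels never enter it.
err = mean_quantization_error(X, centers)  # 0.5
```

With `labels_a` each center covers a single class, whereas with `labels_b` each center covers one point of each class, yet the clustering error cannot distinguish the two situations.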


 
Figure 6: Classification with a scatter-partitioning fuzzy system (with normalized MFs) constructed by an RBFN according to Moody and Darken. The training data set is that of Fig. 4a. a) Fuzzy system with 20 rules positioned with the LBG clustering algorithm. b) Estimated a posteriori probability that the data belongs to class A. c) Classification result. Since the classes in the training data seem to have very little overlap, it would seem reasonable to require that the fuzzy system maps nearly all training data points to the correct class. However, the distribution of rule centers found by LBG contains only 3 rules in the upper right corner of the shown part of the input space, whereas 4 would be necessary in this case. Since LBG is initialized at random from the data set, one can expect a distribution of the centers proportional to the probability density. Considering this, the fraction of 3/20=0.15 positioned in the upper right is already more than the value one can expect from the given data, namely 4/36=0.11. The underlying problem is that LBG does not take the desired outputs for the data into account.

If one visually analyzes the generated neuro-fuzzy system in Fig. 6, it becomes obvious that in several places there are rules covering neighboring areas and having basically the same output. Such rules could be combined into fewer rules, each covering a correspondingly larger area of the input space. This would free up resources which could be used in places where the system can benefit from them more, for example in the upper right corner of the displayed part of the input space.

Instead of first constructing a possibly poor neuro-fuzzy system and then improving it later on, it would be advantageous to build a good system for the problem at hand immediately. This is the goal of the method in the following section.



Bernd Fritzke
10/21/1997