This project is a full implementation of this article.
In this project I implemented a radial basis function neural network with one hidden layer from scratch.
The available features are number of pregnancies, plasma glucose concentration, diastolic blood pressure, triceps skin fold thickness, 2-hour serum insulin, body mass index, diabetes pedigree function, and age.
The weights of the hidden layer are calculated through interpolation rather than with a gradient descent algorithm.
The neurons in this network can be represented as clusters of data points. The article suggests clustering the dataset with K-Means; however, the interpolation process requires knowing the label of each centroid. So I used another algorithm called PAM and also tested its modified versions to make clustering faster. In contrast to K-Means, PAM chooses each centroid from the data points themselves, so it really exists in the dataset and its label is known. In my experiments, the Faster PAM algorithm turned out to be 93 times faster than regular PAM, which is awesome!
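To make the clustering idea concrete, here is a minimal sketch of a naive PAM-style swap search. It is not the Faster PAM variant and not necessarily the code used in this project; the function name `pam`, the Euclidean distance, and the toy data are assumptions made for illustration.

```python
import numpy as np

def pam(X, k, max_iter=100, seed=0):
    """Naive PAM-style k-medoids: keep swapping a medoid with a
    non-medoid point while the total distance to the nearest medoid drops."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # Pairwise Euclidean distances, computed once up front.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    medoids = rng.choice(n, size=k, replace=False)
    cost = D[:, medoids].min(axis=1).sum()
    for _ in range(max_iter):
        improved = False
        for i in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = cand
                trial_cost = D[:, trial].min(axis=1).sum()
                if trial_cost < cost - 1e-12:
                    medoids, cost, improved = trial, trial_cost, True
        if not improved:
            break  # no single swap lowers the cost any further
    return medoids, cost

# Toy data (made up): two well-separated groups with labels 0 and 1.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
y = np.array([0, 0, 0, 1, 1, 1])

medoids, cost = pam(X, k=2)
# Medoids are indices of real samples, so their labels can be read off directly.
centroid_labels = y[medoids]
```

Because every medoid is an index into the dataset, `y[medoids]` gives the centroid labels that the interpolation step needs, which a K-Means mean vector cannot provide.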
The phi matrix was normalized with the covariance matrix, like below, in order to have more coverage over the multi-dimensional space:
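One plausible reading of a covariance-normalized phi matrix is a Gaussian activation on the squared Mahalanobis distance. The sketch below assumes that form, a single covariance matrix estimated from the whole dataset, and a factor of 0.5 in the exponent; the name `phi_matrix` and the toy data are hypothetical, and the project's exact formula may differ.

```python
import numpy as np

def phi_matrix(X, centers, cov):
    """Gaussian RBF activations using the squared Mahalanobis distance
    (assumed form of the covariance normalization)."""
    # Pseudo-inverse guards against a singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    # diff has shape (n_samples, n_centers, n_features).
    diff = X[:, None, :] - centers[None, :, :]
    # d2[n, c] = diff[n, c]^T @ cov_inv @ diff[n, c]
    d2 = np.einsum("ncf,fg,ncg->nc", diff, cov_inv, diff)
    return np.exp(-0.5 * d2)

# Toy data (made up): four samples, two of them used as centers.
X = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 1.0], [3.0, 3.0]])
centers = X[[0, 3]]
cov = np.cov(X, rowvar=False)

Phi = phi_matrix(X, centers, cov)  # shape (4, 2)
```

Scaling by the covariance stretches each basis function along the directions where the data varies most, which is one way to get wider coverage than an isotropic Gaussian.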
The interpolation process is done after clustering and calculating the phi matrix (and, of course, knowing Z, which is the labels of the centroids):
With this method, the weights are calculated all at once thanks to linear algebra.
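The interpolation step can be sketched as solving one linear system. This example assumes the phi matrix is evaluated at the centroids themselves, so it is square and `Phi @ W = Z` can be solved exactly. The isotropic Gaussian helper `gaussian_phi`, the `gamma` parameter, the 0.5 threshold, and the toy centroids are assumptions for illustration only.

```python
import numpy as np

def gaussian_phi(A, centers, gamma=1.0):
    """Plain isotropic Gaussian activations (stand-in for the real phi)."""
    d2 = ((A[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

# Toy centroids (made up) and their labels Z.
centers = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
Z = np.array([0.0, 0.0, 1.0])

# Square (k, k) system: each centroid must reproduce its own label.
Phi = gaussian_phi(centers, centers)
W = np.linalg.solve(Phi, Z)  # all weights in one step, no gradient descent

# Classify new points by thresholding the network output at 0.5.
X_new = np.array([[0.2, 0.1], [3.8, 4.1]])
pred = (gaussian_phi(X_new, centers) @ W >= 0.5).astype(int)
```

If the phi matrix is not square or is badly conditioned, `np.linalg.lstsq` or `np.linalg.pinv` would be the usual substitutes for `np.linalg.solve`.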
In the best scenario, after applying the PCA algorithm, I got these results, which are approximately the same as the article's:
Accuracy:  82.75
Precision: 81.32
Recall:    78.4
FPR:       8.97
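For reference, the four metrics above can be derived from the confusion matrix of a binary classifier. This is a minimal sketch with made-up labels, not the project's evaluation code; the function name `binary_metrics` is hypothetical, and it returns fractions rather than percentages.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and false positive rate for 0/1 labels.
    Assumes every denominator is non-zero."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "fpr": fp / (fp + tn),
    }

# Made-up predictions: 2 true positives, 1 false negative,
# 1 false positive, 4 true negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```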
Implementing an RBF neural network from scratch with Python
Using the Faster PAM algorithm for clustering instead of K-Means
Applying PCA to reduce the dimensionality of the data
Utilizing the covariance matrix as a normalizer in the phi calculations
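The PCA step in the list above can be sketched with a singular value decomposition of the centered data. This is an illustration under assumptions: the function name `pca_fit_transform`, the choice of three components, and the random 8-feature data are all hypothetical, and the project may use a different number of components or a library implementation.

```python
import numpy as np

def pca_fit_transform(X, n_components):
    """Project X onto its top principal components via SVD."""
    mean = X.mean(axis=0)
    Xc = X - mean  # PCA works on mean-centered data
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S ** 2) / (len(X) - 1)
    return Xc @ components.T, components, explained_variance[:n_components]

# Random stand-in data with 8 features, like the dataset's feature count.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))

X_reduced, components, variance = pca_fit_transform(X, n_components=3)
```

The reduced matrix `X_reduced` would then take the place of the raw features in the clustering and phi calculations.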