While reading the Wikipedia article on the perceptron, the supervised binary classifier from machine learning, I realized that the classifier is defined as:

f(x) = 1   if dotp(w, x) + b > 0
f(x) = 0   otherwise

To me this just looks like fractional voting: a weighted summation of a list of for and against values.
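To make the voting reading concrete, here is a minimal sketch in Python. The weights, inputs, and bias are made-up numbers chosen only for illustration; each product w[i] * x[i] is treated as one fractional vote, positive for and negative against.

```python
def dotp(w, x):
    """Dot product: the sum of the individual weighted votes."""
    return sum(wi * xi for wi, xi in zip(w, x))


def f(w, x, b):
    """Perceptron output: 1 if the votes plus the bias come out above zero, else 0."""
    return 1 if dotp(w, x) + b > 0 else 0


# Made-up example values (assumptions, not taken from any real data set).
w = [0.5, -1.0, 0.25]
x = [1.0, 1.0, 2.0]
b = 0.1

# Each entry is one fractional vote: positive is "for", negative is "against".
votes = [wi * xi for wi, xi in zip(w, x)]

print(votes)
print(f(w, x, b))
```

With these numbers the for and against votes cancel out, so the bias alone tips the result.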

From this I might be able to come up with a guess at a new classifier.

My target for this is one that “converges” quickly or not at all.

I was thinking that maybe I could keep a list of previously calculated values of f(x) as the voting samples, then use it to predict, “time” series style, where the output is going to end up: either 0 or 1, within a time range you are pretty certain of.
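One possible reading of this idea, sketched in Python. Everything here is an assumption made for illustration: the function name predict_from_history, the sliding window, and the parameters window, min_agree, and max_steps are all hypothetical, and a simple majority over recent outputs stands in for a real time series predictor. It either settles on 0 or 1 quickly or gives up, which matches the "quickly or not at all" target.

```python
from collections import deque


def predict_from_history(outputs, window=5, min_agree=4, max_steps=50):
    """Guess the final value of f(x) from its recent history.

    outputs   -- iterable of past f(x) values (each 0 or 1), oldest first
    window    -- how many recent values count as votes (hypothetical parameter)
    min_agree -- votes needed inside the window to call the result
    max_steps -- the "time" limit; give up after this many values

    Returns (prediction, steps_used); prediction is None if the votes
    never settle within max_steps.
    """
    recent = deque(maxlen=window)  # old votes drop off automatically
    step = 0
    for step, out in enumerate(outputs, start=1):
        recent.append(out)
        if len(recent) == window:
            ones = sum(recent)
            if ones >= min_agree:
                return 1, step
            if window - ones >= min_agree:
                return 0, step
        if step >= max_steps:
            break
    return None, step


print(predict_from_history([0, 1, 1, 1, 1, 1]))        # settles on 1
print(predict_from_history([1, 0, 1, 0, 1, 0, 1, 0]))  # never settles
```

Integer vote counts are used instead of a fractional confidence so the thresholds are not affected by floating-point rounding.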

My guess is that this sets a boundary value for the weights to adapt to within some “time” period.

This way you should be able to reduce the number of weights you need to calculate in the hidden layers (?). The weights need not converge to the point where the outcome is more certain than it needs to be.