A quick guess.
I wonder: what would happen if you added rules for the weights in machine-learning matrices inspired by how molecules and atoms interact? I suspect you could get some clustering and pattern formation going. With large hidden layers I think you get more flexibility for connected rules, making the weight matrix of a layer look like some kind of image.
The idea is to study these formations, the way an engineer studies materials or metals.
Could you then use this for direct weight matrix creation, that is, with minimal training?
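A minimal sketch of what such molecule-inspired weight rules could look like. Everything here is an invented assumption: the "lattice levels" play the role of preferred atomic positions, and a neighbour-averaging rule plays the role of inter-molecular attraction, so that clusters can form in the weight matrix "image".

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))  # hypothetical hidden-layer weight matrix

def molecular_step(W, attraction=0.1, levels=np.array([-1.0, 0.0, 1.0])):
    # "atom" rule (assumption): each weight is pulled toward its nearest
    # discrete level, like an atom settling into a lattice site
    nearest = levels[np.argmin(np.abs(W[..., None] - levels), axis=-1)]
    W = W + attraction * (nearest - W)
    # "molecule" rule (assumption): neighbouring weights pull on each
    # other, encouraging spatial clusters in the matrix image
    neigh = (np.roll(W, 1, 0) + np.roll(W, -1, 0) +
             np.roll(W, 1, 1) + np.roll(W, -1, 1)) / 4.0
    return W + attraction * (neigh - W)

for _ in range(50):
    W = molecular_step(W)
```

After a few dozen steps you can plot `W` as an image and look for the kind of cluster formations described above; studying how they depend on the rule parameters would be the "materials science" part.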
Another idea would be to create a network that does not need to memorize. The idea is simple. Take recognition of objects in pictures: at the classifier.predict() call you also pass in some known images. So if the network is about to learn letter fonts, it should be able to take advantage of images that are already available and correctly labeled. That way the network does not need to store memorized information in its weight matrices.
I imagine one advantage of a network like this is that you don't need to retrain it; you just add images that have correct labels.
So you get smartClassifier.predict( imageToPredict, imageDataThatHaveCorrectLabels )
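A toy sketch of that predict signature, assuming the simplest possible stand-in for the network: nearest-neighbour matching on raw pixels. The function name, the 3x3 "letter" patterns, and the use of pixel distance are all illustrative assumptions; a real system would compare images in some learned embedding instead.

```python
import numpy as np

def smart_predict(image_to_predict, reference_images, reference_labels):
    # All class knowledge comes from the labeled images handed in at
    # prediction time; nothing is memorized inside the classifier itself.
    query = np.asarray(image_to_predict, dtype=float).ravel()
    refs = np.asarray(reference_images, dtype=float).reshape(
        len(reference_labels), -1)
    # raw pixel distance stands in for a learned similarity (assumption)
    dists = np.linalg.norm(refs - query, axis=1)
    return reference_labels[int(np.argmin(dists))]

# made-up 3x3 "font" images for the letters I and O
I = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], float)
O = np.array([[1, 1, 1], [1, 0, 1], [1, 1, 1]], float)
noisy_I = I + 0.1  # a slightly perturbed query image

label = smart_predict(noisy_I, [I, O], ["I", "O"])  # → "I"
```

Adding a new font or letter here really is just appending another labeled image to the reference list, with no retraining step.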