The idea is simple.
Starfish have simple eyespots at the tips of their arms. That makes me wonder whether their nervous system mixes data from touch sensors and light sensors. If the model fuses both streams, a starfish could, in a roundabout way, "see" in the dark through its touch sensors alone.
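A minimal sketch of that fusion idea, assuming the simplest possible setup: two sensor streams concatenated before a single linear classification layer. All names, dimensions, and weights here are illustrative assumptions, not a model of an actual starfish.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_and_classify(touch, light, w, b):
    """Concatenate the two sensor streams and apply one linear layer."""
    fused = np.concatenate([touch, light])
    return fused @ w + b

touch = rng.normal(size=8)    # touch-sensor features (hypothetical)
light = np.zeros(4)           # darkness: the light channel is silent
w = rng.normal(size=(12, 3))  # 12 fused features -> 3 class scores
b = np.zeros(3)

scores = fuse_and_classify(touch, light, w, b)
```

Even with the light channel zeroed out, the touch features alone still drive the class scores, which is the "seeing in the dark" intuition.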
If that is how biological perception can work, are we sure the recognition models we build are purely visual?
I speculate that if the output passed from layer_n to layer_n+1 is not visually recognizable, the model could easily be fooled. That is, if an intermediate activation looks like noise to us, it should be possible to inject noise-like input that misleads everything downstream.
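The fooling intuition can be sketched on a toy linear "layer" (an illustrative assumption, not the author's experiment): a perturbation whose per-coordinate amplitude is small, and which looks like sign noise to a human, still flips the predicted class, because the layer responds to structure we cannot see.

```python
import numpy as np

rng = np.random.default_rng(1)

W = rng.normal(size=(2, 16))  # toy layer: 16 inputs -> 2 class scores
x = rng.normal(size=16)       # a clean input

logits = W @ x
pred = int(np.argmax(logits))
other = 1 - pred

# FGSM-style step: move along the sign of the gradient of the
# logit gap, just far enough to close that gap.
direction = np.sign(W[other] - W[pred])
gap = logits[pred] - logits[other]
eps = 1.1 * gap / np.sum(np.abs(W[other] - W[pred]))

x_adv = x + eps * direction        # looks like low-amplitude noise
adv_pred = int(np.argmax(W @ x_adv))
```

For a linear layer this flip is guaranteed by construction; in a deep network the same style of noise-like perturbation is what adversarial-example attacks exploit.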