I'm not sure about networks learning the pre-processing, as most of the image pipelines I have seen involve a lot of hit-and-miss experimentation with pre-processing steps. I have seen people try thresholding images, using gradient or edge images, and using RGB vs. grayscale vs. HSL. I think there is a lot of variability in pre-processing, which makes it difficult for a network to learn. This is one case where knowledge of your specific dataset and some knowledge of computer vision helps; otherwise, we will require a very large number of training images. If we have a small number of images, we can 'augment' the dataset by using image data generators, which slightly change images by rotating/resizing/blurring/distorting them, etc.
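To make the augmentation idea concrete, here is a minimal sketch in plain NumPy. It is a toy stand-in for library tools such as Keras's ImageDataGenerator (the function name `augment` and all parameters here are my own, not from any library): random flips, 90-degree rotations, and mild noise, used to grow a tiny dataset.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly perturbed copy of a (H, W) grayscale image.

    Toy augmentations: horizontal flip, 90-degree rotation, Gaussian noise.
    """
    out = image.copy()
    if rng.random() < 0.5:
        out = np.fliplr(out)                        # random horizontal flip
    out = np.rot90(out, k=int(rng.integers(0, 4)))  # random 90-degree rotation
    out = out + rng.normal(0.0, 0.01, out.shape)    # mild additive noise
    return np.clip(out, 0.0, 1.0)                   # keep valid pixel range

# Grow one image into ten slightly different training samples.
rng = np.random.default_rng(0)
base = rng.random((8, 8))
augmented = [augment(base, rng) for _ in range(10)]
```

In a real pipeline you would of course use rotations at arbitrary angles, rescaling, and blurring as well, but the principle is the same: cheap label-preserving perturbations in place of more data.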
There are also LSTM networks, which have a concept of memory, but they are used more for speech recognition and time series. I haven't worked with these yet.
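The "memory" part can be sketched in a few lines. This is a minimal single LSTM cell in NumPy (my own illustrative code, not any particular library's implementation): the cell state `c` is carried across time steps, and learned gates decide what to forget, what to write, and what to expose as output.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. x: input, h: hidden state, c: cell state (the 'memory').

    W, U, b hold the four gates' parameters stacked in the order:
    input gate, forget gate, candidate update, output gate.
    """
    z = W @ x + U @ h + b
    n = h.size
    i = 1.0 / (1.0 + np.exp(-z[:n]))        # input gate: how much to write
    f = 1.0 / (1.0 + np.exp(-z[n:2*n]))     # forget gate: how much to keep
    g = np.tanh(z[2*n:3*n])                 # candidate cell update
    o = 1.0 / (1.0 + np.exp(-z[3*n:]))      # output gate: how much to expose
    c_new = f * c + i * g                   # memory carried to the next step
    h_new = o * np.tanh(c_new)
    return h_new, c_new

# Run a short random sequence; c accumulates information over time.
rng = np.random.default_rng(1)
n_in, n_hid = 3, 4
W = rng.normal(0.0, 0.1, (4 * n_hid, n_in))
U = rng.normal(0.0, 0.1, (4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(5):
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)
```

The gating is exactly why LSTMs suit speech and time series: unlike a plain feed-forward net, the state `c` lets information from early steps influence much later outputs.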
Got LSTMs on my to-do list already. Definitely going to check them out!