Please see the links below. You should be able to use these tutorials to get an idea of how the 2D FFT can be used to quickly identify objects.
I don't expect you to be able to do the double integrals; instead, take an engineering approach using these VAB blocks. Can you find a way to use a portion of a scanned object's FFT and compare it to a known object's FFT (we'll control the size and brightness parameters)? The questions to be answered, then, are:
1. How many frequencies in a 2D FFT are required to accurately identify an object that the camera is seeing? How robust is this method?
2. Can multiple FFT signatures from multiple objects within a single scan frame allow multiple recognitions?
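
To make the comparison idea concrete, here is a minimal Python/NumPy sketch rather than a definitive method. It assumes a grayscale frame stored as a 2D array, keeps only the k x k lowest-frequency FFT magnitudes as the "signature", and scores two signatures with a normalized dot product. The function names (`fft_signature`, `similarity`), the toy shapes, and the choice k = 9 are all made up for illustration; nothing here comes from the linked tutorials or the VAB blocks.

```python
import numpy as np

def fft_signature(image, k):
    """Unit-length vector of the k x k lowest-frequency FFT magnitudes (k odd)."""
    image = np.asarray(image, dtype=float)
    # Subtract the mean so the DC term does not dominate the comparison.
    spectrum = np.fft.fftshift(np.fft.fft2(image - image.mean()))
    rows, cols = spectrum.shape
    # After fftshift the zero frequency sits at (rows // 2, cols // 2).
    r0, c0 = rows // 2 - k // 2, cols // 2 - k // 2
    block = np.abs(spectrum[r0:r0 + k, c0:c0 + k]).ravel()
    # Normalizing removes overall brightness scaling.
    return block / (np.linalg.norm(block) + 1e-12)

def similarity(sig_a, sig_b):
    """Dot product of two unit-length signatures (closer to 1 is more alike)."""
    return float(np.dot(sig_a, sig_b))

# Toy "known objects" (assumed shapes, for illustration only).
square = np.zeros((64, 64))
square[24:40, 24:40] = 1.0
bar = np.zeros((64, 64))
bar[30:34, 16:48] = 1.0

# Toy "scanned" frame: the square, shifted (with wrap-around) and twice as bright.
seen = 2.0 * np.roll(square, (5, -7), axis=(0, 1))

k = 9  # number of frequencies kept per axis; the quantity question 1 asks about
known = {"square": fft_signature(square, k), "bar": fft_signature(bar, k)}
seen_sig = fft_signature(seen, k)
scores = {name: similarity(seen_sig, sig) for name, sig in known.items()}
print(max(scores, key=scores.get), scores)
```

Using magnitudes only makes the signature insensitive to where the object sits in the frame, and the normalization makes it insensitive to brightness; it does nothing for size or rotation, which is why those parameters need to be controlled. Sweeping `k` downward until the scores stop separating the known objects is one rough way to approach question 1.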
> The Beobot team uses HSV (hue, saturation, value) in order to identify salient features.