[HN Gopher] Class-specific diffractive cameras with all-optical ...
___________________________________________________________________

Class-specific diffractive cameras with all-optical erasure of
undesired objects

Author : rntn
Score  : 40 points
Date   : 2022-08-15 12:59 UTC (1 day ago)

(HTM) web link (elight.springeropen.com)
(TXT) w3m dump (elight.springeropen.com)

| rush-mindwork wrote:

| Greg_hamel wrote:
| This is pure physics gold.
|
| And the fact that the paper is available for free is just added
| gravy!

| elil17 wrote:
| If you think about it in the abstract, it's not that weird. Okay,
| you're computing a function with light using some diffraction
| gratings.
|
| The outcome, though, is mind-boggling: a camera that can only
| take pictures of the number two, no other numbers. Totally
| magical!

| api wrote:
| Can you compute a neural network this way? Or do other forms of
| useful computation?

| Enginerrrd wrote:
| If I understand correctly, that's sort of exactly what this is.
| The geometry of the diffraction gratings encodes a forward
| propagation model trained as a classifier of the number "2".
|
| I don't quite understand the mathematics of how it was trained,
| but they were able to discretize the geometry of those layers
| somehow into little 0.4 mm pixels of "trainable diffractive
| neurons", and they simulated light transmission through the
| layers to compute a loss function.
|
| I'm really surprised that this was computationally feasible.
| Simulation of light through the gratings must have been cheap
| enough as a function evaluation to train the network.

| Lramseyer wrote:
| I would imagine that you generate the desired transform function
| of a diffractive structure rather than the structure itself,
| because the structure is ultimately derived from the transform
| function. Since the transform function is basically a 2D Fourier
| transform and a spatial frequency/phase plot, it's not _that_
| computationally costly. Once you settle on functions you like,
| you then generate and/or simulate a diffractive structure and
| see if it behaves how you expect.

| Lramseyer wrote:
| Sort of. I didn't dive too far into the math, but it looks like
| each diffractive structure is akin to a layer of a neural net,
| tuned for a set of spatial frequencies and phases, and the
| layers combine (like layers of a neural net) to recognize more
| complex objects.
|
| There are a few gotchas in that statement though - for one, I
| didn't dive too far into the math, and I would assume that the
| convolutional algorithms as well as the underlying matrix
| functions may be different. But at the end of the day, you're
| approximating a complex function using an array of simple
| functions with different weights and scale factors. The other
| gotcha is that diffractive structures use monochromatic light,
| so it's probably not too useful in most normal situations with
| normal light sources.

| SiempreViernes wrote:
| It can do the same sort of computation work any stack of
| analogue filters can do: it does _one thing_ very fast, and if
| you want something else done you must create those filters
| first; the frame holding the stack is of no help at all.

| stavros wrote:
| > Okay, you're computing a function with light using some
| diffraction gratings.
|
| Our definitions of "not that weird" are very different.

| sbaiddn wrote:
| TL/DR, but far-field diffraction is the Fourier transform of the
| aperture (the math is straightforward enough, an integral of an
| exp).
|
| It blew my mind when I did it in school, yet... there was the
| proof that it worked!

| SiempreViernes wrote:
| > a camera that can only take pictures of the number two, no
| other numbers.
|
| Well, to be precise, it makes a complete (deterministic) mess of
| any other numbers. But given the output and the filters you can
| probably unfold the camera "PSF" and get back whatever it was it
| saw.

| dplavery92 wrote:
| From the parent article:
|
| > Importantly, this diffractive camera is not based on a
| standard point-spread function-based pixel-to-pixel mapping
| between the input and output FOVs, and therefore, it does not
| automatically result in signals within the output FOV for the
| transmitting input pixels that statistically overlap with the
| objects from the target data class. For example, the handwritten
| digits '3' and '8' in Fig. 2c were completely erased at the
| output FOV, regardless of the considerable amount of common
| (transmitting) pixels that they statistically share with the
| handwritten digit '2'. Instead of developing a spatially-
| invariant point-spread function, our designed diffractive camera
| statistically learned the characteristic optical modes possessed
| by different training examples, to converge as an optical mode
| filter, where the main modes that represent the target class of
| objects can pass through with minimum distortion of their
| relative phase and amplitude profiles, whereas the spatial
| information carried by the characteristic optical modes of the
| other data classes were scattered out.
|
| It seems like that may not be so possible.
|
| Later on in the article:
|
| > It is important to emphasize that the presented diffractive
| camera system does not possess a traditional, spatially-
| invariant point-spread function. A trained diffractive camera
| system performs a learned, complex-valued linear transformation
| between the input and output fields that statistically
| represents the coherent imaging of the input objects from the
| target data class.
|
| Note here that the learned transformations are linear, and the
| Fourier transform is linear, but you cannot invert from output
| to input because the sensor measures real-valued intensities of
| complex-valued fields. All the phase information is lost.
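
Enginerrrd's description above maps onto a small simulation fairly
directly. The following is a minimal sketch, assuming a phase-only
diffractive stack, THz-range illumination, and arbitrary layer
spacing and grid size (only the 0.4 mm neuron pitch comes from the
comment); it is not the paper's actual training code:

  # Toy forward model: trainable phase-only "diffractive neurons"
  # separated by free-space propagation (angular-spectrum method).
  import numpy as np

  def angular_spectrum(field, wavelength, dz, dx):
      """Propagate a 2-D complex field by a distance dz."""
      n = field.shape[0]
      fx = np.fft.fftfreq(n, d=dx)
      fxx, fyy = np.meshgrid(fx, fx)
      arg = 1.0 / wavelength**2 - fxx**2 - fyy**2
      kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
      transfer = np.where(arg > 0, np.exp(1j * kz * dz), 0.0)  # drop evanescent waves
      return np.fft.ifft2(np.fft.fft2(field) * transfer)

  rng = np.random.default_rng(0)
  n, dx = 128, 0.4e-3              # 0.4 mm "neuron" pitch (from the comment)
  wavelength, dz = 0.75e-3, 40e-3  # sub-THz illumination, 40 mm gaps (assumed)

  # Three phase-only layers; the per-pixel phases are the trainable parameters.
  phases = [2 * np.pi * rng.random((n, n)) for _ in range(3)]

  field = np.ones((n, n), dtype=complex)  # stand-in for an illuminated object
  for phi in phases:
      field = angular_spectrum(field, wavelength, dz, dx) * np.exp(1j * phi)
  field = angular_spectrum(field, wavelength, dz, dx)

  intensity = np.abs(field) ** 2          # what the sensor actually records
  target = np.zeros((n, n))               # e.g. "erase" this non-target object
  loss = np.mean((intensity / intensity.max() - target) ** 2)
  print(loss)
  # In practice this loop would run inside an autodiff framework so the loss
  # gradient can be backpropagated into the per-pixel phase values.

Each propagation step is just two FFTs, which is presumably why
evaluating the model thousands of times during training remains
computationally feasible.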
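
sbaiddn's point can be checked numerically in a few lines: the
Fraunhofer (far-field) pattern of an aperture is, up to scaling, the
squared magnitude of the aperture's 2-D Fourier transform. The slit
geometry and grid size below are arbitrary choices, not anything
from the paper:

  # Far-field (Fraunhofer) diffraction as the 2-D Fourier transform
  # of the aperture.
  import numpy as np

  n = 512
  aperture = np.zeros((n, n))
  aperture[:, n // 2 - 4 : n // 2 + 4] = 1.0          # a vertical slit, 8 px wide

  far_field = np.fft.fftshift(np.fft.fft2(aperture))  # far-field amplitude ~ FT(aperture)
  intensity = np.abs(far_field) ** 2                  # a detector records |amplitude|^2

  # The central row shows the familiar sinc^2 single-slit pattern.
  print(intensity[n // 2, n // 2 : n // 2 + 8] / intensity.max())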
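
And a small sketch of dplavery92's closing point: the forward model
is a complex-valued linear map, but the sensor records only
intensities, so distinct input fields can produce identical images.
The matrix and inputs below are random stand-ins, not the learned
transform:

  # Intensity measurement discards phase, so the forward map is not
  # invertible from a single recorded image.
  import numpy as np

  rng = np.random.default_rng(1)
  m = 64
  A = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))  # stand-in linear transform

  x1 = rng.normal(size=m) + 1j * rng.normal(size=m)            # one input field
  x2 = x1 * np.exp(1j * 0.7)                                   # same field, shifted global phase

  i1 = np.abs(A @ x1) ** 2                                     # recorded intensities
  i2 = np.abs(A @ x2) ** 2

  print(np.allclose(i1, i2), np.allclose(x1, x2))  # True False: same image, different inputs

Even though A itself is invertible here, the intensity measurement
is not, which is why recovering whatever the camera saw from its
output alone is not generally possible.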
___________________________________________________________________
(page generated 2022-08-16 23:00 UTC)