[HN Gopher] Differentiable Signed Distance Function Rendering
       ___________________________________________________________________
        
       Differentiable Signed Distance Function Rendering
        
       Author : lnyan
       Score  : 142 points
       Date   : 2022-05-09 13:58 UTC (9 hours ago)
        
 (HTM) web link (rgl.epfl.ch)
 (TXT) w3m dump (rgl.epfl.ch)
        
       | [deleted]
        
       | sydthrowaway wrote:
       | Is this a gamechanger?
        
         | SemanticStrengh wrote:
         | big if true
        
           | coolspot wrote:
           | Or as an old oracle put it once:
           | 
           | IF true THEN 'big' END IF
        
       | thechao wrote:
        | This reminds me of the phase problem from protein
        | crystallography (PX). I'm 15+ years away from that space, so I
        | have no idea what the modern solution space looks like, but it
        | seems like there's a parallel between reverse-AD-over-SDFs (in
        | this work) and the MLE methods for inverse space in PX. It
        | _feels like_ we should be
       | able to take an initial estimate of some SDF over the direct
       | space (2d object) as an ellipsoid, flip to inverse space, do an
       | MLE for phase improvement, and then just do the regular ping-pong
       | on that. The MLE (and successor?) methods are _really_ robust.
        
         | tfgg wrote:
         | The best thing to do these days is use an AlphaFold prediction,
         | if there exists a confident one, as a starting position for
         | refinement.
         | 
         | [1] https://pubmed.ncbi.nlm.nih.gov/35362474/
        
       | IshKebab wrote:
        | Very cool. Presumably the advantage of this is that it can
        | handle big flat areas like the back of the chair, which
        | traditional methods never work well with.
       | 
       | But am I understanding correctly that it needs known lighting
       | conditions? Presumably that's why they don't demo it on real
       | images...
        
       | SemanticStrengh wrote:
        | I wonder what pcwalton and raphlinus think about this.
        | Pathfinder was an SDF SVG renderer after all.
        
         | raphlinus wrote:
         | I am fairly excited about these techniques. I consider SDFs to
         | be a very powerful representation of 2D scenes, and of course
         | Inigo Quilez has been demonstrating the incredible power of
         | SDFs in 3D. Of course, the big challenge is, what tools do you
          | use to _author_ content in SDF form? In Inigo's case, it's
         | essentially a text editor, coming up with the math yourself,
         | but that doesn't scale unless you're a genius like him.
         | 
         | So using machine learning and optimization techniques in
         | general to solve inverse problems, so your input is something
         | meaningful, is a way to unlock SDF rendering without requiring
         | a reinvention of the whole tool universe.
         | 
         | If someone wants a modest-scope project, it would be applying
         | these kinds of techniques to blurred rounded rectangle
         | rendering[1]. There, I got the math pretty close, fiddling with
         | it by hand, but I strongly suspect it would be possible to get
         | even closer, using real optimization techniques.
         | 
         | [1] https://raphlinus.github.io/graphics/2020/04/21/blurred-
         | roun...
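          | 
          | For concreteness, a toy version of that fit could look like
          | the sketch below (the logistic falloff, the error metric and
          | Nelder-Mead are illustrative choices, not what the post
          | actually derives):
          | 
          |     import numpy as np
          |     from scipy.ndimage import gaussian_filter
          |     from scipy.optimize import minimize
          | 
          |     def rounded_rect_sdf(x, y, hw, hh, r):
          |         # Classic 2D rounded-box SDF (negative inside).
          |         qx = np.abs(x) - (hw - r)
          |         qy = np.abs(y) - (hh - r)
          |         out = np.hypot(np.maximum(qx, 0), np.maximum(qy, 0))
          |         return out + np.minimum(np.maximum(qx, qy), 0) - r
          | 
          |     # Reference: rasterize the hard shape, then blur it.
          |     xs = np.linspace(-64, 64, 256)
          |     X, Y = np.meshgrid(xs, xs)
          |     sdf = rounded_rect_sdf(X, Y, 40.0, 24.0, 8.0)
          |     ref = gaussian_filter((sdf < 0).astype(float), sigma=6.0)
          | 
          |     # Approximation: a logistic falloff applied to the SDF,
          |     # with a free scale and offset to be optimized.
          |     def approx(p):
          |         return 1.0 / (1.0 + np.exp((sdf + p[1]) * p[0]))
          | 
          |     def loss(p):
          |         return np.mean((approx(p) - ref) ** 2)
          | 
          |     fit = minimize(loss, x0=[0.5, 0.0], method="Nelder-Mead")
          |     print("scale/offset:", fit.x, "mse:", fit.fun)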
        
           | gfodor wrote:
           | I think Dreams on PS4 kind of answers this question clearly,
           | at least one answer to it.
           | 
           | I do think SDFs are a good path for a great potential
           | breakthrough for AI directed 3d modelling tools like DALL-E
           | -- we're pretty close.
        
             | raphlinus wrote:
             | Yes, Dreams is also strong evidence that SDF-based
             | rendering is viable. That project also put an enormous
             | amount of effort into creating new design tools :)
        
       | arduinomancer wrote:
        | I've noticed a lot of interest in differentiable
        | rendering/inverse rendering recently.
       | 
       | Does anyone know what the end goal of this kind of research is or
       | why there is so much interest?
       | 
        | It's definitely cool, but is the idea just to make photogrammetry
       | cheaper/easier?
       | 
        | Or are there other use cases I'm missing?
        
         | berkut wrote:
          | There are other things it's been used for, like optimising
         | roughening/regularisation parameters for light transport in
         | order to provide better caustics with forwards path tracing:
         | 
         | https://diglib.eg.org/handle/10.1111/cgf14347
        
         | cperciva wrote:
         | One major use case for "convert a bunch of photos into a model"
         | is 3D printing. Often you have an object -- maybe a part which
         | broke and needs to be replaced, maybe a tooth surface which
         | you're making a crown for -- and you need to turn the object
         | into a model and then create a new object.
        
         | soylentgraham wrote:
          | I see it as a (potential) storage format. As we move away
          | from triangles and towards ray/path/etc. tracing (in whatever
          | form), we still want content (be it captured or crafted) in a
          | representation we can render with things like "infinite"
          | detail, bounces, and other metadata (reflectance etc.).
         | 
          | All current forms of triangulation/meshing from fractals, SDFs,
         | point clouds, etc, are relatively terrible and constrained, to
         | fit existing pipelines (hence why most AR is just lackluster
         | model viewers)
         | 
          | Making it differentiable is a step towards cramming it into a
          | small model and outputting other forms, or tracing directly,
          | or just compressing before, say, extracting to some
          | tree-renderable format on the GPU.
        
         | gfodor wrote:
          | There's an _incredibly_ interesting track I've been following
         | trying to find a good latent embedding for AI to generate 3d
         | objects, and SDFs seem like a very good representational format
         | for this. This would, among other things, allow for a DALL-E 2
         | like generation framework for 3d. Google DeepSDF.
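          | 
          | For a flavour of what a DeepSDF-style model looks like, here
          | is a rough PyTorch sketch (layer sizes, the clamped-L1 loss
          | and the placeholder data are my assumptions, only loosely in
          | the spirit of the paper): a shared decoder maps (latent code,
          | xyz) to a signed distance, and per-shape codes are optimized
          | jointly with the network weights.
          | 
          |     import torch
          |     import torch.nn as nn
          | 
          |     class SDFDecoder(nn.Module):
          |         # Maps (latent code, xyz point) -> signed distance.
          |         def __init__(self, latent_dim=64, hidden=256):
          |             super().__init__()
          |             self.net = nn.Sequential(
          |                 nn.Linear(latent_dim + 3, hidden), nn.ReLU(),
          |                 nn.Linear(hidden, hidden), nn.ReLU(),
          |                 nn.Linear(hidden, 1))
          | 
          |         def forward(self, z, xyz):
          |             return self.net(torch.cat([z, xyz], dim=-1))
          | 
          |     decoder = SDFDecoder()
          |     codes = nn.Embedding(10, 64)  # one latent per shape
          |     opt = torch.optim.Adam(
          |         list(decoder.parameters()) + list(codes.parameters()),
          |         lr=1e-4)
          | 
          |     # One toy step; real training samples (point, distance)
          |     # pairs per shape, replaced here by random placeholders.
          |     ids = torch.randint(0, 10, (512,))
          |     pts = torch.rand(512, 3) * 2 - 1
          |     gt = torch.rand(512, 1) * 0.2 - 0.1
          |     pred = decoder(codes(ids), pts)
          |     loss = (pred.clamp(-0.1, 0.1)
          |             - gt.clamp(-0.1, 0.1)).abs().mean()
          |     opt.zero_grad(); loss.backward(); opt.step()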
        
       | fxtentacle wrote:
       | This is basically Nerf but without support for transparent /
       | translucent / subsurface / glossy / refractive objects.
        
         | natly wrote:
         | I mean this is way cooler than nerfs in my opinion. This is
         | just ray tracing (a fairly tractable rendering solution on
          | today's GPUs - whereas nerfs are usually huge on the size and
         | compute fronts) except here the whole pipeline from object to
         | image is differentiable which means you could use it to feed an
         | image in and generate a plausible underlying mesh for it.
         | (Which you can then apply regular physics simulation to or
         | whatever which would have been a pain/impossible with nerfs.)
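          | 
          | As a toy illustration of what "differentiable from object to
          | image" buys you (this is _not_ the paper's method, which also
          | handles the discontinuities at silhouettes; the camera, the
          | depth-only "shading" and the step clamp are arbitrary
          | choices): sphere-trace a parametric sphere in PyTorch and let
          | a simple image loss drive its center and radius.
          | 
          |     import torch
          | 
          |     def sphere_sdf(p, center, radius):
          |         return torch.linalg.norm(p - center, dim=-1) - radius
          | 
          |     def render_depth(center, radius, res=64, steps=48):
          |         # Orthographic rays marching along +z from z = -3.
          |         xs = torch.linspace(-1, 1, res)
          |         u, v = torch.meshgrid(xs, xs, indexing="xy")
          |         o = torch.stack([u, v, torch.full_like(u, -3.0)], -1)
          |         d = torch.tensor([0.0, 0.0, 1.0])
          |         t = torch.zeros(res, res)
          |         for _ in range(steps):
          |             p = o + t[..., None] * d
          |             # Step by the SDF, clamped so misses stay bounded.
          |             step = sphere_sdf(p, center, radius).clamp(max=0.1)
          |             t = t + step
          |         return t  # differentiable w.r.t. center and radius
          | 
          |     center = torch.tensor([0.2, 0.1, 0.0], requires_grad=True)
          |     radius = torch.tensor(0.4, requires_grad=True)
          |     target = render_depth(torch.zeros(3),
          |                           torch.tensor(0.5)).detach()
          | 
          |     opt = torch.optim.Adam([center, radius], lr=1e-2)
          |     for _ in range(100):
          |         opt.zero_grad()
          |         diff = render_depth(center, radius) - target
          |         loss = diff.pow(2).mean()
          |         loss.backward()
          |         opt.step()
          |     print(center.detach(), radius.detach())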
        
         | gfodor wrote:
         | SDFs do not dictate how you approximate the rendering equation
         | beyond the fact that they are a useful way to describe
         | surfaces. (So, for example, they are mismatched with volumetric
         | rendering.) The characteristics you cite are generally
         | satisfied through the BSSRDF or other material models in
         | conjunction with surface intersection tests, not by raymarching
         | through scattering material. NeRFs solve the entire problem
         | differently by creating a radiance sampling oracle in the form
         | of a neural network. So in other words, yes, you can certainly
         | use SDFs to render those physical effects you mention. Claiming
         | these are comparable to NeRF is pretty misleading, anyway,
         | because NeRF is modeling a scene in a fundamentally different
         | way than surface + material representations, with (other)
         | distinct tradeoffs between the two methods.
        
       | natly wrote:
        | It's funny how this totally could have been implemented back in
        | the late 90s, except I guess no one did, and yet we now see how
        | it's able to solve lots of 'inverse problems' that 3D
        | reconstruction algorithms have tried to solve for decades.
        
       | WithinReason wrote:
       | Why use SDFs instead of NeRFs? Those were designed to be
       | differentiable. Then you could turn the NeRF to an SDF later.
       | Related: https://nvlabs.github.io/instant-ngp/
        
         | hansworst wrote:
         | Nerfs don't contain a surface representation, but instead
          | contain an occupancy field to represent the shape. This has
         | implications for rendering speed (you generally need more
         | samples along each ray to get an accurate representation). It
         | also has implications for uses outside of rendering, e.g. in
         | physics simulations where you need to know exactly where the
         | surface is. Lastly, you can use SDFs to get arbitrarily
          | detailed meshes by sampling the field at some resolution,
          | which is trickier with NeRFs because the surface is fuzzier.
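          | 
          | For concreteness, a minimal version of that last step might
          | look like this (scikit-image's marching cubes on a toy sphere
          | SDF; the shape and grid resolution are placeholders):
          | 
          |     import numpy as np
          |     from skimage.measure import marching_cubes
          | 
          |     xs = np.linspace(-1.0, 1.0, 96)
          |     X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")
          |     sdf = np.sqrt(X**2 + Y**2 + Z**2) - 0.6  # sphere SDF
          | 
          |     # The surface is the zero level set; marching cubes
          |     # extracts a triangle mesh there. A finer grid (or
          |     # adaptive sampling) gives a more detailed mesh.
          |     verts, faces, normals, values = marching_cubes(
          |         sdf, level=0.0, spacing=(xs[1] - xs[0],) * 3)
          |     print(verts.shape, faces.shape)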
        
           | WithinReason wrote:
           | In the linked video presentation [1] they automatically turn
           | the NeRF into a mesh, so you could do the same and get faster
           | rendering speed than SDFs. I wonder if simply enforcing less
           | transparency towards the end of the training in a trained
           | NeRF would help with converging to sharper object boundaries
           | and would get the benefits of fast training without the
           | downsides. The linked paper doesn't even discuss training
            | times, which suggests they're really bad.
           | 
            | NeRFs are also direction-dependent, so they work with
            | specular surfaces too.
           | 
           | [1]: https://nvlabs.github.io/instant-
           | ngp/assets/mueller2022insta...
           | 
           | (I edited my post so maybe you replied to an earlier version)
        
         | yolo69420 wrote:
         | It's useful to have differentiable versions of all sorts of
         | rendering techniques. They can be used as input to training for
         | whatever you want to do.
        
         | dahart wrote:
         | Why use NeRFs? An important question is: what is the
         | difference? The instant NGP learns an SDF, so it isn't an
         | either/or question, right? A couple of people have mentioned to
         | me that the Berkeley paper doesn't require a neural network,
         | that you can use an optimizer on the density field instead. I
         | don't fully grasp the exact difference between those things in
         | detail, having not implemented it myself, but using a direct
         | optimizer without a neural network seems like an important
         | conceptual distinction to make and worthy of research, doesn't
         | it? We _should_ distill the process to its essence, break down
         | the pieces, remove complexity and dependencies until we
         | understand the utility and effectiveness of all the parts,
         | shouldn't we?
         | 
         | Possibly relevant for this paper is also the fact that Nerfs
         | are less than 2 years old, while SDFs in rendering are almost
         | 30 years old, and maybe in practice more than that.
        
           | WithinReason wrote:
           | > Why use NeRFs?
           | 
           | Can handle view dependence, can handle transparency, probably
           | faster to train (this paper doesn't mention training speed
           | while the Nvidia one makes a big point about performance),
           | probably higher resolution (comparing final outputs)
        
             | dahart wrote:
             | It's important to address the question of what, exactly, is
             | the difference between whatever two things you're comparing
             | here.
             | 
             | There's little reason to believe that optimizing an SDF and
             | training a NeRF are any different in terms of optimization
             | speed or resolution, those two processes are really more
             | like different words used to describe the same thing.
              | Training a neural network _is_ an optimization. And an
              | NGP isn't just a neural network - it also has an explicit
              | field representation.
             | 
             | At this point, neural fields aren't well defined. The
             | Berkeley NeRFs and Nvidia NGPs are two different things in
              | terms of what the NN learns to infer - one is density, the
              | other is an SDF. And yes, these two NN papers are learning
             | material properties in addition to the volumetric
             | representation, while the paper here is learning purely an
             | SDF. That's simply asking a different question, it's not a
             | matter of better or worse. The advantages depend on your
             | goals. If all you want is the geometry, then material
             | properties aren't an advantage, and could add unnecessary
             | complication, right?
        
               | WithinReason wrote:
               | >There's little reason to believe that optimizing an SDF
               | and training a NeRF are any different in terms of
               | optimization speed or resolution
               | 
               | A trainable SDF representation could very well be slower
               | to train than a trainable NeRF representation
               | 
               | > If all you want is the geometry, then material
               | properties aren't an advantage, and could add unnecessary
               | complication, right?
               | 
               | Unless the SDF's inability to model view dependence would
               | interfere with its ability to minimise its loss
        
               | dahart wrote:
               | > A trainable SDF representation could very well be
               | slower to train than a trainable NeRF representation
               | 
               | Sure. It could very well be faster too (or the same-ish,
               | if it turns out they're more fundamentally the same than
               | different). Carrying view dependent data around is more
               | bandwidth, potentially significantly more, depending on
               | how you model it. How you model it is under development,
               | and a critical part of the question here.
               | 
               | This all depends on a whole bunch of details that are not
               | settled and can have many implementations. There is
               | _significant_ overlap between the ideas in the Nvidia
               | paper and the EPFL paper, and it's worth being a bit more
               | careful about defining _exactly_ what it is we're talking
               | about. It's easy to say one might be faster. It's harder
               | to identify the core concepts and talk about what
               | properties are intrinsic to these ideas over a wide
               | variety of implementations, and how they actually differ
               | at their core.
        
               | jimmySixDOF wrote:
               | >The advantages depend on your goals.
               | 
                | Too true. For example, if NeRF has any advantage for
                | this application I will be pleasantly surprised [1] [Text
               | rendering using multi channel signed distance fields]
               | 
               | [1] https://news.ycombinator.com/item?id=20020664
        
         | gfodor wrote:
         | A couple of reasons:
         | 
          | - SDFs are much more amenable to direct manipulation by
          | humans (see the sketch after this list)
         | 
         | - SDFs potentially can be decomposed into more human-
         | understandable components
         | 
         | - SDFs may provide a better mechanism for constructing a
         | learned latent space for objects (see eg DeepSDF)
         | 
          | - Some rendering engine targets may play more nicely with SDFs
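          | 
          | The sketch mentioned above, on direct manipulation: CSG-style
          | editing is just min/max (or smooth blends) over distance
          | fields. A toy NumPy version (the primitives and the blend
          | constant k are arbitrary):
          | 
          |     import numpy as np
          | 
          |     def sphere(p, c, r):
          |         return np.linalg.norm(p - c, axis=-1) - r
          | 
          |     def box(p, b):
          |         q = np.abs(p) - b
          |         return (np.linalg.norm(np.maximum(q, 0), axis=-1)
          |                 + np.minimum(q.max(axis=-1), 0))
          | 
          |     def smooth_union(d1, d2, k=0.1):
          |         h = np.clip(0.5 + 0.5 * (d2 - d1) / k, 0.0, 1.0)
          |         return d2 * (1 - h) + d1 * h - k * h * (1 - h)
          | 
          |     # Edit the scene by composing primitives; no remeshing.
          |     p = np.random.uniform(-1, 1, size=(1000, 3))
          |     d = smooth_union(box(p, np.array([0.4, 0.2, 0.3])),
          |                      sphere(p, np.array([0.3, 0.0, 0.0]),
          |                             0.25))
          |     print("sample points inside:", int((d < 0).sum()))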
        
       | im3w1l wrote:
        | What is the problem they are solving? My best guess is: a 3D
        | model is rendered and overlaid on a number of background
        | images. From these composites, reconstruct the original model.
        | Is that
       | it?
        
         | WithinReason wrote:
         | Ultimately, to reconstruct any object from photos. Here they
         | only test on a synthetic scene.
        
           | [deleted]
        
           | im3w1l wrote:
           | How does it know what is fore- and background?
        
       ___________________________________________________________________
       (page generated 2022-05-09 23:00 UTC)