[HN Gopher] SMERF: Streamable Memory Efficient Radiance Fields
___________________________________________________________________
SMERF: Streamable Memory Efficient Radiance Fields

We built SMERF, a new way to explore NeRFs in real time in your web
browser. Try it out yourself!

Over the last few months, my collaborators and I have put together a
new, real-time method that makes NeRF models accessible from
smartphones, laptops, and low-power desktops, and we think we've done
a pretty stellar job! SMERF, as we like to call it, distills a large,
high-quality NeRF into a real-time, streaming-ready representation
that's easily deployed to devices as small as a smartphone via the
web browser. On top of that, our models look great! Compared to other
real-time methods, SMERF achieves higher accuracy than ever before.
On large multi-room scenes, SMERF renders are nearly indistinguishable
from state-of-the-art offline models like Zip-NeRF and a solid leap
ahead of other approaches.

The best part: you can try it out yourself! Check out our project
website for demos and more. If you have any questions or feedback,
don't hesitate to reach out by email (smerf@google.com) or Twitter
(@duck).

Author : duckworthd
Score  : 258 points
Date   : 2023-12-13 19:03 UTC (3 hours ago)

(HTM) web link (smerf-3d.github.io)
(TXT) w3m dump (smerf-3d.github.io)

| sim7c00 wrote:
| this looks really amazing. i have a relatively old smartphone
| (2019) and it's really surprisingly smooth and high fidelity.
| amazing job!
| duckworthd wrote:
| Thank you :). I'm glad to hear it! Which model are you using?
| sim7c00 wrote:
| samsung galaxy 10se
| guywithabowtie wrote:
| Any plans to release the models?
| duckworthd wrote:
| The pretrained models are already available online! Check out
| the "demo" section of the website. Your browser is fetching the
| model when you run the demo.
| ilaksh wrote:
| Will the code be released, or an API endpoint? Otherwise it
| will be impossible for us to use it for anything.. since it's
| Google I assume it will just end up in a black hole like most
| of the research.. or five years later some AI researchers
| leave and finally create a startup.
| zeusk wrote:
| Are radiance fields related to Gaussian splatting?
| duckworthd wrote:
| Gaussian Splatting is heavily inspired by work in radiance
| fields (or NeRF) models. They use much of the same technology!
| corysama wrote:
| Similar inputs, similar outputs, different representation.
| aappleby wrote:
| Very impressive demo.
| duckworthd wrote:
| Thank you!
| refulgentis wrote:
| This is __really__ stunning work, huge, huge deal that I'm
| seeing this in a web browser on my phone. Congratulations!
|
| When I look at the NYC scene in the highest quality on desktop,
| I'm surprised by how low-quality e.g. the stuff on the counter
| and shelves is. So then I load the lego model, and see that's
| _very_ detailed, so it doesn't seem inherent to the method.
|
| Is it a consequence of input photo quality, or something else?
| duckworthd wrote:
| > This is __really__ stunning work
|
| Thank you :)
|
| > Is it a consequence of input photo quality, or something
| else?
|
| It's more a consequence of spatial resolution: the bigger the
| space, the more voxels you need to maintain a fixed resolution
| (e.g. 1 mm^3). At some point, we have to give up spatial
| resolution to represent larger scenes.
|
| A second limitation is the teacher model we're distilling.
| Zip-NeRF (https://jonbarron.info/zipnerf/) is good, but it's not
| _perfect_. SMERF reconstruction quality is upper-bounded by its
| Zip-NeRF teacher.
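To put the spatial-resolution point in concrete numbers, here is a
back-of-the-envelope sketch (illustrative arithmetic only, not figures
from the paper): at a fixed voxel size, the voxel count grows linearly
with scene volume, which is why a large multi-room capture must either
give up fine detail or be split across several smaller submodels, as
SMERF does.

    // Illustrative only -- not the actual SMERF grid configuration.
    const voxelsPerCubicMetre = 1_000 ** 3;   // 1 mm voxels -> 1e9 per m^3
    const singleObjectVolume = 1;             // ~1 m^3 tabletop scene
    const multiRoomVolume = 10 * 10 * 3;      // ~300 m^3 apartment
    console.log((singleObjectVolume * voxelsPerCubicMetre).toExponential(1)); // 1.0e+9
    console.log((multiRoomVolume * voxelsPerCubicMetre).toExponential(1));    // 3.0e+11

At the same 1 mm resolution, the apartment needs roughly 300x more
voxels than the tabletop scene, so either voxel size or memory has to
give.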
| jacoblambda wrote:
| Is there a relatively easy way to apply these kinds of techniques
| (either NeRFs or gaussian splats) to larger environments, even if
| it's at lower precision? Like, say, small towns / a few blocks'
| worth of environment.
| duckworthd wrote:
| In principle, there's no reason you can't fit multiple city
| blocks at the same time with Instant NGP on a regular desktop.
| The challenge is in estimating the camera and lens parameters
| over such a large space. I expect such a reconstruction to be
| quite fuzzy given the low spatial resolution.
| ibrarmalik wrote:
| You're under the right paper for doing this. Instead of one big
| model, they have several smaller ones for regions in the scene.
| This way rendering is fast for large scenes.
|
| This is similar to Block-NeRF [0]; on their project page they
| show some videos of what you're asking about.
|
| As for an easy way of doing this, nothing out-of-the-box. You
| can keep an eye on nerfstudio [1], and if you feel brave you
| could implement this paper and make a PR!
|
| [0] https://waymo.com/intl/es/research/block-nerf/
|
| [1] https://github.com/nerfstudio-project/nerfstudio
| barrkel wrote:
| The mirror on the wall of the bathroom in the Berlin location
| looks through to the kitchen in the next room. I guess the depth
| gauging algorithm uses parallax, and mirrors confuse it, seeming
| like windows. The kitchen has a blob of blurriness as the rear of
| the mirror intrudes into the kitchen, but you can see through the
| blurriness to either room.
|
| The effect is a bit spooky. I felt like a ghost going through
| walls.
| nightpool wrote:
| The refrigerator in the NYC scene has a very slick specular
| lighting effect based on the angle you're viewing it from, and
| if you go "into" the fridge you can see it's actually
| generating a whole 3D scene with blurry grey and white colors
| that turn out to precisely mimic the effects of the light from
| the windows bouncing off the metal, and you can look "out" from
| the fridge into the rest of the room. Same with the full-length
| mirror in the bedroom in the same scene--there's a whole
| virtual "mirror room" that's been built out behind the mirror
| to give the illusion of depth as you look through it. Very cool
| and unique consequence of the technology.
| pavlov wrote:
| Wow, thanks for the tip. Fridge reflection world is so cool.
| Feels like something David Lynch might dream up.
|
| A girl is eating her morning cereal. Suddenly she looks
| apprehensively at the fridge. Camera dollies towards the
| appliance and seamlessly penetrates the reflective surface,
| revealing a deep hidden space that exactly matches the
| reflection. At the dark end of the tunnel, something stirs...
| A wildly grinning man takes a step forward and screams.
| daemonologist wrote:
| Neat! Here are some screenshots of the same phenomenon with
| the TV in Berlin: https://imgur.com/a/3zAA5K8
| TaylorAlexander wrote:
| Oh wow yeah. It's interesting because when I look at the
| fridge my eye maps that to "this is a reflective surface",
| which makes sense because that's true in the source images,
| but then it's actually rendered as a cavity with appropriate
| features rendered in 3D space. What's a strange feeling is
| entering the fridge and then turning around! I just watched
| Hbomberguy's Patreon-only video on the video game Myst, and
| in Myst the characters are trapped in books.
If you choose
| the wrong path at the end of the game you get trapped in a
| book, and the view you get while trapped in a book looks very
| similar to the view from inside the NYC fridge!
| deltaburnt wrote:
| Mirror worlds are a pretty common effect you'll see in NeRFs.
| Otherwise you would need a significantly more complex view-
| dependent feature rendered onto a flat surface.
| chpatrick wrote:
| This happens with any 3D reconstruction. It's because any
| mirror is indistinguishable from a window into a mirrored
| room. The tricky thing is if there's actually something
| behind the mirror as well.
| Zetobal wrote:
| It has exactly the same drawbacks as photogrammetry with
| regard to highly reflective surfaces.
| rzzzt wrote:
| You can also get inside the bookcase for the ultimate Matthew
| McConaughey experience.
| promiseofbeans wrote:
| It runs impressively well on my 2yo S21 FE. It was super
| impressive how it streamed in more images as I explored the
| space. The TV reflections in the Berlin demo were super
| impressive.
|
| My one note is that it took a really long time to load all the
| images - the scene wouldn't render until all ~40 initial images
| loaded. Would it be possible to start partially rendering as the
| images arrive, or do you need to wait for all of them before you
| can do the first big render?
| duckworthd wrote:
| Pardon our dust: "images" is a bad name for what's being
| loaded. Past versions of this approach (MERF) stored feature
| vectors in PNG images. We replace them with binary arrays.
| Unfortunately, all such arrays need to be loaded before the
| first frame can be rendered.
|
| You do however point out one weakness of SMERF: large payload
| sizes. If we can figure out how to compress them by 10x, it'll
| be a very different experience!
| VikingCoder wrote:
| Wow. Some questions:
|
| Take for instance the fulllivingroom demo. (I prefer fps mode.)
|
| 1) How many images are input?
|
| 2) How long does it take to compute these models?
|
| 3) How long does it take to prepare these models for this
| browser, with all levels, etc.?
|
| 4) Have you tried this in VR yet?
| vyrotek wrote:
| Not exactly what you asked for, but I recently came across this
| VR example using Gaussian Splatting instead. Exciting times.
|
| https://twitter.com/gracia_vr/status/1731731549886787634
|
| https://www.gracia.ai
| duckworthd wrote:
| Glad you liked our work!
|
| 1) Around 100-150 if memory serves. This scene is part of the
| mip-NeRF 360 benchmark, which you can download from the
| corresponding project website:
| https://jonbarron.info/mipnerf360/
|
| 2) Between 12 and 48 hours, depending on the scene. We train on
| 8x V100s or 16x A100s.
|
| 3) The time for preparing assets is included in 2). I don't
| have a breakdown for you, but it's something like 50/50.
|
| 4) Nope! A keen hacker might be able to do this themselves by
| editing the JavaScript code. Open your browser's DevTools and
| have a look -- the code is all there!
| dougmwne wrote:
| Do you need position data to go along with the photos, or just
| the photos?
|
| For VR, there's going to be some very weird depth data from
| those reflections, but maybe it would not be so bad when
| you are in the headset.
| durag wrote:
| Any plans to do this in VR? I would love to try this.
| duckworthd wrote:
| Not at the moment, but an intrepid hacker could surely extend
| our JavaScript code and put something together.
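For anyone tempted to take up that suggestion, the sketch below shows
roughly what a WebXR entry point looks like for a Three.js-based
viewer. This is a generic example under assumptions, not the SMERF
viewer's actual API: the renderer, scene and camera objects are
assumed to come from the existing viewer code, and the demo's custom
ray-marching shader would likely need extra per-eye camera handling on
top of this.

    // Generic WebXR setup for a Three.js app (assumed names; not the
    // published SMERF viewer API).
    import * as THREE from 'three';
    import { VRButton } from 'three/addons/webxr/VRButton.js';

    function enableVR(renderer: THREE.WebGLRenderer,
                      scene: THREE.Scene,
                      camera: THREE.PerspectiveCamera): void {
      // Hand frame scheduling over to the XR session.
      renderer.xr.enabled = true;
      document.body.appendChild(VRButton.createButton(renderer));

      // WebXR drives its own frame loop, so any requestAnimationFrame
      // loop must be replaced with setAnimationLoop.
      renderer.setAnimationLoop(() => {
        renderer.render(scene, camera);  // one view per eye while presenting
      });
    }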
| blovescoffee wrote:
| Since you're here @author :) Do you mind giving a quick rundown
| on how this compares with the quality of Zip-NeRF?
| duckworthd wrote:
| Check out our explainer video for answers to this question and
| more! https://www.youtube.com/watch?v=zhO8iUBpnCc
| heliophobicdude wrote:
| Great work!!
|
| Question for the authors: are there opportunities, where they
| exist, to not use optimization or tuning methods for
| reconstructing a model of a scene?
|
| We are refining efficient ways of rendering a view of a scene
| from these models, but the scenes remain static. The scenes also
| take a while to reconstruct.
|
| Can we still achieve the great look and details of RF and GS
| without paying for an expensive reconstruction per instance of
| the scene?
|
| Are there ways of greedily reconstructing a scene with
| traditional CG methods into these new representations now that
| they are fast to render?
|
| Please forgive any misconceptions that I may have in advance! We
| really appreciate the work y'all are advancing!
| duckworthd wrote:
| > Are there opportunities, where they exist, to not use
| optimization or tuning methods for reconstructing a model of a
| scene?
|
| If you know a way, let me know! Every system I'm aware of
| involves optimization in one way or another, from COLMAP to 3D
| Gaussian Splatting to Instant NGP and more. Optimization is a
| powerful workhorse that gives us a far wider range of models
| than a direct solver ever could.
|
| > Can we still achieve the great look and details of RF and GS
| without paying for an expensive reconstruction per instance of
| the scene?
|
| In the future, I hope so. We don't have a convincing way to
| generate 3D scenes yet, but given the progress in 2D, I think
| it's only a matter of time.
|
| > Are there ways of greedily reconstructing a scene with
| traditional CG methods into these new representations now that
| they are fast to render?
|
| Not that I'm aware of! If there were, I think those works
| should be on the front page instead of SMERF.
| annoyingnoob wrote:
| There is a market here for Realtors to upload pictures and
| produce walk-throughs of homes for sale.
| esafak wrote:
| https://matterport.com/
| ibrarmalik wrote:
| The Luma folks made something similar:
| https://apps.apple.com/app/luma-flythroughs/id6450376609?l=e...
| SubiculumCode wrote:
| I'm not sure why this demo runs so horribly in Firefox but not
| other browsers... anyone else having this?
| daemonologist wrote:
| Runs pretty well (20-100 fps depending on the scene) for me on
| both Firefox 120.1.1 on Android 14 (Pixel 7; smartphone preset)
| and Firefox 120.0.1 on Fedora 39 (R7 5800, 64 GB memory, RX
| 6600 XT; 1440p; desktop preset).
| SubiculumCode wrote:
| It seems that for some reason, my Firefox is stuck on the
| software compositor. I am getting:
|
|     WebRender initialization failed: Blocklisted; failure code
|     RcANGLE(no compositor device for EGLDisplay)(Create)_FIRST
|     D3D11_COMPOSITING runtime failed: Failed to acquire a D3D11
|     device; Blocklisted; failure code FEATURE_FAILURE_D3D11_DEVICE2
|
| I'm running a 3060
| jerpint wrote:
| Just ran this on my phone through a browser, this is very
| impressive
| duckworthd wrote:
| Thank you :)
| catskul2 wrote:
| When might we see this in consumer VR? I'm surprised we don't
| already, but I suspected it was a computation constraint.
|
| Does this relieve the computation constraint enough to run on
| Quest 2/3?
|
| Is there something else that would prevent binocular use?
| doctoboggan wrote:
| I recently got a new Quest and I am wondering the same thing.
| The fact that this is currently running in a browser (and can
| run on a mobile device) gives me hope that we will see
| something like this in VR sooner rather than later.
| duckworthd wrote:
| I can't predict the future, but I imagine soon: all of the
| tools are there. The reason we didn't develop for VR is
| actually simpler than you'd think: we just don't have the
| developer time! At the end of the day, only a handful of people
| actively wrote code for this project.
| nox100 wrote:
| Memory efficient? It downloaded 500 MB!
| bongodongobob wrote:
| A. Storage isn't memory
|
| B. That's hardly anything in 2023.
| duckworthd wrote:
| Right-o. The web viewer is swapping assets in and out of
| memory as the user explores the scene. The network and disk
| requirements are high, but memory usage is low.
| monlockandkey wrote:
| Get this on a VR headset and you have a game changer, literally.
| modeless wrote:
| How long until you can stitch Street View into a seamless
| streaming NeRF of every street in the world? I hope that's the
| goal you're working towards!
| duckworthd wrote:
| ;)
| modeless wrote:
| Haha, too bad the Earth VR team was disbanded, because that
| would be the Holy Grail. If someone can get the budget to
| work on that I'd be tempted to come back to Google just to
| help get it done! It's what I always wanted when I was
| building the first Earth VR demo...
| deelowe wrote:
| I read another article talking about what Waymo was working on,
| and this looks oddly similar... My understanding is that the
| goal is to use this to reconstruct 3D models of Street View
| images in real time.
| yarg wrote:
| What I'm seeing from all of these things is very accurate,
| single navigable 3D images.
|
| What I haven't seen anything of is feature and object detection,
| blocking and extraction.
|
| Hopefully a more efficient and streamable codec necessitates the
| sort of structure that lends itself more easily to analysis.
| fngjdflmdflg wrote:
| > Google DeepMind, Google Research, Google Inc.
|
| What a variety of groups! How did this come about?
| tomatotomato31 wrote:
| I'm following this through Two Minute Papers and I'm looking
| forward to using it.
|
| My grandpa died 2 years ago, and in hindsight I took pictures
| so I could use them as in your demo.
|
| Awesome, thanks :)
| duckworthd wrote:
| It would be my dream to make capturing 3D memories as easy and
| natural as taking 2D photos with your smartphone today.
| Someday!
| twelfthnight wrote:
| Hope this doesn't come across as snarky, but does Google pressure
| researchers to do PR in their papers? This really is cool, but
| there is a lot of self-promotion in this paper and very little
| discussion of limitations (and the discussion of them is
| bookended by qualifications about why they really aren't
| limitations).
|
| It makes it harder for me to trust the paper if I feel like the
| paper is trying to persuade me of something rather than describe
| the complete findings.
| tomatotomato31 wrote:
| People are not allowed to be proud of their work anymore?
| yieldcrv wrote:
| I had read about a competing technology that was suggesting
| NeRFs were a dead end
|
| but perhaps that was biased?
| rzzzt wrote:
| What kind of modes does the viewer cycle through when I press
| the space key?
___________________________________________________________________
(page generated 2023-12-13 23:00 UTC)