[HN Gopher] Notes on Vision Pro ___________________________________________________________________ Notes on Vision Pro Author : firloop Score : 57 points Date : 2023-06-06 21:55 UTC (1 hour ago) (HTM) web link (notes.andymatuschak.org) (TXT) w3m dump (notes.andymatuschak.org) | xivusr wrote: | Great notes, and I really liked his idea of large persistent info | spaces and sharing those with others. | gnicholas wrote: | > _Given how ambitious the hardware package is, the software | paradigm is surprisingly conservative._ | | I actually see this as a bit less surprising. After all, if you | change the hardware in a big way, and you change the software in | a big way, users will have a harder time adjusting to the new | platform. Instead, they're making a big leap on the hardware | side, keeping legacy apps and concepts, and then will presumably | iterate to more 'native' experiences that were previously | impossible/unimaginable/unintuitive. | gutnor wrote: | That was repeated a bunch of times during the keynote: | "familiar" | balaji1 wrote: | In the demo, I was hoping to see some app/screen/display | anchored to certain walls or fixed in a 3D spot. That way | I could walk to a place to see that "screen". | endisneigh wrote: | Isn't that an intractable problem? I suppose the naive | solution would be to fix the screens in spots as you suggest, | as opposed to pinning onto real-life locations. | fauntle wrote: | It should be relatively simple with a QR code or similar | physical marker in the real world. | lagrange77 wrote: | In the demo there is a scene where a guy displays an | iceberg photo on a virtual screen and then moves closer to | it, with the screen anchored to some 3D point in the real | world. | stephc_int13 wrote: | Text is still a large part of the interfaces we use. | | We are all highly trained to read text; it seems basic, but it | is in practice quite abstract. | | Text is still best read on a flat surface. 
| | The great innovation I can see with this new Apple device is | eye tracking. They have not invented it, but they might have | perfected it enough to be usable. | | Eyes could be better than a mouse. | roblh wrote: | Yeah, if their eye tracking plus foveated rendering works as | advertised, it could be a huge step forward. I'm really | curious how responsive the gesture controls will be too; it | was really cool seeing the finger pinches(?) being used as an | input method. I wonder if it's specifically designed just for | that one gesture or if it's all built out to track any arbitrary | hand gesture accurately. And I wonder what the language/API | for describing hand gestures would even look like. | Gigachad wrote: | Eye tracking is almost certainly more accurate and faster. To | the point where we make an entire game out of seeing who can | move the mouse to where their eyes are already looking | fastest, in the form of shooter games. | gnicholas wrote: | Interesting point -- if you no longer need to use a | joystick to aim your weapon, how will controllers evolve? | Will the second joystick be used for some other function, | or will it be replaced by a different type of input method? | | It would be funny if controllers evolved to be more like | the single-joystick models that we had decades ago, with | the joystick on the left and rows of buttons on the right. | History doesn't repeat itself, but perhaps it'll rhyme? | russellbeattie wrote: | > _But it does put an enormous amount of pressure on the eye | tracking. As far as I can tell so far, the role of precise 2D | control has been shifted to the eyes._ | | I've been researching eye tracking for my own project for the | past year. I have a Tobii eye tracker, which is probably the best | eye tracking device for consumers currently (or the only one, | really). It's much more accurate than trying to repurpose a | webcam. | | The problem with eye tracking in general is what's called the | "Midas touch" problem. 
Everything you look at is potentially a | target. If you were to simply connect your mouse pointer to your | gaze, for example, any sort of hover effect on a web page would | be activated simply by glancing at it. [1] | | Additionally, our eyes are constantly making small movements | called saccades. [2] If you track eye movement perfectly, the | target will wobble all over the screen like mad. The ways to | alleviate this are by expanding the target visually so that the | small movements are contained within a "bubble", or by delaying | the targeting slightly so the movements can be smoothed out. But | this naturally causes inaccuracy and latency. [3] Even then, you | can easily get a headache from the effort of trying to fixate | your eyes on a small target (trust me). Though Apple is making an | effort to predict eye movements to give the user the impression | of lower latency and improve accuracy, it's an imperfect | solution. Simply put, gazing as an interface will always suffer | from latency and unnatural physical effort. Until computers can | read our minds, that isn't going to change. | | Apple decided to incorporate desktop and mobile apps into the | device, so it seems this was really their only choice, as they | need the equivalent of a pointer or finger to activate on-screen | elements. They could do this with hand tracking, but then there's | the issue of accuracy, as well as clicking, tapping, dragging or | swiping - plus the effort of holding your arms up for extended | periods. I think it's odd that they decided that voice should not | be part of the UI. My preference would be hand tracking with a | virtual mouse/trackpad (smaller and more familiar movements) plus | a simple "tap" or "swipe" spoken aloud, with the current system | for "quiet" operation. But Apple is Apple, and they insist on one | way to do things. | | But who knows - I haven't tried it yet; maybe Apple's engineers | nailed it. I have my doubts. | | 1. 
https://uxdesign.cc/the-midas-touch-effect-the-most- | unknown-... | | 2. https://en.m.wikipedia.org/wiki/Saccade | | 3. https://help.tobii.com/hc/en-us/articles/210245345-How-to- | se... | ketzo wrote: | From reading/listening to reports of people who were able to | demo the device, I think Apple may have nailed it, or come | close. Everyone I've seen has absolutely raved about how | accurate and intuitive the eye tracking + hand gestures feel. | zmmmmm wrote: | Interesting notes. | | I'm disappointed, even though it's entirely predictable, that | visionOS is built on the iOS/iPadOS foundation rather than OS X. | I guess we'll see how "walled in" it is, but it's hard to see any | reason Apple isn't going to be just as constraining and | controlling about what happens on this OS as they are on iOS, if | not more so. Which ultimately means I'll be very reluctant to | ever adopt it in any meaningful way as my primary computing | device. | kfarr wrote: | Definitely pros and cons. I remember reading that the tight | integration with iOS is what allowed them to achieve best-in- | class latency for iPad + Pencil that couldn't be achieved on | any other platform. Having followed Oculus / Quest | development, I know that low latency is not optional in this | context and every millisecond counts, so I can see why they | would go this route. | | On the other hand, the closed ecosystem is definitely cause for | concern. Fingers crossed that WebXR support comes out from | behind a feature flag to allow for progressive (spatial) web | apps. | wahnfrieden wrote: | Be more specific about what you mean by building on "OS X". | AppKit is very old and crusty. Why would they build on AppKit? | You want that? | bodge5000 wrote: | Yeh, this was my main thought. However good the hardware is, it | doesn't matter if the software can't fully utilise it. 
The idea | of virtual displays (the most obvious immediate benefit of | Vision, imo) for macOS seems like a huge benefit, but for iPadOS | it's downgraded to "pretty cool". I've never felt any particular | need to add multiple displays to my iPad, and whilst a big | display would be nice, I wouldn't describe it as groundbreaking | (especially as others such as XReal are already doing this in a | much smaller form factor). | JohnBooty wrote: | But it does put an enormous amount of pressure on the | eye tracking. As far as I can tell so far, the role of | precise 2D control has been shifted to the eyes. | | I've got one good eye and one bad eye. The bad eye is legally | blind, has an off-center iris, and is kind of lazy w.r.t. | tracking. | | I'm _extremely_ curious to know how Vision Pro deals with this. | One certainly hopes there's some kind of "single eye" mode; it | certainly seems possible with relatively small effort, and the % | of the population who'd benefit seems fairly significant. | | Eye tracking most certainly sounds like the way to go, relative | to hand-waving. | | The Minority Report movie probably set the industry back by a | decade or two. Waving your hands around to control stuff seems | logical but is quickly exhausting. | miketery wrote: | Track-record-wise, Apple is one of the best in terms of serving | accessibility. So I'd bet greater than 50% odds that they're | thinking about lazy eye, one eye, or derivatives thereof. | gnicholas wrote: | It would have been much less exciting to see Tom Cruise sitting | on a couch, hands in his lap, gently flicking his fingers to | scroll through crime scene footage. IIRC he talked about how | tired his arms got during filming. | | EDIT: found it -- he didn't talk about it, but it was reported | that he had to frequently rest his arms: | https://medium.com/@LeapMotion/taking-motion-control-ergonom... | gnicholas wrote: | PSA: this page only scrolls if your mouse is on the left side. 
| acherion wrote: | Good thing my browser settings are to show scrollbars all the | time! | gnicholas wrote: | Interestingly 'meta' given that Andy talks about the | disappearance of interactivity cues that happened about a | decade back... | recursive wrote: | Firefox reader mode works well here. | gnicholas wrote: | Brave's reader mode wasn't shown as an option, though my | BeeLine Reader extension was able to parse the text when I | invoked its Clean Mode. I was a little surprised, since the | BeeLine extension didn't detect/color the text inline, like | it usually does. | bmacho wrote: | Windows works the same (for me at least), scrolling follows the | cursor (but input does not). ___________________________________________________________________ (page generated 2023-06-06 23:00 UTC)
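The two saccade mitigations russellbeattie describes above (smoothing/delaying the gaze signal, and expanding the selected target into a "bubble") can be sketched roughly as follows. This is an illustrative sketch only, not Apple's or Tobii's actual method; all function names and parameters here are invented for the example.

```python
# Sketch of two gaze-jitter mitigations from the thread above:
# (1) exponential smoothing of raw gaze samples (trades latency for
#     stability), and (2) an enlarged "bubble" radius around the
#     currently selected target so small saccades don't deselect it.
# All names and parameters are illustrative assumptions.

def smooth_gaze(samples, alpha=0.3):
    """Exponentially smooth a list of (x, y) gaze samples.
    Lower alpha = steadier but laggier pointer (the latency
    trade-off mentioned in the comment)."""
    sx, sy = samples[0]
    out = [(sx, sy)]
    for x, y in samples[1:]:
        sx = alpha * x + (1 - alpha) * sx
        sy = alpha * y + (1 - alpha) * sy
        out.append((sx, sy))
    return out

def pick_target(gaze, targets, radius, current=None, bubble=1.5):
    """Return the nearest target within `radius` of the gaze point,
    or None. The currently selected target keeps an enlarged hit
    radius (the "bubble"), so saccades inside it don't flicker the
    selection off."""
    best, best_d = None, None
    for t in targets:
        d = ((gaze[0] - t[0]) ** 2 + (gaze[1] - t[1]) ** 2) ** 0.5
        limit = radius * bubble if t == current else radius
        if d <= limit and (best_d is None or d < best_d):
            best, best_d = t, d
    return best
```

With the bubble, a gaze point that has drifted slightly outside a target's base radius still keeps that target selected if it was already selected, which is exactly the hysteresis-for-accuracy trade the comment describes.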