Meta today announced plans to update Quest 3 with AI vision features similar to those on the company’s latest Ray-Ban Meta glasses. Meanwhile, Apple still hasn’t confirmed whether Apple Intelligence features will reach Vision Pro at the same time as other Apple devices this fall.

Meta has pointed to AI investments as its other major area of R&D alongside XR, and now the company is starting to bring the two together with consumer-facing AI features.

Sometime later this summer, Meta says it will roll out an update to Quest 3 in the US and Canada enabling the ‘Meta AI with Vision’ feature. The feature gives the headset AI voice chat capabilities and also allows it to ‘see’ what’s in the user’s real-world field of view, so users can ask general questions as well as inquire about things in front of them. Meta gives some examples in its announcement:

Let’s say you’re watching YouTube videos of some breathtaking hikes in mixed reality while packing for your upcoming trip to Joshua Tree. You can ask Meta AI for advice on how to best dress for the summer weather. Or you could hold up a pair of shorts and say, “Look and tell me what kind of top would complete this outfit.” You can get the forecast so you can prep for the weather ahead and even ask for local restaurant recommendations to indulge your inner foodie.

Or say you’re in-headset and listening to music while working on a paper for school on a massive virtual monitor. You could ask Meta AI to identify some of the most memorable quotes from Shakespeare’s Hamlet and explain the significance of the “to be or not to be” soliloquy or the play within a play.

You might be playing Assassin’s Creed® Nexus VR, parkouring across rooftops when your curiosity is piqued. Why not ask Meta AI whether or not there were actual assassins in colonial Boston? The answer may surprise you…

For the time being, the feature can only see what’s in the real world; it has no awareness of virtual content shown by the headset. Meta hints that Meta AI with Vision may eventually gain both real-world and virtual-world awareness.

Unfortunately, Meta confirms the feature won’t be coming to Quest 2 or older devices (and it’s likely Quest Pro won’t get it either).

Meta didn’t say whether requests for this feature are processed on-device or in the cloud, nor did it address things like encryption, though it did confirm the feature is based on Bing AI. The company hasn’t yet responded to our request for more information about its privacy architecture.

While Meta is rapidly deploying AI capabilities to its devices, Apple has yet to confirm whether its so-called ‘Apple Intelligence’ features are coming to Vision Pro.

Earlier this year Apple announced a range of Apple Intelligence features coming to iPhones, iPads, and Macs in beta this fall. Despite announcing visionOS 2 at the same time, Apple has confirmed no Apple Intelligence features for Vision Pro. That leaves it up in the air whether the headset will get any Apple Intelligence features at the same time as other Apple devices, or whether users will need to wait for later versions of visionOS.

Apple says many Apple Intelligence features are processed on-device, but some requests—including those that lean on ChatGPT—will require off-device processing. Apple claims off-device requests “never store your data,” “are used only for your requests,” and that Apple will make the behind-the-scenes code available for privacy auditing.

  • xyzs

    Yep, but AVP has OLED.

    • ViRGiN

      And Valve Index uses lighthouse tracking.

  • While I'm a big fan of MR+AI, I think this AI is mostly useless at the current stage. On Ray-Ban Meta it makes sense, because they're glasses you potentially wear all the time, out on the street, but no one wears Quest 3 walking around the house or down the street, so the usefulness of the AI is IMHO very limited.

    • ViRGiN

      How can you be so full of hate after your positively tinted, sponsored Somnium article?

  • flynnstigator

    I'll sell my Q3 if they implement this and I can't turn it off or block it somehow. Make no mistake, these AI vision features are not intended to benefit us; they're intended to better turn our lives into data for sale to the highest bidder without providing anything in return. Sale of personal data has been Meta's entire business model since the early days of Facebook, and is the main reason they've spent so much money to dominate the XR space.

    Right now they can't make use of data from the passive video cameras because of the bandwidth required to upload it to servers for processing, but if they can offload that work to the device, they'll be able to better monetize what the device sees.

    I think I'm going to turn off automatic updates just in case they roll it out with the next one.

    • ViRGiN

      lol

    • VRDeveloper

      I think the same, especially when it comes to Facebook: if they are caught in any data scandals, it will be brutal for sales. It's hard enough to convince people to buy a VR device, and now it could get even harder. I'm still very enthusiastic, but I'm worried about this idea.

    • Christian Schildwaechter

      There is a dystopian perspective and a real danger from data collection and sale that urgently needs to be addressed by laws limiting what can be collected and sold. But we absolutely benefit. LLMs now provide (partly reliable) answers instead of only referring to a source they found. And even those somewhat-matching search engine references have evolved way beyond early engines that only looked for keywords. Google gives excellent answers to questions that incorrectly describe what we are looking for, because it has learned what we probably meant instead.

      I regularly benefit from sites using my purchase history to recommend things I actually have a use for and didn't even know existed. I'm not comfortable with the power this gives them and try to limit how much personal information can be cross-referenced, but I don't want data collection to fully stop. Instead I want transparency about what is collected and shared, and a say in that.

      I actually need them to collect more data to get an instant 10-minute solution from my future Quest/AVP when looking at a broken faucet and asking "how do I repair that with the tools at hand?", instead of driving to a library, hoping to find a matching book, and then hunting for unnecessary tools at local stores. No longer having to do that thanks to things getting "smart" has already improved my life a lot, and XR+AI could allow for much more.

      • flynnstigator

        That’s an interesting perspective to hear, sincerely. I’ve actually never talked with someone who felt that recent AI-enhanced changes to search and software had made their results better, rather than worse. For me, I find that I always get the same incorrect results no matter how I phrase my search, and I simply cannot find a lot of the information I used to be able to find because the search engine has made incorrect assumptions about what it thinks I must have meant instead of just taking my search terms at face value.

        With software in general, I find that I’m forever fighting against its constant attempts to “correct” what I’m trying to do, like word processors expanding my cursor select to encompass the whole paragraph, assuming that I must have meant to click differently than I did. It makes computing a lot more time-consuming, cumbersome, and user-hostile than the old “dumb” approach of Word 2003 that simply (and correctly) took my clicks as gospel and allowed me to own any mistakes.

        Even my wife’s car treats my pedal inputs like an optional, low-priority entry in a suggestion box rather than a direct command to be translated 1:1 to engine response.

        The common theme here is a group of designers and engineers who think that their tools and training are so good that they can override user input by design and that doing so will improve the product. Maybe in your case they're right, but they're very wrong in mine and in the cases of most of the people I talk to. So when I have to face increased privacy invasion along with a worse product, it's like a second slap in the face.

    • Blaexe

      There's definitely benefit for the user – ask any Ray-Ban smart glasses user who has access to the multimodal AI.

  • Christian Schildwaechter

    Meta didn’t say whether requests for this feature are processed on-device or in the cloud, nor did it address things like encryption, though it did confirm the feature is based on Bing AI.

    Microsoft stated that the average Bing question answered by ChatGPT consumes ~50,000 times the energy of a regular Bing search. That's partly because search companies have made search in their data centers extremely energy efficient and cheap, so they can offer services for free and still make lots of money from people occasionally clicking on ads.

    BigTable/MapReduce boosted search engines, and LLMs now allow for much smarter "smart assistants". But current energy and memory needs constrain on-device use on mobile HMDs to simpler tasks, with everything complex redirected to a data center. Early Amazon Echo devices could (mostly) discern someone saying "Alexa" from cat noise, but everything beyond that required sending what came next to a data center. Unfortunately for Amazon, people didn't buy enough via smart assistants to make up for the expensive compute, so by 2022 its Alexa business was losing USD 10bn a year.

    Microsoft, Facebook, and Apple have all released smaller, less resource-hungry, but still capable LLMs targeting mobile. The more impressive AI answers, like those requiring visual processing, will for at least some years still require sending data off-device for interpretation at rather high cost, paid by these companies as an investment in AI. "Meta AI with Vision" on Quest 3 will largely be the same (web) service they run for smartphones, tuned for HMDs.

  • Stephen Bard

    Instead of Meta AI being only a disembodied voice, it would be fun to have the option of the AI speaking as a 3D VR character, like the popular "Liteforms" holographic chatbots that the "Looking Glass" glasses-free 3D display people offer. Maybe the AI could let you create these virtual assistant characters from a text prompt description…

  • NicoleJsd

    AI is so useless… all the examples can either be googled or are just, you know, common sense?

    All this effort, power, and infrastructure to tell me the weather? To tell me I should wear shorts in summer? While wearing a VR HMD? Why??

    • NicoleJsd

      Problem is, earth's materials are not infinite, so wasting precious limited resources on such stupid shit is criminal and should easily warrant jail time.

      • Ardra Diva

        Why aren't you living off the grid in a shack? You're just part of the problem if you're here on the internet using up precious, finite earth materials.

  • Ardra Diva

    Reminds me of the Fast and Furious movie where they smoke a Ferrari in some little hooptie with a NOS pack. Q3 smokes the AVP when you look at bang for buck.