The Facebook Reality Labs Research team shared some of its latest audio initiatives today. The group aims to build technologies into an AR headset that will supercharge your hearing, making it easy to isolate a conversation in a noisy environment and reproducing virtual sounds that seem to come from the real world around you. A custom head-related transfer function (HRTF), a digital model of the unique way each person hears sound based on the shape of their head and ears, is key to delivering such experiences, but measuring one is time-consuming and expensive. The team is investigating a scalable solution that would generate an accurate HRTF from a simple photograph of your ear.
Facebook Reality Labs (FRL) is the newly adopted name of the team at Facebook that builds immersive technologies (including Oculus headsets). Facebook Reality Labs Research (FRLR) is the research & development arm of that team.
Today Facebook Reality Labs Research shared an update on a number of ongoing immersive audio research initiatives, saying that the work is “directly connected to Facebook’s work to deliver AR glasses,” though some of it is broadly applicable to VR as well.
Spatial Audio
One of the team’s goals is to recreate virtual sounds that are “perceptually indistinguishable” from the sound of a real object or person in the same room with you.
“Imagine if you were on a phone call and you forgot that you were separated by distance,” says Research Lead Philip Robinson. “That’s the promise of the technology we’re developing.”
In order to achieve that goal, the researchers say there are two key challenges: 1) understanding the unique auditory characteristics of the listener’s environment, and 2) understanding the unique way that the listener hears sounds based on their physiology.
The acoustic properties of the room (how sounds reverberate through it) can be estimated from the geometry already mapped by the headset’s tracking sensors. Combined with AI capable of estimating the acoustic properties of specific surfaces in the room, this yields a rough model of how a real sound would propagate through the space, which can be used to make virtual sounds seem as if they’re really coming from inside the same room.
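To make the idea concrete, here’s a minimal sketch of one simple way such estimates could feed into rendering. It assumes a basic room model and Sabine’s reverberation formula; the surface areas and absorption coefficients are hypothetical stand-ins for what the headset’s mapping and material-estimation AI would provide, and Facebook hasn’t published its actual method:

```python
import numpy as np

def rt60_sabine(volume_m3, surfaces):
    """Estimate reverberation time RT60 (seconds) from room volume and
    a list of (area_m2, absorption_coefficient) pairs, via Sabine's
    equation: RT60 = 0.161 * V / sum(S_i * alpha_i)."""
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / total_absorption

def synth_reverb_tail(rt60, fs=48000):
    """Synthesize a crude room impulse response as exponentially
    decaying white noise: amplitude falls 60 dB over RT60 seconds."""
    t = np.arange(int(rt60 * fs)) / fs
    return np.random.randn(t.size) * np.exp(-6.91 * t / rt60)

# Hypothetical 5 x 4 x 3 m room: drywall walls, carpeted floor, ceiling.
surfaces = [(54.0, 0.05), (20.0, 0.30), (20.0, 0.10)]  # (m^2, alpha)
ir = synth_reverb_tail(rt60_sabine(volume_m3=60.0, surfaces=surfaces))

dry = np.random.randn(48000)   # stand-in for a virtual sound source
wet = np.convolve(dry, ir)     # the sound "placed" in the room
```

A real system would model direction-dependent reflections rather than a single statistical decay, but the principle is the same: estimated room properties become a filter applied to the virtual sound.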
Facebook researchers also say that this information could be added to LiveMaps (an augmented reality copy of the real world that Facebook is building) and recalled by other devices in the same space, so that the acoustic estimate could be improved over time through crowd-sourced data.
The second major challenge is understanding the unique way everyone hears the world based on the shape of their head and ears. The shape of your head and ears doesn’t just ‘color’ the way you hear; it’s also critical to your ability to pinpoint where sounds are coming from around you. If you borrowed someone else’s ears for a day, you’d have a much harder time telling exactly where sounds originated.
The science of how sound interacts with differently shaped ears is well enough understood that it can be represented with a compact numeric function called a head-related transfer function (HRTF). But accurately measuring an individual’s HRTF requires specialized tools and a lengthy calibration procedure (akin to having a doctor test your eyes for a vision prescription), which makes it impractical to scale to many users.
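For a sense of how an HRTF is used once you have one: rendering a spatialized sound amounts to filtering a mono source with per-ear impulse responses measured for the source’s direction. The sketch below is a bare-bones illustration that assumes you already have those measured responses (the placeholder HRIRs here are random noise, not real measurements); real systems interpolate between many measured directions as the listener’s head moves:

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Filter a mono signal with per-ear impulse responses (HRIRs),
    the time-domain form of the HRTF, to place it at the direction
    the responses were measured for."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])  # stereo: (2, n_samples)

# Placeholder HRIRs; in practice these come from a measured HRTF set.
hrir_left = np.random.randn(256) * 0.01
hrir_right = np.random.randn(256) * 0.01
stereo = render_binaural(np.random.randn(48000), hrir_left, hrir_right)
```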
To address that scaling problem, Facebook Reality Labs Research says it hopes to “develop an algorithm that can approximate a workable personalized HRTF from something as simple as a photograph of [your] ears.”
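Facebook hasn’t described how such an algorithm would work, but to illustrate the general shape of the idea, here is a purely hypothetical sketch: a small convolutional network that regresses HRTF magnitude responses from an ear photo. The architecture, input size, and output dimensions are all assumptions made for illustration, not Facebook’s method:

```python
import torch
import torch.nn as nn

class EarToHRTF(nn.Module):
    """Hypothetical regressor: ear image -> per-direction HRTF
    magnitude responses for one ear."""
    def __init__(self, n_directions=64, n_freq_bins=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, n_directions * n_freq_bins)

    def forward(self, ear_photo):  # (batch, 3, H, W)
        return self.head(self.features(ear_photo))

model = EarToHRTF()
fake_photo = torch.randn(1, 3, 128, 128)  # stand-in ear image
hrtf_estimate = model(fake_photo)         # (1, 64 * 128)
```

Such a model would need to be trained against a large set of ground-truth acoustic measurements, which is presumably where the expensive lab procedure still comes in, just once per training dataset rather than once per user.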
To demonstrate the work the team has done on the spatial audio front, it created a sort of mini-game in which participants wearing a tracked pair of headphones stand in a room with several real speakers scattered throughout. The team then plays a sound and asks the participant whether it was produced virtually and played through the headphones, or played through one of the real speakers in the room. The team says results from many participants show that the virtual sounds are nearly indistinguishable from the real ones.
Context-aware Noise Cancellation
While “perceptually indistinguishable” virtual sounds could make it sound like your friend is right next to you—even when they’re communicating through a headset on the other side of the country—Facebook Reality Labs Research also wants to use audio to enhance real, face-to-face conversations.
One way the team is doing that is with contextually aware noise cancellation. While noise cancellation technology today aims to reduce all outside sound, contextually aware noise cancellation tries to isolate the outside sounds that you want to hear while reducing the rest.
To do this, Facebook researchers built prototype earbuds and prototype glasses with several microphones, head tracking, and eye tracking. The glasses monitor the sounds around the user as well as where they’re looking. An algorithm uses that information to figure out what the user wants to listen to, be it the person across the table from them or a TV in the corner of the room. That determination feeds into the audio processing portion of the algorithm, which sifts through the incoming sounds to highlight those coming from the chosen subject while reducing everything else.
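Facebook hasn’t detailed the audio processing involved, but the classic building block for ‘listen in this direction’ is a beamformer. Below is a minimal delay-and-sum sketch assuming a small microphone array on the glasses and a gaze-derived look direction; the array geometry, sample rate, and function names are illustrative assumptions, and real systems use far more sophisticated adaptive filters:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def delay_and_sum(mic_signals, mic_positions, look_direction, fs=16000):
    """Align and average the microphone signals for a far-field source
    in `look_direction` (a unit vector), reinforcing sound arriving
    from that direction and attenuating sound from others."""
    out = np.zeros_like(mic_signals[0], dtype=float)
    for signal, pos in zip(mic_signals, mic_positions):
        # A mic closer to the source hears the wavefront earlier; delay
        # its signal so all channels line up (np.roll's circular shift
        # is a simplification that's fine for this illustration).
        delay_s = np.dot(pos, look_direction) / SPEED_OF_SOUND
        out += np.roll(signal, int(round(delay_s * fs)))
    return out / len(mic_signals)

# e.g. two mics 10 cm apart on the temples, wearer gazing along +x:
mics = [np.random.randn(16000), np.random.randn(16000)]
positions = [np.array([0.05, 0.0, 0.0]), np.array([-0.05, 0.0, 0.0])]
focused = delay_and_sum(mics, positions, np.array([1.0, 0.0, 0.0]))
```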
– – — – –
Facebook is clear that it’s working on this technology with the goal of eventually bringing it to AR and VR headsets. And while researchers say they’ve proven out many of these concepts, it isn’t yet clear how long it will take before the technology makes its way out of the lab and into everyday headsets.