OSSIC debuted their latest OSSIC X headphone prototype at CES this year with one of the best immersive audio demos I’ve heard yet. OSSIC CEO Jason Riggs told me that their headphones perform a dynamic calibration of your ears in order to render near-field audio that is customized to your anatomy, and they had a new interactive audio sandbox environment where you could do a live mix of audio objects in a 360-degree environment at different heights and depths. OSSIC was also a participant in Abbey Road Studios’ Red incubator looking at the future of music production, and Riggs makes the bold prediction that the future of music is going to be both immersive and interactive.
LISTEN TO THE VOICES OF VR PODCAST
We do a deep dive into immersive audio on today’s podcast, where Riggs explains their audio rendering pipeline in detail and how their dynamic calibration of ear anatomy enables their integrated hardware to replicate near-field audio objects better than any software-only solution. When audio objects are within 1 meter, they use a dynamic head-related transfer function (HRTF) to calculate the interaural time differences (ITD) and interaural level differences (ILD) that are unique to your ear anatomy. Their dynamic calibration also helps to localize high-frequency sounds from 1–2 kHz upward when they are in front of, above, or behind you.
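OSSIC hasn’t published its calibration math, but the ITD cue itself is easy to illustrate. Here’s a minimal Python sketch using the classic Woodworth spherical-head approximation (a generic textbook model, not OSSIC’s per-listener calibration):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly room temperature

def woodworth_itd(azimuth_deg: float, head_radius_m: float = 0.0875) -> float:
    """Interaural time difference (seconds) for a far-field source,
    per the classic Woodworth spherical-head approximation.

    azimuth_deg: source angle off straight ahead (0-90 to one side).
    head_radius_m: a generic average head radius; OSSIC's point is
    that this value (and the full HRTF) should be measured per
    listener rather than assumed.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / SPEED_OF_SOUND) * (theta + math.sin(theta))

# A source 45 degrees to the right reaches the near ear ~0.38 ms
# before the far ear; a larger head produces a larger ITD, which is
# why a one-size-fits-all HRTF smears localization.
print(f"{woodworth_itd(45.0) * 1e3:.2f} ms")
```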
Riggs says that they’ve been collaborating with Abbey Road Studios to figure out the future of music, which he believes is going to be both immersive and interactive. Audio production spans a spectrum from pure live capture to pure synthetic production, which happens to mirror the difference between passive 360-video capture and interactive, real-time CGI games. Right now the music industry sits solidly in static, channel-based audio, but Riggs says the future tools of audio production are going to look more like a real-time game engine than the existing fixed-perspective, flat-world audio mixing boards.
OSSIC has started by working out the production pipeline for the passive, pure-live-capture end of the spectrum. They’ve been using higher-order ambisonic microphones like the 32-element em32 Eigenmike microphone array from mh acoustics, which captures much more spatial resolution than a standard 4-channel, first-order ambisonic microphone. Both approaches capture a spherical shell of sound at a location, with all of its direct and reflected sound, that can transport you to another place.
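For a sense of what “higher-order” buys you: a full 3D ambisonic mix of order N carries (N + 1)² channels, so first order means 4 channels, while the roughly fourth-order output that a 32-capsule array can resolve comes to 25. A small sketch of the channel math and the traditional first-order (FuMa-style) B-format encoding equations:

```python
import math

def ambisonic_channels(order: int) -> int:
    # A full 3D ambisonic mix of order N carries (N + 1)^2 channels:
    # order 1 -> 4 (the classic W/X/Y/Z B-format), order 4 -> 25.
    return (order + 1) ** 2

def encode_first_order(sample: float, azimuth: float, elevation: float):
    """Encode one mono sample into first-order B-format (W, X, Y, Z)
    using the traditional FuMa weighting for the W channel.
    Angles are in radians, listener-relative."""
    w = sample * (1.0 / math.sqrt(2.0))                   # omnidirectional
    x = sample * math.cos(azimuth) * math.cos(elevation)  # front/back
    y = sample * math.sin(azimuth) * math.cos(elevation)  # left/right
    z = sample * math.sin(elevation)                      # up/down
    return w, x, y, z
```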
But Riggs says that there’s a limited amount of depth information that can be captured and transmitted with this type of passive, non-volumetric ambisonic recording. The other end of the spectrum is pure audio production, which can be volumetric, real-time, and interactive by using audio objects in a simulated 3D space. OSSIC produced an interactive audio demo in Unity that can render audio objects in the near field, at less than 1 meter’s distance.
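The near field is hard precisely because the two ears stop behaving like a single listening point. A toy sketch of the geometry (simple 1/r spreading loss to point ears on a sphere, ignoring HRTF filtering entirely) shows why level differences balloon inside a meter:

```python
import math

def per_ear_gains(x: float, y: float, z: float,
                  head_radius_m: float = 0.0875):
    """Toy illustration of why sources inside ~1 m need per-ear handling.

    Coordinates in meters, head at the origin, +x forward, +y left.
    Ears are approximated as points on the sides of a sphere.
    Returns (left_gain, right_gain) from bare 1/r spreading loss.
    """
    left_ear = (0.0, head_radius_m, 0.0)
    right_ear = (0.0, -head_radius_m, 0.0)

    def dist(ear):
        return math.sqrt((x - ear[0]) ** 2 +
                         (y - ear[1]) ** 2 +
                         (z - ear[2]) ** 2)

    return 1.0 / max(dist(left_ear), 1e-6), 1.0 / max(dist(right_ear), 1e-6)

# 25 cm off the left ear the interaural level gap is ~6 dB; at 2 m in
# the same direction it nearly vanishes -- the near-field ILD cue.
print(per_ear_gains(0.0, 0.25, 0.0))
print(per_ear_gains(0.0, 2.0, 0.0))
```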
The future of interactive music faces the same tension that exists between 360 videos and interactive game environments: it’s difficult to balance the user’s agency with the process of creating authored compositions. Some ways to incorporate interactivity into a music experience are to let the user live mix an existing authored composition with audio objects in a 3D space, or to play an audio-reactive game like AudioShield that creates dynamic gameplay based upon the unique sound profile of each piece of music. These engage the agency of the user, but neither actually provides any meaningful way for the user to impact how the music composition unfolds.
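AudioShield’s actual analysis isn’t public, but audio-reactive gameplay of this kind is typically driven by an onset-strength curve extracted from the track. A minimal spectral-flux sketch in NumPy (a textbook stand-in, not the game’s algorithm):

```python
import numpy as np

def onset_strength(samples: np.ndarray, sr: int,
                   frame: int = 1024, hop: int = 512):
    """Rough spectral-flux onset curve -- the kind of feature an
    audio-reactive game could map to spawned targets.

    Returns (times_seconds, normalized_flux); peaks suggest beats.
    """
    window = np.hanning(frame)
    frames = np.array([samples[i:i + frame] * window
                       for i in range(0, len(samples) - frame, hop)])
    mags = np.abs(np.fft.rfft(frames, axis=1))
    # Spectral flux: keep only rising energy between adjacent frames.
    flux = np.maximum(mags[1:] - mags[:-1], 0.0).sum(axis=1)
    times = np.arange(len(flux)) * hop / sr
    return times, flux / max(flux.max(), 1e-9)
```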
Finding that balance between authorship and interactivity is one of the biggest open questions about the future of music, and no one really knows what it will look like. The only thing Riggs knows for sure is that real-time game engines like Unity or Unreal are going to be much better suited to facilitate this type of interaction than the existing production tools of channel-based music.
Multi-channel ambisonic formats are becoming more standardized on the 360-video platforms of Facebook and Google’s YouTube, but the final output is still only binaural stereo. Riggs says that he’s been working behind the scenes to get higher-fidelity outputs for integrated immersive hardware solutions like the OSSIC X, since these platforms currently don’t use the best spatialization process to get top performance out of the OSSIC headphones.
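The standard way to collapse an ambisonic mix down to binaural stereo is to decode it to a set of virtual loudspeakers and convolve each speaker feed with a pair of head-related impulse responses (HRIRs). A horizontal-only, first-order sketch, assuming a generic HRIR set (individually calibrated HRTFs are exactly what integrated hardware like the OSSIC X would substitute here):

```python
import numpy as np

def bformat_to_binaural(w, x, y, hrirs_by_azimuth):
    """Render horizontal first-order B-format (FuMa W/X/Y) to binaural
    via virtual speakers.

    hrirs_by_azimuth: {azimuth_radians: (hrir_left, hrir_right)},
    with all impulse responses the same length. A generic measured
    set is assumed; a per-listener set is the higher-fidelity path.
    """
    n = len(hrirs_by_azimuth)
    left = right = 0.0
    for az, (hl, hr) in hrirs_by_azimuth.items():
        # Basic first-order decode: the omni term plus the directional
        # components projected onto this speaker's direction.
        feed = (np.sqrt(2.0) * w + np.cos(az) * x + np.sin(az) * y) / n
        left = left + np.convolve(feed, hl)
        right = right + np.convolve(feed, hr)
    return left, right
```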
As for formats on the pure-production end of the spectrum, there is no emerging open standard for object-based audio yet. Riggs hopes that one will eventually come, along with plugins for OSSIC headphones and software that could dynamically change the reflective properties of a virtualized room, or dynamically modulate properties of the audio objects themselves.
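To make that gap concrete, here is a purely hypothetical sketch of the kind of metadata such an open object-based format would need to carry; every field name below is invented for illustration and comes from no actual standard or OSSIC product:

```python
from dataclasses import dataclass

@dataclass
class AudioObject:
    """Hypothetical object-based audio metadata (illustrative only;
    no open standard defines these fields yet)."""
    source_id: str
    position: tuple              # (x, y, z) in meters, listener-relative
    gain_db: float = 0.0
    directivity: str = "omni"    # e.g. "omni", "cardioid"
    room_send_db: float = -12.0  # level fed into the virtual room model

@dataclass
class VirtualRoom:
    """Hypothetical room description whose reflective properties a
    plugin could modulate at playback time, as Riggs describes."""
    dimensions: tuple = (8.0, 5.0, 3.0)  # meters
    absorption: float = 0.3              # 0 = fully reflective, 1 = dead
    reverb_time_s: float = 0.6
```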
As game engines eventually move to real-time, physics-based audio propagation models where sound is constructed on the fly, Riggs says this will still require good spatialization from integrated hardware and software solutions; otherwise it’ll just sound like good reverb without any localization cues.
At this point, audio still takes a backseat to visuals, with a budget of only 2-3% of CPU capacity, and Riggs hopes that a series of audio demos in 2017 will show the power of properly spatialized audio. OSSIC’s interactive sound demo at CES was the most impressive example of audio spatialization I’ve heard so far, and they’re shaping up to be a real leader in immersive audio. Riggs said they’ve gotten a lot of feedback from game studios that don’t want to use a customized OSSIC audio production solution; they want to keep their existing production pipelines and have OSSIC be compatible with those. So VR developers should be getting more information on how to best integrate with the OSSIC hardware in 2017, as the OSSIC X headphones start shipping this spring.