Before we talk about how to spatialize sound in software, let’s start with how we perceive sound in space.
The way we perceive the position of a sound in space is mostly related to our physiology: the shape of your outer ear (the pinna), the size of your head, the density of your body, your height, etc. Over your lifetime you have learned exactly how sound passes through and around your body, and that is what lets you accurately locate sound in space.
The basics of spatialized sound representation come down to an encoding/decoding problem. If I wanted to record and then recreate two sound sources, one on the left and one on the right, I could record two channels and, on playback, place two speakers at the same positions where I recorded them. I could do this for as many sources and speakers as I wanted.
But at some point this becomes impractical, so we need another representation.
An object-based representation is a sound source combined with its position and rotation data. This is how game engines represent audio sources. Other attributes include cone size, rolloff factor, etc. All of these can be tweaked at playback time in a game engine, whereas in a recording they are fixed at capture time.
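As a rough illustration, an object-based source can be thought of as a plain record: a reference to the audio plus spatial parameters that are evaluated at playback time. The sketch below is not taken from any particular engine. The `refDistance` and `rolloffFactor` names and the attenuation formula follow the Web Audio API's "inverse" distance model; the `clip` file name and the `source`/`listener` objects are hypothetical.

```javascript
// A hypothetical object-based source: an audio reference plus spatial metadata.
const source = {
  clip: 'footsteps.wav',            // hypothetical asset name
  position: { x: 3, y: 0, z: -4 },  // relative to the world origin
  refDistance: 1,                   // distance at which the gain is 1
  rolloffFactor: 1,                 // how quickly the source gets quieter
};

// Straight-line distance between two points.
function distance(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Inverse distance attenuation (the Web Audio "inverse" distance model).
function inverseGain(d, refDistance, rolloffFactor) {
  const clamped = Math.max(d, refDistance); // never louder than at refDistance
  return refDistance / (refDistance + rolloffFactor * (clamped - refDistance));
}

const listener = { x: 0, y: 0, z: 0 };
const d = distance(source.position, listener);
const gain = inverseGain(d, source.refDistance, source.rolloffFactor);
console.log(d, gain);
```

Because the position and rolloff live in data rather than in the recording, moving the source or changing `rolloffFactor` only changes the numbers fed into `inverseGain`; the audio itself is untouched.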
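Ambisonics captures a recording of the full sphere of sound around a single point. It is an encoding: the channels describe the sound field itself rather than individual speaker feeds, so they must be decoded for playback.
Ambisonic recording can be done with a soundfield microphone, typically four capsules in a tetrahedral arrangement, whose signals are converted to B-format: an omnidirectional channel (W) plus three directional channels along the X, Y and Z axes.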
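To make the idea of "encoding" concrete, here is a minimal sketch of how a single mono sample could be panned into first-order B-format. It assumes the traditional convention in which W is scaled by 1/√2; other conventions (for example AmbiX) use different channel ordering and normalization. The function name `encodeBFormat` is illustrative only.

```javascript
// Encode one mono sample into first-order B-format (W, X, Y, Z).
// Angles are in radians: azimuth is counter-clockwise from straight ahead,
// elevation is upward from the horizontal plane.
function encodeBFormat(sample, azimuth, elevation) {
  const cosEl = Math.cos(elevation);
  return {
    w: sample * Math.SQRT1_2,               // omnidirectional component
    x: sample * Math.cos(azimuth) * cosEl,  // front-back axis
    y: sample * Math.sin(azimuth) * cosEl,  // left-right axis
    z: sample * Math.sin(elevation),        // up-down axis
  };
}

// A source straight ahead ends up entirely in W and X.
console.log(encodeBFormat(1, 0, 0));
// A source directly to the left ends up in W and Y.
console.log(encodeBFormat(1, Math.PI / 2, 0));
```

A decoder later combines these four channels into speaker feeds or a binaural signal, which is why the same recording can be rotated or played back on different layouts.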
Binaural recording captures the sound as it would be heard through a pair of ears. It does not allow you to move through the space or rotate your head, because all of the spatial information is baked into the recording.
All audio you pass through this node will be spatialized. It does not need to be connected directly to the master output; its output can be run through additional effects first.
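Assuming the node described here is the Web Audio API's PannerNode, a minimal sketch of routing a source through it might look like the following. The helper name `spatialize` and the example position are made up for illustration; `createPanner`, `panningModel`, `distanceModel` and the `positionX`/`positionY`/`positionZ` parameters are part of the Web Audio API.

```javascript
// Route `sourceNode` through a PannerNode placed at `position`.
// `ctx` is an AudioContext; `sourceNode` is any AudioNode (oscillator, buffer source, ...).
function spatialize(ctx, sourceNode, position) {
  const panner = ctx.createPanner();
  panner.panningModel = 'HRTF';     // head-related filtering instead of equal-power panning
  panner.distanceModel = 'inverse'; // gain falls off with distance
  panner.positionX.value = position.x;
  panner.positionY.value = position.y;
  panner.positionZ.value = position.z;
  sourceNode.connect(panner);
  // The panner could feed other effects here instead of going straight to the output.
  panner.connect(ctx.destination);
  return panner;
}

// Browser usage (commented out so the sketch also loads outside a browser):
// const ctx = new AudioContext();
// const osc = ctx.createOscillator();
// spatialize(ctx, osc, { x: 2, y: 0, z: -1 });
// osc.start();
```

Returning the panner lets the caller keep updating its position over time, for example from an animation loop.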
There is only one listener. It models the listener’s head and allows you to interact with the positioned audio.
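As a sketch of what interacting with the listener can look like, the helper below moves the Web Audio API's single `AudioListener` and turns it by a yaw angle. The name `setListenerPose` and the yaw convention are assumptions made for this example. It uses the AudioParam properties (`positionX`, `forwardX`, `upX`, ...) when they exist and otherwise falls back to the older `setPosition`/`setOrientation` methods, which some browsers still rely on.

```javascript
// Place the single AudioListener at `position` and turn it by `yaw` radians.
// yaw = 0 faces down the negative Z axis (the default); positive yaw turns right.
function setListenerPose(ctx, position, yaw) {
  const l = ctx.listener;
  const forward = { x: Math.sin(yaw), y: 0, z: -Math.cos(yaw) };
  if (l.positionX) {
    // AudioParam-based interface
    l.positionX.value = position.x;
    l.positionY.value = position.y;
    l.positionZ.value = position.z;
    l.forwardX.value = forward.x;
    l.forwardY.value = forward.y;
    l.forwardZ.value = forward.z;
    l.upX.value = 0;
    l.upY.value = 1; // head stays upright
    l.upZ.value = 0;
  } else {
    // Older method-based interface
    l.setPosition(position.x, position.y, position.z);
    l.setOrientation(forward.x, forward.y, forward.z, 0, 1, 0);
  }
  return forward;
}

// Browser usage (commented out so the sketch also loads outside a browser):
// const ctx = new AudioContext();
// setListenerPose(ctx, { x: 0, y: 0, z: 0 }, Math.PI / 2); // face right
```

Since every PannerNode is rendered relative to this one listener, turning the listener has the same audible effect as rotating all of the sources around it.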
https://jsfiddle.net/yotammann/bkgsjyau/2/
Convolution is a powerful tool for mimicking realistic spaces. Reverb plays a large role in how we perceive a space.
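To show what convolution actually does to a signal, here is a small direct-form sketch in plain JavaScript. It is for illustration only: in the browser, the Web Audio API's ConvolverNode performs the same operation far more efficiently once an impulse response is assigned to its `buffer` property. The `dry` and `room` arrays are toy data, not a measured space.

```javascript
// Direct-form convolution: every input sample triggers a scaled copy
// of the impulse response, and the copies are summed.
function convolve(signal, impulseResponse) {
  const out = new Float32Array(signal.length + impulseResponse.length - 1);
  for (let n = 0; n < signal.length; n++) {
    for (let k = 0; k < impulseResponse.length; k++) {
      out[n + k] += signal[n] * impulseResponse[k];
    }
  }
  return out;
}

const dry = [1, 0, 0, 0.5];   // two clicks, the second one quieter
const room = [1, 0.5, 0.25];  // toy impulse response: direct sound plus two echoes
console.log(Array.from(convolve(dry, room)));
// → [1, 0.5, 0.25, 0.5, 0.25, 0.125]
```

Each click in `dry` comes out followed by the same pattern of echoes, which is how an impulse response recorded in a real room imprints that room's reverb onto any sound.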
https://jsfiddle.net/yotammann/b0eg1ooz/2/
For generating and mixing ambisonic audio, I would recommend Reaper and the Facebook Spatial Workstation.
This will allow you to mix and master in Reaper and export to a variety of formats, including Facebook and YouTube. The Facebook Spatial Workstation also has boilerplate for Pro Tools and Nuendo.
Each of the audio sources has an FB360 FX plugin on it. That plugin will let you adjust the position, elevation, spread and attenuation of your source.
To export the audio from Reaper or another DAW, carefully follow the instructions in the PDF provided with Facebook Spatial Workstation. You can then combine the audio and video file in Facebook’s Video Encoder.
FB360 also provides a flexible video player which will synchronize to your DAW and allow you to navigate in 360 and hear the binaural decoding.