CUCAT Wiki | Essentials2016s2 / 1-4

An audio subsystem is the software layer that lets applications, through drivers, interface with audio hardware.

Windows Multimedia Extensions (MME)

Windows Multimedia Extensions (MME) was the first standardized Windows audio API. It was first made available in 1991 for sound cards and CD-ROM drives, but it did not become commercially available until Windows 3.1 was released. The thing to note with MME is that most older sound card drivers could only play and record one stream at a time; they could not play multiple streams simultaneously. It can support up to 2 channels of audio at 16-bit, 44.1 kHz.
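To put the 2-channel, 16-bit, 44.1 kHz figure in perspective, here is a minimal sketch that works out the raw data rate of such a stream. The function name is made up for illustration and is not part of any Windows API.

```python
def pcm_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Return the data rate of uncompressed PCM audio in bytes per second."""
    bytes_per_sample = bit_depth // 8  # 16-bit audio uses 2 bytes per sample
    return sample_rate_hz * bytes_per_sample * channels


# The format quoted above: 44.1 kHz, 16-bit, stereo.
print(pcm_bytes_per_second(44_100, 16, 2))  # 176400
```

So one second of audio in this format takes roughly 176 KB before any compression.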

WDM/DirectSound

The WDM/DirectSound model is the next model, introduced with Windows 98. It can record and play multiple streams at once and was popular all the way up to Windows XP. The one caveat with DirectSound, however, was that the first application to open the card set the sample rate at which all sounds would then run. For example, if your screen reader and your music were playing at the same time and the music opened the card first, you would get the higher-fidelity sample rate of your music. If the screen reader opened it first and the music was at a higher sample rate, the music would be downsampled to the sample rate of the screen reader's speech. The other thing to note is that KMixer was also used, which meant that many streams were resampled in a quick and dirty way so that everything played at the same sample rate.
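To give a feel for what "quick and dirty" resampling can look like, here is a minimal sketch of a naive linear-interpolation resampler. It only illustrates the general idea and makes no claim to be the algorithm KMixer actually used; the function name and the plain-list sample format are assumptions made for the example.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Naively resample a list of samples by linear interpolation.

    No anti-aliasing filter is applied, which is why this kind of
    conversion is fast but can audibly degrade the signal.
    """
    if not samples:
        return []
    out_len = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(out_len):
        pos = i * src_rate / dst_rate            # position in the source, in samples
        left = int(pos)                          # source sample just before pos
        right = min(left + 1, len(samples) - 1)  # source sample just after pos
        frac = pos - left                        # how far pos sits between the two
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


# Halving the sample rate keeps every second sample of this ramp.
print(resample_linear([0.0, 1.0, 2.0, 3.0], 4, 2))  # [0.0, 2.0]
```

Higher-quality resamplers filter the signal before changing the rate, which costs more CPU time but avoids the artifacts this shortcut introduces.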

WASAPI

With the introduction of Windows Vista, the Windows audio subsystem was completely revamped and became more user friendly. The broader term for these interfaces is the Core Audio APIs, or user-mode audio components. There are four Core Audio APIs, but the one we care about most for this module is WASAPI.

WASAPI, or the Windows Audio Session API, is a Windows audio subsystem that is not only more user friendly but also offers easier and more direct access to the hardware, depending on the mode chosen.

Shared mode

In shared mode, WASAPI clients (audio streams, in this case) can play or record simultaneously, with the help of yet another software mixer known as the Windows audio engine. Like the aforementioned KMixer, it locks everything to one sample rate. The difference is that speech from a screen reader won't force everything to be downsampled just because it happened to start before other audio streams. For example, if you try to play something at 48 kHz but your card is locked to 44.1 kHz, the Windows audio engine will downsample it to 44.1 kHz.
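The 48 kHz to 44.1 kHz example can be worked through with a little arithmetic. The sketch below is only a model of the decision a shared-mode mixer has to make; `conversion_plan` is a hypothetical helper, not a WASAPI call.

```python
from fractions import Fraction


def conversion_plan(stream_rate_hz, engine_rate_hz):
    """Describe what a shared-mode mixer would have to do with one stream.

    Returns (needs_resampling, ratio), where ratio is the number of
    output frames per input frame, as an exact fraction.
    """
    ratio = Fraction(engine_rate_hz, stream_rate_hz)
    return ratio != 1, ratio


needs_resampling, ratio = conversion_plan(48_000, 44_100)
print(needs_resampling, ratio)  # True 147/160
```

In other words, for every 160 frames of the 48 kHz stream the mixer produces 147 frames at 44.1 kHz, while a stream that already matches the locked rate passes through without resampling.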

Exclusive mode

In contrast, exclusive mode gives an audio stream direct access to the hardware or device. The problem is that, since the device is not shared, only one application can use it at a time. Most of the time you will want to run your audio workstation either in this mode or in ASIO, which we will talk about later. The advantages of exclusive mode are lower latency and the ability to choose any sample rate and audio format that your device or workstation supports. The disadvantage is that you will most likely need another output device for your screen reader if you are running a sound card or interface in exclusive mode.

ASIO, what is it?

ASIO (Audio Stream Input/Output) generally has the lowest latency of all of the audio subsystems. It also gives direct access to the hardware instead of going through an intermediate layer. Like exclusive mode in WASAPI, however, it generally takes control of an entire interface. Many professional audio interfaces or mixers have multiple inputs and outputs, though, so other cards or mixers can be hooked up to them, allowing your screen reader and your DAW to work simultaneously.

What is Latency?

Latency is the amount of delay between when an audio signal enters a system and when it emerges from it. Different audio subsystems have different approximate latency times.
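One common contributor to latency is the size of the audio buffer: a buffer of N frames cannot be played until all N frames have been filled. The text above does not mention buffer sizes, so treat the following as a rough sketch under that assumption. It ignores driver, converter, and hardware delays, and the function name is made up for illustration.

```python
def buffer_latency_ms(buffer_frames, sample_rate_hz):
    """Return the delay, in milliseconds, contributed by one buffer of audio."""
    return buffer_frames / sample_rate_hz * 1000.0


# Typical buffer sizes a DAW might offer, at a 44.1 kHz sample rate.
for frames in (64, 256, 1024):
    print(f"{frames} frames: {buffer_latency_ms(frames, 44_100):.2f} ms")
```

A 256-frame buffer at 44.1 kHz, for instance, adds about 5.8 ms, while a 1024-frame buffer adds about 23 ms. Smaller buffers lower the latency but give the computer less time to fill each one.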