Using Loopback and macOS Dictation to transcribe audio from files

macOS includes a powerful Dictation feature that transcribes spoken words into written text. Dictation is designed for live audio, brought into the Mac via a microphone, so on its own it can’t transcribe pre-recorded audio files.

Fortunately, Loopback makes it possible to transcribe other audio sources. It can route audio playback from an application into Dictation, which will then transcribe the audio. Read on for more details about the process.

Create a Loopback virtual audio device

To begin, you’ll need to create a new Loopback virtual audio device. This new device can be dedicated to the task of routing audio you want transcribed by the Mac’s Dictation system. The following steps will get that device configured:

  1. In Loopback, click the (+) New Virtual Device button to create a new device. Feel free to give it a descriptive name. In this example, we’ll use the name “Transcription Device”.

  2. This new device won’t need the Pass-Thru source, which gets added by default, so it’s best to remove it: click the title of the Pass-Thru source once to highlight it, and then press the Delete button at the bottom of the window to remove it.

  3. Add the application you’ll be using to play your source audio file. Click the (+) button at the top of the Sources column, and then select your media-playing app from the list.

    In our example, QuickTime Player will handle audio playback for the file we want to transcribe.

With the device now configured, you can quit Loopback. Loopback devices are always present and available on your system, even when the main app isn’t running.

Configure Dictation’s input device setting

Once you’ve created your virtual device, you’ll need to set it as the source for Dictation.

  1. Click the Apple menu in the menu bar, and then select System Settings….

  2. Scroll down the sidebar to find the Keyboard setting. Click on it, and then head to the Dictation section.

  3. Change the Microphone source setting to the virtual device you created in the previous section.

    The previously created “Transcription Device” is now being used for Dictation.

Prepare your text editor

With the Dictation settings configured, you’ll need a text editor to receive the text generated by Dictation. The easiest option is TextEdit, the text editor bundled with macOS, but any standard text editor is fine.

Set up your audio for playback

The last piece of setup is to open your source audio file in your media player, so it’s ready for playback.

When QuickTime Player plays your audio, the Loopback audio device (“Transcription Device”) captures it and makes it available to Dictation, which writes the transcribed text into TextEdit.

Start Dictation

You’re now ready to start your transcription. Start audio playback in your media player, then switch to your text editor to enable Dictation. You can select Start Dictation from the Edit menu, or use the Dictation (microphone) button on supported Apple keyboards.

When the audio has finished playing, click the Done button in the Dictation popover to stop transcribing.

That’s it! You’ll now have transcribed text in your text editor, ready for you to edit, save, and use.


Tips and troubleshooting for Dictation

Monitoring audio during transcription

While Dictation is running, macOS will mute the default output device of your Mac, so playback isn’t audible. Beyond that, your Loopback device mutes the app audio by default, too.

If you wish to hear the audio as it’s transcribed, you can work around this from within Loopback: add the desired output device to the Monitors column of your device.

An example of an audio monitor added to the Transcription Device.

Take care to avoid monitoring audio when transcribing from an actual microphone, as it can cause an audio feedback loop that could hurt your ears and your equipment.

Avoiding transcription problems and improving transcription results

Whether you are using a hardware microphone, or following the instructions in this article to use audio files, we’ve found that Dictation may stop automatically if it is not detecting speech properly in the audio it is receiving. This is most likely to occur when working with audio that was captured from sub-optimal microphone placement or problematic recording settings. Slow or infrequent speech can have a similar effect.

Notes on sound quality

If the original speech is captured with the microphone placed far from the speaker, the recording can pick up unintended background noise and reverberation from the environment. Similarly, placing the voice microphone off axis from the speaker, or with physical obstructions between the speaker and microphone, can result in muffled audio that is less intelligible. On the recorder side, low input gain, low record volume, or an unsuitable audio format setting can each render otherwise clear speech unintelligible.

Improving vocal delivery

Even with optimal microphone placement and record settings, transcription performance is still limited by the speed, diction, and enunciation of the speakers in your audio recordings. Subjects who speak quickly, mumble, or use rare or domain-specific vocabulary can expect lower speech recognition performance, both in detecting the presence of spoken words and in identifying them correctly.

It’s also worth noting that while carefully pacing speech can improve recognition, slowing down too much can also disrupt the process, as slow or infrequent speech can result in the dictation service timing out.
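Because long pauses are one of the things that can cause Dictation to stop, it can help to locate them in a recording before you play it. The following Python sketch is one possible way to do that; it is not part of Loopback or macOS. It scans a 16-bit mono WAV file and reports quiet stretches. The function names, the amplitude threshold of 500, and the 3-second minimum gap are all assumptions chosen for illustration. Dictation’s actual timeout isn’t stated in this article, so adjust the values to match what you observe.

```python
import array
import sys
import wave


def read_mono_16bit_wav(path):
    """Load a 16-bit mono WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected a 16-bit mono WAV file")
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
        rate = wav.getframerate()
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples, rate


def find_silent_gaps(samples, sample_rate, threshold=500, min_gap_seconds=3.0):
    """Return (start_seconds, end_seconds) for each long quiet stretch.

    threshold and min_gap_seconds are illustrative guesses, not values
    documented by Apple or Rogue Amoeba.
    """
    gaps = []
    gap_start = None
    for index, value in enumerate(samples):
        if abs(value) < threshold:
            if gap_start is None:
                gap_start = index  # a quiet run begins here
        elif gap_start is not None:
            if (index - gap_start) / sample_rate >= min_gap_seconds:
                gaps.append((gap_start / sample_rate, index / sample_rate))
            gap_start = None
    # Handle a quiet run that lasts until the end of the file.
    if gap_start is not None:
        if (len(samples) - gap_start) / sample_rate >= min_gap_seconds:
            gaps.append((gap_start / sample_rate, len(samples) / sample_rate))
    return gaps
```

To try it, call `read_mono_16bit_wav("interview.wav")` (a hypothetical file name) and pass the result to `find_silent_gaps`. Any gaps it reports are spots where you may want to restart Dictation or trim the audio first.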

Working around Dictation’s limitations

If your recordings are already captured and it is no longer possible to adjust your microphone placement and record settings, yet the dictation service is still stopping sooner than intended, check to make sure your source audio is already playing at the maximum volume.

If Dictation continues to stop after increasing the volume, try moving the playhead to start playback of your recording from a different spot. This may help to work around the point in the recording where the dropout happens. In addition, using an external audio editor to normalize the audio and/or remove long passages of silence may help keep the dictation service running more continuously, so it can transcribe longer passages of speech.
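If you don’t have an audio editor handy, normalizing and silence removal can also be approximated with a short script. The Python sketch below is a rough illustration under stated assumptions, not a recommended tool: it handles only 16-bit mono WAV files, and the helper names, the target peak of 29000, the silence threshold of 500, and the 1-second cap on pauses are all hypothetical values picked for the example.

```python
import array
import sys
import wave


def normalize_peak(samples, target_peak=29000):
    """Scale samples so the loudest one reaches target_peak (16-bit range)."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    scale = target_peak / peak
    return [max(-32768, min(32767, round(s * scale))) for s in samples]


def shorten_silences(samples, sample_rate, threshold=500, max_gap_seconds=1.0):
    """Cap every run of quiet samples at max_gap_seconds."""
    max_run = int(max_gap_seconds * sample_rate)
    output = []
    run = 0
    for s in samples:
        if abs(s) < threshold:
            run += 1
            if run > max_run:
                continue  # drop quiet samples beyond the cap
        else:
            run = 0
        output.append(s)
    return output


def process_wav(in_path, out_path):
    """Shorten long pauses, then normalize, and write a new WAV file."""
    with wave.open(in_path, "rb") as src:
        if src.getnchannels() != 1 or src.getsampwidth() != 2:
            raise ValueError("expected a 16-bit mono WAV file")
        rate = src.getframerate()
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    cleaned = array.array("h", normalize_peak(shorten_silences(samples, rate)))
    if sys.byteorder == "big":
        cleaned.byteswap()
    with wave.open(out_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(cleaned.tobytes())
```

Silences are shortened before normalizing so that the threshold is applied to the original levels; normalizing first would also raise the noise floor and could make quiet passages harder to detect. The processed file can then be played through your Loopback device in the same way as the original.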

Consider alternatives to Dictation in macOS

If Dictation continues to be unreliable or to stop on its own, even after taking some of these measures, you may be better served by a third-party transcription service. Modern alternatives, including AI-based options, can provide automatic or manual transcription of pre-recorded audio, microphone input, or both. The process outlined above can also be used with any such system that requires a microphone to transcribe audio.

