22.6 azspeech transcribe

20221005

By default, the transcribe command listens for up to 15 seconds of speech from the microphone, converts it to text, and writes the text to the console. The command can also transcribe speech from an audio file (wav) provided as the FILENAME argument. The source language may need to be specified with the --lang option, though several languages are identified automatically.

$ ml transcribe azspeech [FILENAME]
     -l <lang>       --lang=<lang>

A simple example, listening for the audio on the microphone:

ml transcribe azspeech

might result in:

The machine learning hub is useful for demonstrating capability of 
models as well as providing command line tools.

The command can take an audio wav file, as an optional argument, and transcribe it to the console. For large audio files this can take some time. Currently only wav files are supported through the command line (though the cloud service also supports mp3, ogg, and flac).

wget https://github.com/realpython/python-speech-recognition/raw/master/audio_files/harvard.wav
ml transcribe azspeech harvard.wav

This will transcribe the audio as the following text:

The stale smell of old beer lingers it takes heat to bring out the odor.
A cold dip restore's health and Zest, a salt pickle taste fine with
Ham tacos, Al Pastore are my favorite a zestful food is the hot cross bun.

To convert from other audio formats to a suitable wav file see the section on converting audio formats in the GNU/Linux Desktop Survival Guide.
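As a rough sketch of such a conversion, the shell functions below check whether a file already carries a RIFF/WAVE header and otherwise convert it with ffmpeg. The function names is_wav and ensure_wav are hypothetical, ffmpeg is an assumed external tool that is not part of mlhub, and the 16 kHz mono target is an assumption about what suits the speech service rather than a documented requirement of this command.

```shell
# is_wav FILE: succeed if FILE starts with a RIFF....WAVE header.
# (Hypothetical helper, not part of mlhub.)
is_wav() {
    [ "$(head -c 4 "$1" | tr -d '\0')" = "RIFF" ] &&
        [ "$(dd if="$1" bs=1 skip=8 count=4 2>/dev/null | tr -d '\0')" = "WAVE" ]
}

# ensure_wav FILE: print the name of a wav version of FILE.
# Files that are already wav are passed through unchanged; anything else
# is converted with ffmpeg (assumed to be installed) to 16 kHz mono,
# which is an assumed, not documented, target format.
ensure_wav() {
    if is_wav "$1"; then
        echo "$1"
    else
        out="${1%.*}.wav"
        ffmpeg -loglevel error -y -i "$1" -ar 16000 -ac 1 "$out" && echo "$out"
    fi
}
```

With these defined, something like ml transcribe azspeech "$(ensure_wav talk.mp3)" would transcribe the converted file, where talk.mp3 is a placeholder name.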

To save the output to a text file simply use the shell redirect operator >.

$ ml transcribe azspeech harvard.wav > harvard.txt

$ cat harvard.txt
The stale smell of old beer lingers it takes heat to bring out the odor.
A cold dip restore's health and Zest, a salt pickle taste fine with
Ham tacos, Al Pastore are my favorite a zestful food is the hot cross bun.
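The same redirect can be applied to a whole directory of recordings. The following is a minimal sketch, assuming each wav file should get a text file of the same base name; the function name transcribe_all is hypothetical and is not an mlhub command.

```shell
# transcribe_all [DIR]: transcribe every wav file in DIR (default: the
# current directory), writing each transcript to a matching .txt file.
# (Hypothetical helper built on the ml transcribe azspeech command.)
transcribe_all() {
    dir="${1:-.}"
    for f in "$dir"/*.wav; do
        # Skip the unexpanded pattern when the directory has no wav files.
        [ -e "$f" ] || continue
        ml transcribe azspeech "$f" > "${f%.wav}.txt"
    done
}
```

Calling transcribe_all recordings would then leave recordings/harvard.txt next to recordings/harvard.wav, and so on for each file. Each file is sent to the service in turn, so a directory of large files can take some time.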

The transcribe command will only record up to 15 seconds. To transcribe more than this from your own voice, save the recording to a file and then transcribe that file. See the section on recording audio in the GNU/Linux Desktop Survival Guide for details.

A powerful graphical tool for recording audio is audacity, while a simple command line application for recording from the computer’s microphone is arecord.

arecord -f S16_LE myrecording.wav

Terminate the recording session with Ctrl-C and then transcribe the recording:

ml transcribe azspeech myrecording.wav > myrecording.txt
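For a recording of known length, the two steps can be combined so that no Ctrl-C is needed. This is only a sketch: the function name record_and_transcribe is hypothetical, the -d option asks arecord to stop after the given number of seconds, and the 16 kHz mono settings (-r 16000 -c 1) are an assumption about a suitable input format rather than something this command is documented to require.

```shell
# record_and_transcribe SECONDS NAME: record SECONDS of audio from the
# microphone into NAME.wav, then transcribe it into NAME.txt.
# (Hypothetical helper; the sample rate and channel count are assumptions.)
record_and_transcribe() {
    secs="$1"
    name="$2"
    arecord -f S16_LE -r 16000 -c 1 -d "$secs" "$name.wav" &&
        ml transcribe azspeech "$name.wav" > "$name.txt"
}
```

For example, record_and_transcribe 60 memo would record one minute into memo.wav and write the transcript to memo.txt. The transcription only runs if the recording step succeeds.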


Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0