13 Azure Speech

20200806 For humans, speech is an integral part of communication. The MLHub package azspeech can transcribe speech to text and can synthesise speech from text. To do this the package uses pre-built speech models provided through Azure’s Cognitive Services. These models support many languages and voices, so within a pipeline a male English speaker can generate speech presented in a female French voice.

Most of the commands provided by the package accept an audio file or record audio from the computer’s microphone, and play any synthesised audio through the computer’s speakers.

To install, configure, and demonstrate the package:

ml install   azspeech
ml configure azspeech
ml readme    azspeech
ml commands  azspeech
ml demo      azspeech
ml gui       azspeech

In addition to the demo command the package supports synthesize and transcribe:

ml synthesize azspeech myspeech.txt
ml transcribe azspeech myspeech.wav
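
Because these are ordinary command line tools, they can be wrapped in a small script. The following is a minimal sketch only, not part of the package: it assumes that ml transcribe writes the transcript to standard output, and the function name transcribe_dir is hypothetical. It transcribes every WAV file in a directory and saves each transcript next to its audio file.

```shell
#!/bin/sh
# Sketch: batch transcription with the azspeech transcribe command.
# Assumption: `ml transcribe azspeech FILE` prints the transcript to stdout.
# transcribe_dir is a hypothetical helper name, not provided by MLHub.

transcribe_dir() {
    dir="$1"
    for wav in "$dir"/*.wav; do
        # If nothing matches, the glob is left unexpanded, so skip it.
        [ -e "$wav" ] || continue
        # Replace the .wav suffix with .txt for the transcript file.
        txt="${wav%.wav}.txt"
        ml transcribe azspeech "$wav" > "$txt"
    done
}
```

After sourcing the script, calling transcribe_dir recordings would process each file such as recordings/myspeech.wav and write recordings/myspeech.txt, provided the assumption about standard output holds.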

The source code for this MLHub package is available from GitHub: https://github.com/gjwgit/azspeech.

Azure-based models, unlike MLHub models in general, use closed source services, which have no guarantee of ongoing availability and do not come with the freedom to modify and share. This cloud-based service also sends your text (for synthesis) and audio (for transcription) to Azure for analysis.

Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.