This labnote was last updated on 1/1/2020
Basic speech command recognition with PyTorch on the Raspberry Pi.
This project is based on the example on the PyTorch website found here:
We use a slightly modified version of the example program to train the ANN to recognize the spoken commands. This program runs on a high-end desktop with an NVIDIA GeForce GTX 1080 Ti GPU.
The second program is for the Raspberry Pi and uses the trained ANN to recognize the spoken commands. We used PyAudio instead of TorchAudio for this version. At the time of this article, no readily available version of TorchAudio existed for the Raspberry Pi.
The training data is from the SpeechCommands dataset, which includes 35 commands and can be found here:
- Raspberry Pi (3 or 4)
- Raspberry Pi OS 64-bit
The training program is set up to take advantage of a GPU if one is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
We added some code to save and load the model:
If you want to start with a clean ANN, you must comment this code out.
After the ANN is trained, or when the program exits, the ANN model is saved:
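A minimal sketch of this kind of save and load logic is shown below. It assumes the model is stored as a state_dict with torch.save; whether the project saves the weights or the whole model object is not stated here. The helper names load_if_present and save_model are hypothetical, and the default file name is taken from the trained ANN file listed with this project.

```python
import os

import torch
from torch import nn

# File name taken from the project's file list; the storage format is an assumption.
MODEL_PATH = "Speech001"


def load_if_present(model: nn.Module, path: str = MODEL_PATH) -> bool:
    """Load saved weights into `model` if `path` exists; return True when loaded."""
    if not os.path.exists(path):
        return False  # nothing saved yet, so training starts from a clean ANN
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state)
    return True


def save_model(model: nn.Module, path: str = MODEL_PATH) -> None:
    """Write the model weights (state_dict) to `path`."""
    torch.save(model.state_dict(), path)
```

Calling load_if_present(model) before training and save_model(model) in a finally block around the training loop would give the behavior described above: training resumes from the saved weights, and the weights are written out even if the program is interrupted. Skipping the load call corresponds to commenting the load code out.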
SpeechCommands version 0.02 was used for this project.
It includes the following commands:
This project includes three Python 3 programs and a trained copy of the ANN.
- SpeechCommand_Trainer - full program including code to train the ANN
- Speech_Test_Mic – testing code only
- Speech_Test_Mic_RPi – testing code only, modified to run on a Raspberry Pi 4
- Trained ANN file (Speech001)
Source Code can be downloaded here: SpeechCommand_PiTorch_Rpi
We added code to allow a spoken command to be tested against the trained ANN.
Press Enter, then when “Recording” appears, speak a command.
Type “exit” to end the program.
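The prompt loop described above could look roughly like the sketch below. The function name run_test_loop and the recognize callable (which would record from the microphone and return the predicted word) are hypothetical and stand in for the project's own code. The read_line and write parameters exist only so the loop can be exercised without a keyboard.

```python
def run_test_loop(recognize, read_line=input, write=print):
    """Prompt until the user types 'exit'; return the number of commands tested.

    recognize: hypothetical callable that records audio and returns the predicted word.
    """
    tested = 0
    while True:
        typed = read_line("Press Enter to record, or type exit to quit: ")
        if typed.strip().lower() == "exit":
            break
        write("Recording")  # the user speaks the command after this appears
        write("Predicted: " + recognize())
        tested += 1
    return tested
```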
Raspberry Pi Version
To run the test application on the Raspberry Pi, we had to remove the TorchAudio library. PyAudio is used instead to handle the microphone requirements.
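As a rough illustration of capturing microphone audio with PyAudio instead of TorchAudio, the sketch below records 16-bit mono samples and scales them to floating-point values between -1 and 1, the range a waveform loaded by TorchAudio would normally have. The sample rate, buffer size, recording length, and function names are assumptions, not values taken from the project code.

```python
from array import array

SAMPLE_RATE = 16000  # assumed: SpeechCommands clips are recorded at 16 kHz
CHUNK = 1024         # assumed buffer size in frames
SECONDS = 1          # assumed: roughly one second per spoken command


def record_pcm16(seconds=SECONDS, rate=SAMPLE_RATE, chunk=CHUNK):
    """Record mono 16-bit audio from the default microphone; return raw bytes."""
    import pyaudio  # imported here so the conversion helper works without PyAudio

    pa = pyaudio.PyAudio()
    try:
        stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                         input=True, frames_per_buffer=chunk)
        try:
            n_chunks = int(rate / chunk * seconds)
            frames = [stream.read(chunk) for _ in range(n_chunks)]
        finally:
            stream.stop_stream()
            stream.close()
    finally:
        pa.terminate()
    return b"".join(frames)


def pcm16_to_floats(raw):
    """Convert native-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    samples = array("h")
    samples.frombytes(raw)
    return [s / 32768.0 for s in samples]
```

The float list from pcm16_to_floats(record_pcm16()) could then be wrapped in a tensor and passed through the same preprocessing the model saw during training.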
The word labels are hard-coded in the test code.
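A hard-coded label table might look like the sketch below. The list holds the 35 SpeechCommands v0.02 words in alphabetical order, which is the order the PyTorch tutorial produces by sorting the dataset labels. Whether the project's test code uses exactly this order is an assumption, and it must match the order used during training for the predictions to make sense. The helper names are illustrative.

```python
# The 35 SpeechCommands v0.02 words, sorted alphabetically (assumed order).
LABELS = [
    "backward", "bed", "bird", "cat", "dog", "down", "eight", "five",
    "follow", "forward", "four", "go", "happy", "house", "learn", "left",
    "marvin", "nine", "no", "off", "on", "one", "right", "seven", "sheila",
    "six", "stop", "three", "tree", "two", "up", "visual", "wow", "yes",
    "zero",
]


def label_to_index(word):
    """Return the class index for a word."""
    return LABELS.index(word)


def index_to_label(index):
    """Return the word for a class index."""
    return LABELS[index]


def top_label(scores):
    """Return the word whose score is highest; `scores` has one entry per label."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return LABELS[best]
```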
Error trapping has been added to suppress ALSA warnings.
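## Added to suppress ALSA warnings
from ctypes import CFUNCTYPE, c_char_p, c_int, cdll
# Signature of the error callback that ALSA invokes
ERROR_HANDLER_FUNC = CFUNCTYPE(None, c_char_p, c_int, c_char_p, c_int, c_char_p)
def py_error_handler(filename, line, function, err, fmt):
    pass  # discard the message instead of printing it
c_error_handler = ERROR_HANDLER_FUNC(py_error_handler)  # keep this reference alive
asound = cdll.LoadLibrary('libasound.so')
asound.snd_lib_error_set_handler(c_error_handler)  # install the silent handler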