Raspberry Pi Speech Commands with PyTorch

 

Raspberry Pi Speech Recognition

This labnote was last updated on 1/1/2020

Overview

Basic speech command recognition with PyTorch on the Raspberry Pi.

This project is based on the example on the PyTorch website found here:

https://pytorch.org/tutorials/intermediate/speech_command_recognition_with_torchaudio.html

 

We use a slightly modified version of the example program to train the ANN to recognize the spoken commands.  This program runs on a high-end desktop with an NVIDIA GeForce GTX 1080 Ti GPU.

The second program is for the Raspberry Pi and uses the trained ANN to recognize the spoken commands.  We used PyAudio instead of TorchAudio for this version.  At the time of this article, a ready-made build of TorchAudio was not available for the Raspberry Pi.

The training data is from the SpeechCommands dataset, which includes 35 commands.  The dataset is described here:

https://arxiv.org/abs/1804.03209

 

 

System Requirements

  • Raspberry Pi (3 or 4)
  • Raspberry Pi OS 64-bit
  • PyTorch
  • Microphone

 

Training Program

The training program is set up to take advantage of a GPU if one is available.

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

print(device)

We added some code to save and load the model:

print("Loading weights...\n")

model = torch.load(r'D:\Pytorch\Speech001')

print("Weights Loaded!\n")

If you want to start with a clean ANN, comment this code out.
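One way to avoid commenting the load code in and out is to load the saved model only when the file exists. The following is a minimal sketch under that assumption, not the project's actual code; load_or_create, create_model, and load_model are hypothetical names introduced for illustration.

```python
import os

def load_or_create(path, create_model, load_model):
    """Return the saved model at `path` if it exists, otherwise a fresh one.

    `create_model` and `load_model` are hypothetical callables supplied by
    the caller; they are not part of the original program.
    """
    if os.path.exists(path):
        print("Loading weights...\n")
        model = load_model(path)
        print("Weights Loaded!\n")
        return model
    print("No saved model found, starting with a clean ANN.\n")
    return create_model()
```

With PyTorch, load_model could be torch.load and create_model a function that builds the untrained network. Recent PyTorch releases may need weights_only=False when a whole pickled model is loaded this way.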

 

After the ANN is trained, or when the program exits, the ANN model is saved:

model.saveWeights(model)
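Saving both after training and on exit can be expressed with a try/finally block. This is only a sketch of that pattern, assuming the training step and the save step are passed in as functions; run_training, train, and save are hypothetical names, and the original program may be organized differently.

```python
def run_training(train, save):
    """Run `train()` and always call `save()` afterwards.

    Both arguments are hypothetical callables: `train` runs the training
    epochs and `save` writes the model to disk.
    """
    try:
        train()
    except KeyboardInterrupt:
        # Ctrl+C stops training early; the model is still saved below.
        print("Training interrupted.")
    finally:
        save()
```

Because the save call sits in the finally clause, it runs whether training finishes normally or is stopped from the keyboard.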

 

 

Training Data

SpeechCommands version 0.02 was used for this project.

It includes the following commands:

  • backward
  • bed
  • bird
  • cat
  • dog
  • down
  • eight
  • five
  • follow
  • forward
  • four
  • go
  • happy
  • house
  • learn
  • left
  • marvin
  • nine
  • no
  • off
  • on
  • one
  • right
  • seven
  • sheila
  • six
  • stop
  • three
  • tree
  • two
  • up
  • visual
  • wow
  • yes
  • zero

Source Code

This project includes three Python 3 programs and a trained copy of the ANN.

  • SpeechCommand_Trainer – full program, including code to train the ANN
  • Speech_Test_Mic – testing code only
  • Speech_Test_Mic_RPi – testing code only, modified to run on a Raspberry Pi 4
  • Trained ANN file (Speech001)

Source Code can be downloaded here: SpeechCommand_PiTorch_Rpi

Testing

We added code to allow a spoken command to be tested against the trained ANN.

Press Enter, then speak a command when “Recording” appears.

 

Speech Recognition Testing

Type “exit” to end the program.

Speech Recognition testing
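The test interaction described above (press Enter to record, type “exit” to quit) can be sketched as a small loop. This is an illustration only: command_loop and record_and_classify are hypothetical names, and the recording and classification steps are left to a function supplied by the caller.

```python
def command_loop(record_and_classify, read_line=input, write=print):
    """Prompt until the user types "exit".

    `record_and_classify` is a hypothetical callable that records one
    spoken command and returns the recognized label as a string.
    """
    while True:
        entry = read_line("Press Enter to record, or type exit to quit: ")
        if entry.strip().lower() == "exit":
            break
        write("Recording")
        write("Heard:", record_and_classify())
```

Passing read_line and write as parameters keeps the loop easy to try without a microphone or a keyboard.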

 

Raspberry Pi Version

Speech Recognition with a Raspberry Pi 4

 

Speech_Test_NoPyTorchAudio comments:

To run the test application on the Raspberry Pi, we had to remove the TorchAudio library.  PyAudio is used instead to handle the microphone requirements.
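As a rough sketch of how PyAudio can stand in for TorchAudio on the microphone side, the code below records 16-bit mono audio and converts it to floating-point samples. The 16 kHz sample rate, the one-second clip length, and the names pcm16_to_float and record_clip are assumptions made for illustration; the sample rate must match whatever the ANN was trained on.

```python
from array import array

SAMPLE_RATE = 16000  # assumption: must match the rate the ANN was trained on

def pcm16_to_float(raw):
    """Convert native-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    samples = array("h")
    samples.frombytes(raw)
    return [s / 32768.0 for s in samples]

def record_clip(seconds=1.0):
    """Record mono audio from the default microphone and return float samples."""
    import pyaudio  # imported here so pcm16_to_float works without PyAudio

    audio = pyaudio.PyAudio()
    try:
        stream = audio.open(format=pyaudio.paInt16, channels=1,
                            rate=SAMPLE_RATE, input=True,
                            frames_per_buffer=1024)
        try:
            # Overflow errors are ignored so a slow reader does not abort
            raw = stream.read(int(SAMPLE_RATE * seconds),
                              exception_on_overflow=False)
        finally:
            stream.stop_stream()
            stream.close()
    finally:
        audio.terminate()
    return pcm16_to_float(raw)
```

The returned list can then be turned into a tensor (for example with torch.tensor) and shaped the way the trained model expects.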

The word labels are hard-coded in the test code.
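A hard-coded label list might look like the sketch below, which maps the highest-scoring output of the network back to a word. The list assumes the full 35-word SpeechCommands vocabulary in alphabetical order, as produced by the PyTorch tutorial; the order must match the one used during training, and LABELS and best_label are hypothetical names.

```python
# Assumption: 35 labels, sorted alphabetically as in the PyTorch tutorial
LABELS = [
    "backward", "bed", "bird", "cat", "dog", "down", "eight", "five",
    "follow", "forward", "four", "go", "happy", "house", "learn", "left",
    "marvin", "nine", "no", "off", "on", "one", "right", "seven", "sheila",
    "six", "stop", "three", "tree", "two", "up", "visual", "wow", "yes",
    "zero",
]

def best_label(scores):
    """Return the label whose score is highest.

    `scores` is a sequence with one number per label, for example the
    model output converted to a plain Python list.
    """
    index = max(range(len(scores)), key=lambda i: scores[i])
    return LABELS[index]
```

Hard-coding the list removes the need to load the dataset on the Raspberry Pi just to recover the label names.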

Error trapping has been added to suppress ALSA warnings.

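## Added to suppress ALSA warnings
from contextlib import contextmanager
from ctypes import CFUNCTYPE, c_char_p, c_int, cdll

# Signature of the ALSA error callback: (filename, line, function, err, fmt)
ERROR_HANDLER_FUNC = CFUNCTYPE(None, c_char_p, c_int, c_char_p, c_int, c_char_p)

def py_error_handler(filename, line, function, err, fmt):
    pass  # ignore the message

# Keep a module-level reference so the callback is not garbage collected
c_error_handler = ERROR_HANDLER_FUNC(py_error_handler)

@contextmanager
def noalsaerr():
    asound = cdll.LoadLibrary('libasound.so')
    asound.snd_lib_error_set_handler(c_error_handler)
    try:
        yield
    finally:
        # Restore the default handler even if the wrapped code raises
        asound.snd_lib_error_set_handler(None)

# Usage: wrap the PyAudio start-up, e.g. "with noalsaerr(): p = pyaudio.PyAudio()"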

 


Copyright 2022 - Zagros Robotics, All Rights Reserved - Please send webpage comments or corrections to webmaster@zagrosrobotics.com - Zagros Robotics,PO Box 460342, St. Louis, MO 63146, info@zagrosrobotics.com for answers to any questions.