Hotword Detection with Raspberry Pi

About / Synopsis

Activates a user-defined function after a hotword is detected.
Project status: prototype

See real examples:

Installation

Clone this repository to your Raspberry Pi.
Create a Python virtual environment and install the packages from requirements.txt.

Usage

$ python3 main.py

Screenshots

Below is an example of a Mel Spectrogram, the model prediction, and the true labels for one training example from the dataset. Dev set accuracy is 97.7%. More images can be found in the images directory.

Code

Content

The idea for this project is based on real-world voice assistants such as Alexa, Siri, and Google Assistant. Additional background knowledge came from Coursera.

Requirements

See requirements.txt

Additional setup notes

License

Apache License, Version 2.0

About project

This project is part of my bachelor's degree project. The idea of not using an end-to-end solution for a simple voice assistant, and instead filtering requests by a hotword, was taken from the real-world examples listed above. The data preprocessing step was significantly modified to improve model performance. In my case, using a Mel Spectrogram as the input preprocessing step improved dev set accuracy by about 8 percentage points (from 89% to 97%).
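To make the Mel Spectrogram preprocessing step more concrete, here is a minimal sketch that uses only the Python standard library. It is an illustration under stated assumptions, not the preprocessing code used in this project: the function names, the HTK-style mel formula, the Hann window, and the parameter values (`n_fft`, `hop`, `n_mels`) are all chosen for the example. In practice an audio library such as librosa would compute this much faster.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert Hz to mel (HTK-style formula, an assumption for this sketch)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build n_mels triangular filters over the n_fft // 2 + 1 DFT bins."""
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge points, equally spaced on the mel scale, mapped back to Hz.
    edges = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        # Triangle: rises from lo to mid, falls from mid to hi, zero elsewhere.
        bank.append([
            max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
            for f in bin_hz
        ])
    return bank


def power_spectrum(frame):
    """Power of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(frame)
    # Hann window to reduce spectral leakage.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))) ** 2
        for k in range(n // 2 + 1)
    ]


def mel_spectrogram(samples, sample_rate, n_fft=256, hop=128, n_mels=20):
    """Return a list of frames, each a list of n_mels log-mel energies in dB."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(samples) - n_fft + 1, hop):
        power = power_spectrum(samples[start:start + n_fft])
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Log compression with a small floor to avoid log(0).
        frames.append([10.0 * math.log10(e + 1e-10) for e in energies])
    return frames


# Usage: a 1 kHz test tone sampled at 8 kHz (synthetic data, not from the dataset).
tone = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(1024)]
spec = mel_spectrogram(tone, 8000)
print(len(spec), len(spec[0]))  # 7 20
```

Each frame compresses 129 linear frequency bins into 20 mel bands, which is narrower at low frequencies and wider at high ones. That is the general motivation for feeding mel features to a speech model instead of a raw spectrogram.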

The model architecture was taken from the final course of Andrew Ng's Deep Learning series on Coursera. The only difference is the number of input and output nodes. Below you can find the complete architecture of the model:

The dataset was generated with the help of Google Cloud Text-to-Speech. Using WaveNet voices, I generated about 50 one-second audio clips of the trigger phrase. Background audio was created by cutting and converting YouTube videos such as this one. Half of the trigger words were recorded by me using Jupyter Notebook scripts, which will be posted in a separate directory. Labels were marked immediately after every trigger word. Below are examples of such marking (the first one was taken from the Coursera Sequence Models course, and the second one was created by me):
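The labeling scheme described above (marking the label sequence right after each trigger word ends) can be sketched as follows. This is a hypothetical helper, not code from this repository: the function name `insert_ones`, the clip length of 10 seconds, the 1375 output steps, and the 50 marked steps are assumptions for illustration and may differ from the values used in this project.

```python
def insert_ones(labels, segment_end_ms, clip_ms=10_000, ones_after=50):
    """Return a copy of `labels` with 1s written right after a trigger word.

    labels:         list of 0/1 values, one per model output step
    segment_end_ms: time at which the trigger word ends in the clip
    clip_ms:        total clip length in milliseconds (assumed value)
    ones_after:     how many output steps to mark as positive (assumed value)
    """
    n_steps = len(labels)
    # Map the end time in milliseconds onto the model's output time axis.
    end_step = int(segment_end_ms * n_steps / clip_ms)
    marked = list(labels)
    # Mark the steps that follow the end of the trigger word, clipped to the
    # length of the label sequence.
    for step in range(end_step + 1, min(end_step + 1 + ones_after, n_steps)):
        marked[step] = 1
    return marked


# Usage: one trigger word ending at 5.0 s in a 10 s clip with 1375 output steps.
labels = insert_ones([0] * 1375, segment_end_ms=5000)
print(sum(labels))  # 50
```

Marking a run of steps after the word, rather than a single step, gives the model more positive targets to learn from, since positives are rare compared with background steps.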

As the images above show, I chose the phrase "Listen comrade" as the trigger word. The model should play a distinct notification sound whenever someone speaks the trigger word.
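One way to turn per-step model outputs into a single activation is to require several consecutive steps above a threshold before firing a callback. The sketch below illustrates that idea only; it is not the logic in main.py. The names `detect_hotword` and `run_on_detection`, the threshold of 0.5, and the run length of 20 steps are assumptions.

```python
def detect_hotword(probabilities, threshold=0.5, min_consecutive=20):
    """Return the step indices at which a detection fires.

    A detection fires once per run, at the moment the run of steps with
    probability above `threshold` reaches `min_consecutive` steps.
    """
    detections = []
    run = 0
    for step, p in enumerate(probabilities):
        run = run + 1 if p > threshold else 0
        if run == min_consecutive:
            detections.append(step)
    return detections


def run_on_detection(probabilities, on_hotword, **kwargs):
    """Call the user-supplied `on_hotword(step)` for every detection."""
    for step in detect_hotword(probabilities, **kwargs):
        on_hotword(step)


# Usage with synthetic model outputs: two long runs and one short run that
# is too brief to count as a detection.
probs = [0.1] * 10 + [0.9] * 30 + [0.1] * 10 + [0.9] * 5 + [0.1] * 5 + [0.9] * 25
fired = []
run_on_detection(probs, fired.append)
print(fired)  # [29, 79]
```

In a real setup the callback would play the notification sound or start the user function. Requiring a run of positive steps keeps a single noisy prediction from triggering it.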
