Speaking with devices (and making them answer or do something) seems to be a trend of our time. Some current smartphones and tablets allow you to use speech to trigger internet searches, to write short messages or e-mails (sometimes with funny results), to ask about something nearby, to turn on the smartphone's light, … .
In addition to voice control on smartphones, well-known companies have started to launch devices that enable voice control at home. However, the commercial solutions perform speech recognition on the companies' own servers, AFAIK because of the computing power that the AI behind them requires.
In this case one has to live with the fact that a constant internet connection is required and that one's own voice samples are uploaded somewhere else for analysis.
Still, speech control can be extremely useful. My favourite example for illustration is setting a timer while being busy with something else.
In a smart home, though, there are many more applications for speech control: light, heating, media, … . Especially for elderly, disabled or visually impaired people, controlling everyday tasks with their own voice can be a huge advantage in daily life.
Open Source Solution: Jasper
The open source solution for voice control, Jasper, offers the possibility to work offline, but setting up the software is not trivial. The manual appears to be outdated, and some experimental but required libraries are no longer easy to find. This is why I turned to the API of a commercial solution to play with speech recognition on my Raspberry Pi 3.
At the moment speech recognition devices such as Amazon’s Alexa are not yet sold everywhere. It is possible to order them in Europe, but they are not shipped yet. Rumour has it that regions in which stronger accents are spoken are served first. 🙂
The voice service used by Amazon’s Alexa devices can be tested relatively easily on a Raspberry Pi 3. For a couple of weeks now, wake word detection has been possible with this solution on the Raspberry Pi 3 as well.
Amazon Developer Account Settings
An Amazon developer account is required for using the voice service. The registration is free. After the registration an Alexa device has to be created along with security and web settings. On this page the required steps are explained. Save the client ID and secret for later.
This GitHub project contains the required installation software for download:
git clone https://github.com/alexa/alexa-avs-sample-app.git
The software is set up by running the automated_install shell script. The script first has to be completed with the product name, client ID and secret. It then guides you through the configuration and setup.
After a successful installation, the companion service, the AVS client and the desired wake word agent have to be launched in three separate terminals.
The AVS client requires authorization by signing in with the Amazon developer account. On request, the default browser is opened, and after the confirmation Alexa is ready to listen.
On the Raspberry Pi, Alexa starts to listen more closely either at the push of a button or on hearing the wake word ‘Alexa’. It confirms with a sound that it is listening. The words spoken next (they should be English) are then analyzed. A longer pause between words marks the end of the sentence.
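To illustrate the idea that a longer pause marks the end of the sentence, here is a minimal sketch in Python. It is not how the Alexa voice service actually does it, which I do not know. The function name, the loudness threshold and the number of quiet frames are all my own assumptions, and the input is just a list of made-up loudness values, one per short audio frame.

```python
from typing import Optional, Sequence


def find_utterance_end(
    frame_levels: Sequence[float],
    silence_threshold: float = 0.1,  # assumed loudness below which a frame counts as quiet
    max_silent_frames: int = 3,      # assumed number of quiet frames that make a "longer pause"
) -> Optional[int]:
    """Return the index of the first frame of the pause that ends the utterance.

    Returns None if no sufficiently long pause follows any speech.
    """
    heard_speech = False
    silent_run = 0
    for index, level in enumerate(frame_levels):
        if level >= silence_threshold:
            # Loud frame: speech is (still) going on, so reset the pause counter.
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            # Quiet frame after speech: count it towards the pause.
            silent_run += 1
            if silent_run >= max_silent_frames:
                return index - max_silent_frames + 1
        # Quiet frames before any speech are ignored.
    return None


if __name__ == "__main__":
    # Hypothetical loudness values: silence, two words with a short gap, then a long pause.
    levels = [0.0, 0.0, 0.6, 0.7, 0.05, 0.8, 0.02, 0.01, 0.03, 0.9]
    print(find_utterance_end(levels))
```

The short gap between the two words (a single quiet frame) does not end the sentence in this sketch; only the run of three quiet frames does. A larger max_silent_frames would tolerate slower speakers, but it would also make the answer feel slower.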
Alexa’s answers are returned quickly! Out of the box it is possible to ask for the current weather at a specific location, to ask for a joke, to convert units, to look something up in Wikipedia, etc. Alexa can be connected to a calendar, it can calculate and it knows its “birthday” (being the day it was first sold). That’s not all…
What surprised me was how my low-cost microphone behaved in combination with Alexa. The first tests on various operating systems were devastating: I had to speak from a distance of 1 cm to be heard at all, independent of the recording settings. I thought that having to be close to the microphone to use speech recognition might even be some kind of safety precaution … but Alexa immediately worked from a distance of 2 m as well. It felt a bit slower, but still, it worked…
When I played the video I had recorded of my running system telling a joke, the system started itself again on hearing the wake word from the video! It has already been shown that infinite loops of voice control can be set up easily: https://www.youtube.com/watch?v=ZfCfTYZJWtI . Alexa might also react to its wake word spoken on TV, as recently learnt from The Verge’s doll house article!
Alexa can be extended with custom skills for one’s own applications. Perhaps this is the thing to try next.