Voice recognition accuracy problems with the “long” can speak tone, and news about version 2.0.0

I recently found that the “long” can speak tone can reduce the accuracy of the Google Voice Recognizer when Voice Control is used in speaker-phone mode. This happens because the recognizer engine doesn't start listening as soon as Voice Control asks it to. It starts after a delay of a few seconds, which the engine uses to calibrate itself against the background noise in order to improve its accuracy.
The problem with the “long” can speak tone is that it starts when the engine is asked to listen and stops only when the engine is finally ready. The tone is therefore heard while the engine is calibrating and is interpreted as loud background noise, which reduces the overall accuracy of the recognition.

If you have any problems with the overall accuracy of the recognizer, try using the “short” can speak tone.
However, be aware that the “short” tone is played when the recognizer notifies Voice Control that it's ready to listen. Although Voice Control tries to mute the microphone while the tone plays, the recognizer may still hear it, which can end the listening phase (this is caused by unavoidable low-level hardware restrictions on some devices that prevent the microphone from being muted).

In a future version I will try to add an option to disable the can speak tone and use an alternative notification method (such as flashing the screen) when using Voice Control in speaker-phone mode.

Now some news about the status of 2.0.0!
I'm quite happy to say that the feature for dictating complex actions in a single sentence is almost done, and it's working nicely with the send message and add event actions. I only need to add a forced confirmation request when the dictated action contains free-text parts (like a message text, an event title and so on), to make sure those parts are interpreted correctly.

However, while coding this feature, I also found that the voice recognizer sometimes returns dictated numbers in textual form (it returns the text “twenty” instead of “20”). This causes problems with the add event action, which is unable to parse times containing such text. I have tried to handle this situation in the next release, so this problem should be fixed.
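
As an illustration only, here is a minimal Python sketch of how spelled-out numbers could be turned back into digits before a time is parsed. It is not the code used in Voice Control: the names normalize_numbers, UNITS and TENS are hypothetical, and the sketch assumes English input without punctuation and only covers numbers from 0 to 99.

```python
# Hypothetical lookup tables for English; each supported language would need its own.
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
    "ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13,
    "fourteen": 14, "fifteen": 15, "sixteen": 16,
    "seventeen": 17, "eighteen": 18, "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def _word_value(token):
    """Return the number for one token ("seven", "twenty-one"), or None."""
    word = token.lower()
    if word in UNITS:
        return UNITS[word]
    if word in TENS:
        return TENS[word]
    # Hyphenated form: a tens word followed by a unit from 1 to 9.
    tens, sep, unit = word.partition("-")
    if sep and tens in TENS and 1 <= UNITS.get(unit, 0) <= 9:
        return TENS[tens] + UNITS[unit]
    return None


def normalize_numbers(text):
    """Replace spelled-out numbers (0 to 99) with digits; keep other words as they are."""
    tokens = text.split()
    out = []
    i = 0
    while i < len(tokens):
        word = tokens[i].lower()
        nxt = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
        if word in TENS and 1 <= UNITS.get(nxt, 0) <= 9:
            # "twenty one" is a single number spread over two tokens.
            out.append(str(TENS[word] + UNITS[nxt]))
            i += 2
            continue
        value = _word_value(tokens[i])
        out.append(tokens[i] if value is None else str(value))
        i += 1
    return " ".join(out)


# "add event dinner at twenty thirty" becomes "add event dinner at 20 30"
print(normalize_numbers("add event dinner at twenty thirty"))
```

A step like this should probably be applied only to the part of the sentence that is expected to contain a time, because words such as “one” also appear in ordinary text and would be converted too.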

