Speech Recognition in the News

Decoding radio talk in Uganda
By colbec - 12/4/2016

In Uganda, speech recognition is being applied to radio broadcasts to analyze local issues. The system handles broadcasts in locally accented English and in native languages.

http://pulselabkampala.ug/radiomining/

From the article: "We are focussing on the open source software HTK as a platform for [the speech recognition component]"

Apple jack ax ushers in a voice-driven world: Yahoo News
By colbec - 9/8/2016

https://www.yahoo.com/news/apple-jack-ax-ushers-voice-191944728.html

Mozilla's Vaani Voice of IOT
By kmaclean - 6/28/2016

Mozilla has pivoted Vaani to be the Voice of IOT.   Vaani was originally an "on-device" virtual assistant for FirefoxOS.  Now they have 3 new projects related to creating a virtual assistant for the Internet of Things:

DeepSpeech: an open source speech recognition engine. It is based on Baidu’s research and will use Google's TensorFlow machine learning framework. It’s currently in early development.

Pipsqueak: a longer-term goal to create a new speech recognition engine that implements cutting-edge technology to allow Vaani to work completely off-line while still allowing for the high quality speech recognition users have become used to.

Murmur: a simple webapp for collecting speech samples to train speech recognition engines.  They want to slowly build a speech corpus to train their open source models.

One thing to note is that, although they want to create their own speech corpus, for now they plan to use a purchased speech corpus for their acoustic models.

Mycroft AI and OpenSTT
By kmaclean - 6/17/2016

Mycroft AI, Inc. has released an open source platform called Mycroft Core that promises to allow users to "use natural language to control the Internet of Things". The Mycroft framework also includes an intent parser called Adapt and a TTS engine (based on CMU's Flite) called Mimic. For speech recognition they are currently using Google's cloud-based speech recognition service.

They've also created a reference hardware implementation based on Raspberry Pi and Arduino and have had successful Kickstarter and Indiegogo campaigns to raise funds.

Since their stated goal is to provide an open source alternative to the likes of Amazon Echo, the Mycroft AI group has started a new initiative called OpenSTT (an Open Source Speech To Text project) that aims to create "open source speech-to-text models"... likely for Kaldi.
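
For anyone wondering what the intent parsing step does once the recognizer has returned text, here is a minimal keyword-matching sketch. It is not the Adapt API; the intent names, the keyword sets, and the `parse_intent` function are all made up for illustration, and a real parser would handle far more variation.

```python
import re

# Hypothetical intents and keywords, invented for this example.
INTENTS = {
    "lights_on": {"required": {"turn", "on"}, "optional": {"light", "lights", "lamp"}},
    "lights_off": {"required": {"turn", "off"}, "optional": {"light", "lights", "lamp"}},
    "weather": {"required": {"weather"}, "optional": {"today", "tomorrow"}},
}


def parse_intent(utterance):
    """Return the best-matching intent name for recognized text, or None."""
    tokens = set(re.findall(r"[a-z']+", utterance.lower()))
    best, best_score = None, 0
    for name, spec in INTENTS.items():
        # Every required keyword must be present in the utterance.
        if not spec["required"] <= tokens:
            continue
        # Score: required keywords plus any optional keywords that matched.
        score = len(spec["required"]) + len(spec["optional"] & tokens)
        if score > best_score:
            best, best_score = name, score
    return best


print(parse_intent("Please turn on the kitchen lights"))
print(parse_intent("What's the weather today?"))
```

The text passed to `parse_intent` would come from whichever speech recognizer is in use, which is why swapping the cloud service for an open source engine should not affect this step.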

AlphaGo and voice
By colbec - 1/28/2016

From a Toronto Star article: http://www.thestar.com/news/world/2016/01/27/deepmind-computer-program-beats-humans-at-go.html

"As Hassabis told reporters, the same principles AlphaGo uses have many applications, from better digital personal assistants to improved medical diagnostics and far, far beyond. Because the algorithm is general-purpose, it could respond nimbly to complex information like voice instructions, for example."

From general reading, it looks like AlphaGo uses two neural networks working together: one to prune the search space and one to evaluate the next move.
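
To make the "prune, then evaluate" idea concrete, here is a toy sketch. It is nowhere near the real system (which, as far as I understand, combines its networks with Monte Carlo tree search). The game is single-pile Nim (take 1 to 3 stones, taking the last stone wins), and `policy` and `value` are hand-written stand-ins for the two networks. Everything here is an assumption chosen for illustration.

```python
def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]


def policy(stones):
    """Stand-in for a policy network: a prior score for each legal move."""
    # Heuristic: prefer moves that leave the opponent a multiple of 4.
    return {m: (1.0 if (stones - m) % 4 == 0 else 0.1) for m in legal_moves(stones)}


def value(stones):
    """Stand-in for a value network: estimated outcome for the player to move."""
    return -1.0 if stones % 4 == 0 else 1.0


def search(stones, depth, top_k=2):
    """Negamax that expands only the top_k moves ranked by policy()
    and calls value() at the depth limit instead of playing to the end."""
    if stones == 0:
        return -1.0, None  # the previous player took the last stone
    if depth == 0:
        return value(stones), None
    priors = policy(stones)
    candidates = sorted(priors, key=priors.get, reverse=True)[:top_k]
    best_score, best_move = -2.0, None
    for move in candidates:
        score = -search(stones - move, depth - 1, top_k)[0]
        if score > best_score:
            best_score, best_move = score, move
    return best_score, best_move


print(search(10, depth=2))
```

The policy stand-in narrows how many branches get explored, and the value stand-in limits how deep each branch needs to go.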

Speech Recognition cloud APIs
By kmaclean - 6/30/2015 - 4 Replies

New speech recognition cloud services:

HP: IDOL (Intelligent Data Operating Layer) Speech Recognition API

Amazon: Alexa Voice Service (AVS) 

NTT Com: SkyWay

more:

HPE Haven OnDemand Speech Recognition

IBM Watson Dialog service. (GitHub dialog tool)

qtspeech project
By kmaclean - 9/10/2015

Code-Q O is working on a speech recognition API for Qt using Pocketsphinx.  Source repository.

Speech Recognition at Mozilla
By kmaclean - 8/18/2015 - 1 Replies

Looks like Mozilla is working on a speech recognition front end called Vaani that will allow users to submit speech in different languages directly from Firefox.  This is amazing news for open source speech recognition.

Kelly Davis says that they will make the speech corpus and acoustic models available by the end of this year (2015).

MOVI speech recognizer shield for Arduino
By kmaclean - 7/16/2015

MOVI (My Own Voice Interface) is an offline speech recognizer and voice synthesizer that adds voice control functionality to any Arduino project.

What is interesting is their approach to training the on-board acoustic model:

Training: MOVI’s Arduino API sends the training sentences in textual form over the serial connection to the shield. The shield phonetizes sentences using a 2GB dictionary. The phoneme sequences are used to create a temporal model that assigns higher probabilities to phoneme sequences that occurred in the trained sentences than to those that didn’t.

Given that they say they are using open source algorithms which they intend to provide when the shield is released, it will be interesting to see how they've implemented this.
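
Until the source is available, here is my rough guess at the general idea, written as a phoneme bigram model. This is only a sketch under assumptions: the tiny `LEXICON`, the function names, and the add-alpha smoothing are mine, not MOVI's, and their "temporal model" may work quite differently.

```python
from collections import Counter, defaultdict

# Hypothetical toy pronunciation dictionary (ARPAbet-style symbols).
LEXICON = {
    "turn": ["T", "ER", "N"],
    "on": ["AA", "N"],
    "off": ["AO", "F"],
    "the": ["DH", "AH"],
    "light": ["L", "AY", "T"],
}


def phonetize(sentence):
    """Map each word to its phonemes; unknown words raise KeyError."""
    phones = []
    for word in sentence.lower().split():
        phones.extend(LEXICON[word])
    return phones


def train_bigrams(sentences):
    """Count phoneme-to-phoneme transitions over the training sentences."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        phones = ["<s>"] + phonetize(sentence) + ["</s>"]
        for prev, cur in zip(phones, phones[1:]):
            counts[prev][cur] += 1
    return counts


def bigram_prob(counts, prev, cur, vocab_size, alpha=0.1):
    """Add-alpha smoothed estimate of P(cur | prev)."""
    total = sum(counts[prev].values())
    return (counts[prev][cur] + alpha) / (total + alpha * vocab_size)


counts = train_bigrams(["turn on the light", "turn off the light"])
# Possible next symbols: every phoneme in the lexicon plus the end marker.
vocab_size = len({p for phones in LEXICON.values() for p in phones}) + 1

seen = bigram_prob(counts, "T", "ER", vocab_size)   # occurs in "turn"
unseen = bigram_prob(counts, "T", "F", vocab_size)  # never occurs in training
print(seen > unseen)
```

Transitions that appeared in the training sentences end up with higher probability than ones that did not, which matches the behaviour their description claims.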

Verbis Virtus: a game where you use your voice to cast spells
By kmaclean - 3/26/2015 - 1 Replies

Verbis Virtus is a game by Indomitus Games (an Italian game studio) where you use your voice to cast spells.

They use CMU Sphinx for speech recognition.

Are there any others?