Speech Recognition in the News

Mycroft AI and OpenSST
By kmaclean - 6/17/2016

Mycroft AI, Inc. has released an open source platform called Mycroft Core that promises to allow users to "use natural language to control the Internet of Things". The Mycroft framework also includes an intent parser called Adapt and a TTS engine (based on CMU's Flite) called Mimic. For speech recognition they are currently using Google's cloud-based speech recognition service.

They've also created a reference hardware implementation based on Raspberry Pi and Arduino, and have run successful Kickstarter and Indiegogo campaigns to raise funds.

Since their stated goal is to provide an open source alternative to the likes of Amazon Echo, the Mycroft AI group has started a new initiative called OpenSST (an Open Source Speech To Text project) to create "open source speech-to-text models", likely for Kaldi.

AlphaGo and voice
By colbec - 1/28/2016

From a Toronto Star article: http://www.thestar.com/news/world/2016/01/27/deepmind-computer-program-beats-humans-at-go.html

"As Hassabis told reporters, the same principles AlphaGo uses have many applications, from better digital personal assistants to improved medical diagnostics and far, far beyond. Because the algorithm is general-purpose, it could respond nimbly to complex information like voice instructions, for example."

From general reading, it looks like AlphaGo uses two neural networks working together: one to prune the search space and one to evaluate the next move.
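As a rough illustration of that division of labour, here is a toy Python sketch. It is not AlphaGo's actual algorithm: the "game" (move an integer toward a target), the function names, and both scoring heuristics are hypothetical stand-ins for the two networks, chosen only to show one component narrowing the candidate moves and another scoring the resulting positions.

```python
# Toy sketch only: hand-written heuristics stand in for the two networks.
TARGET = 10                 # hypothetical goal state for the toy game
MOVES = [-2, -1, 1, 2, 3]   # hypothetical legal moves (added to the state)

def policy_scores(state):
    """Stand-in for the pruning network: a prior score for each move.
    Moves heading toward TARGET get higher scores."""
    direction = 1 if TARGET >= state else -1
    return {m: direction * m for m in MOVES}

def value(state):
    """Stand-in for the evaluation network: higher means a better position."""
    return -abs(TARGET - state)

def choose_move(state, top_k=2):
    """Keep only the top_k moves by prior, then pick the best by value."""
    scores = policy_scores(state)
    candidates = sorted(MOVES, key=lambda m: scores[m], reverse=True)[:top_k]
    return max(candidates, key=lambda m: value(state + m))

if __name__ == "__main__":
    print(choose_move(8))
```

With top_k=2 only two of the five moves are ever evaluated, which is the pruning idea in miniature; the real system works at a vastly larger scale and with learned networks.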

Speech Recognition cloud APIs
By kmaclean - 6/30/2015 - 6 Replies

New speech recognition cloud services:

HP: IDOL (Intelligent Data Operating Layer) Speech Recognition API

Amazon: Alexa Voice Service (AVS) 

NTT Com: SkyWay

More:

HPE Haven OnDemand Speech Recognition

IBM Watson Dialog service (GitHub dialog tool)

qtspeech project
By kmaclean - 9/10/2015

Code-Q O is working on a speech recognition API for Qt using Pocketsphinx. Source repository.

Speech Recognition at Mozilla
By kmaclean - 8/18/2015 - 1 Replies

Looks like Mozilla is working on a speech recognition front end called Vaani that will allow users to submit speech in different languages directly from Firefox. This is amazing news for open source speech recognition.

Kelly Davis says that they will make the speech corpus and acoustic models available by the end of this year (2015).

MOVI speech recognizer shield for Arduino
By kmaclean - 7/16/2015

MOVI (My Own Voice Interface) is an offline speech recognizer and voice synthesizer that adds voice control functionality to any Arduino project.

What is interesting is their approach to training the on-board acoustic model:

Training: MOVI’s Arduino API sends the training sentences in textual form over the serial connection to the shield. The shield phonetizes sentences using a 2GB dictionary. The phoneme sequences are used to create a temporal model that assigns higher probabilities to phoneme sequences that occurred in the trained sentences than to those that didn’t.

Given that they say they are using open source algorithms which they intend to provide when the shield is released, it will be interesting to see how they've implemented this.
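Until the source is available, the training step quoted above can only be guessed at. The Python sketch below shows one plausible reading: phonetize each sentence with a dictionary, then build a smoothed phoneme bigram model. Everything in it is an assumption made for illustration, including the tiny TOY_DICT, the function names, and the choice of add-one smoothing; none of it comes from MOVI.

```python
import math
from collections import Counter

# Hypothetical toy pronunciation dictionary (not MOVI's real dictionary).
TOY_DICT = {
    "light": ["L", "AY", "T"],
    "on": ["AA", "N"],
    "off": ["AO", "F"],
}

def phonetize(sentence):
    """Map each word of a sentence to its phonemes via the toy dictionary."""
    phones = []
    for word in sentence.lower().split():
        phones.extend(TOY_DICT[word])
    return phones

def train_bigrams(sentences):
    """Count phoneme bigrams (with boundary markers) over the training set."""
    bigrams, contexts = Counter(), Counter()
    for sentence in sentences:
        seq = ["<s>"] + phonetize(sentence) + ["</s>"]
        for a, b in zip(seq, seq[1:]):
            bigrams[(a, b)] += 1
            contexts[a] += 1
    return bigrams, contexts

def log_prob(phones, bigrams, contexts, vocab_size):
    """Add-one smoothed log-probability of a phoneme sequence."""
    seq = ["<s>"] + phones + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(seq, seq[1:])
    )

# Train on two sentences, then compare a trained and an untrained word order.
bigrams, contexts = train_bigrams(["light on", "light off"])
# Distinct phonemes plus the two boundary markers.
vocab_size = len({p for ps in TOY_DICT.values() for p in ps}) + 2
seen = log_prob(phonetize("light on"), bigrams, contexts, vocab_size)
unseen = log_prob(phonetize("on light"), bigrams, contexts, vocab_size)

if __name__ == "__main__":
    print(seen > unseen)
```

The trained sentence scores higher than the reordered one, which matches the behaviour the quote describes. A real recognizer would combine such a model with acoustic scores.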

Verbis Virtus: a game where you use your voice to cast spells
By kmaclean - 3/26/2015 - 1 Replies

Verbis Virtus is a game by Indomitus Games (an Italian game studio) where you use your voice to cast spells.

They use CMU Sphinx for speech recognition.

Are there any others?

ILA personal digital voice assistant
By Floriq - 3/20/2015

A personal digital voice assistant based on Sphinx-4 (also supports Google and Pocketsphinx). It's offline (if you want), highly customizable, and you can teach it(/her/him ^^) new commands. Besides that, it comes with a nice GUI and runs on Linux, Mac and Windows.

https://sites.google.com/site/ilavoiceassistant/

cu,

Florian

Sirius: An Open Intelligent Personal Assistant
By kmaclean - 3/16/2015

From the Sirius site:

Sirius is an open end-to-end standalone speech and vision based intelligent personal assistant (IPA) service similar to Apple’s Siri, Google’s Google Now, Microsoft’s Cortana, and Amazon’s Echo. Sirius implements the core functionalities of an IPA including speech recognition, image matching, natural language processing and a question-and-answer system.

It can be run using Sphinx (sphinxbase and pocketsphinx) or Kaldi.

Check it out on GitHub.

IBM TrueNorth chips and voice
By colbec - 8/18/2014

From http://www.washingtonpost.com/news/speaking-of-science/wp/2014/08/07/ibm-announces-the-most-brain-like-computer-chip-to-date/

"The chip, IBM researchers wrote, will help computers handle tasks such as image and voice recognition with the alacrity of humans."