FreeCLAS - "Free Commons of Linguistically Annotated Speech".
User: kmaclean
Date: 9/4/2008 12:33 pm

From a comp.speech.research post:

FreeCLAS (http://www.ihear.com/FreeCLAS) is a new project to build a
data base of high-quality speech data. "High quality" means annotated
data that have been validated by humans. Building such a data base has
been expensive because it requires substantial investment of people's
attention. As a result, high-quality speech data is not generally
available.

FreeCLAS uses a wiki. This is a call for people to join the wiki to
build it. Embedded in the wiki is a tool, shva, which opens from your
browser to let you hear, view and annotate any utterance in FreeCLAS.
At this point, there is a seed data base of a small collection of
utterances annotated in en-US and IPA.

shva and other related software downloadable from FreeCLAS are all
Free Software, licensed under GPL or other compatible licenses. The
speech data is under the Creative Commons attribute-share-alike
license.

Their focus seems to be more on collecting linguistic annotations of speech by getting users to provide and validate time stamps of utterances.  This is a little different from what VoxForge is doing.  We are basically trying to collect speech prompts (15-20 words long), with little regard for accurate timings, since the HTK/Sphinx acoustic model training process can work these out automatically (with short utterances).

What is really interesting (from a VoxForge standpoint at least) is their ALingA (GPLv3) annotation Java applet.  I can't get the app to run on my PC (I have a 64-bit machine, which they don't provide support for... yet).  However, from the screenshots, it looks very impressive for a Java applet.  They use the JavaFX libraries, which are Sun's answer to creating rich Internet applications (RIAs), i.e. Sun's approach to creating a Flash-like environment.  It might be a useful starting point for a speech submission validator for VoxForge (but just to allow other users to validate that an utterance matches the prompt line).

Ken

--- (Edited on 9/4/2008 1:33 pm [GMT-0400] by kmaclean) ---

Re: FreeCLAS - "Free Commons of Linguistically Annotated Speech".
User: Visitor
Date: 9/4/2008 4:21 pm
Cool!

--- (Edited on 9/4/2008 4:21 pm [GMT-0500] by Visitor) ---
