VoxForge
Hey guys!
(Great project; Cool karma etc. etc!!!)
We are a couple of guys who live in Denmark; however, our spoken English is [reasonably] correct, so naturally we want to help out in a project that benefits pretty much everyone.
So the question is: What dialect do we report in the README?!?
Keep up the good hacking!
/macavity
--
FSF Associate member number 3423.
--- (Edited on 10/11/2006 8:53 am [GMT-0500] by Visitor) ---
The real purpose of categorizing speech by dialect is so we can
create specialized Acoustic Models targeted to certain dialects or
regions. We do this to reduce the size of the Acoustic Model, and
to improve recognition accuracy because there is less variation in the
sounds that the Acoustic Model was trained to recognize.
In your case, I think we need to make a distinction between 'dialect' and 'accent'. From Wikipedia, a dialect is a variety of language differing in vocabulary and grammar as well as pronunciation. Dialects are usually spoken by a group united by geography or class.
An accent may be any pronunciation that deviates from a standard; the standard language and pronunciation are defined by a group. Groups sharing an identifiable accent may be defined by any of a wide variety of common traits. An accent may be associated with the region in which its speakers reside (a geographical accent), the socio-economic status of its speakers, their ethnicity, their caste or social class, their first language (relative to the person hearing the accent - i.e. you may think I have an accent, and I may think you have an accent ...), and so on.
Correct me if I am wrong, but in your case, I think you speak the European English dialect (the standard language in this case) with a Danish accent. So when we are looking at creating an Acoustic Model for English speakers in Europe, we would include your voice because it is representative of the European English dialect. This will likely change as we get feedback on the performance of such an Acoustic Model.
I will update the docs to reflect this.
Hope that clarifies things, and thanks for the input.
Ken
--- (Edited on 10/11/2006 11:22 am [GMT-0400] by kmaclean) ---
Thank you very much! That was exactly the info I needed :-)
I have just made arrangements to use a poor-man's recording facility with real microphones over this weekend.
Denmark is a very small country (5,500,000 people), so English is pretty much the standard for communication with the outside world. Therefore I reckon that, as a long-term goal, it would be very helpful to have a dedicated "Dane speaking English" Acoustic Model. If the data for such a model can also be used for the "Any European speaking English" model, that is just a win-win situation for everyone. I have plenty of male speakers ready, but completely lack female ones... So I guess I need to come up with some kind of new strategy ;-)
Keep the compilers running!
/macavity
--
FSF Associate member number 3423.
--- (Edited on 10/12/2006 6:29 am [GMT-0500] by macavity) ---