
General Discussion

User: dano
Date: 3/11/2008 11:14 am
Views: 4391
Rating: 26
Feature request: use Ekiga for collecting/contributing speech.

--- (Edited on 3/11/2008 11:14 am [GMT-0500] by dano) ---

Ekiga could be an additional way.
User: ralfherzog
Date: 3/12/2008 12:52 am
Views: 296
Rating: 25
Using Ekiga to collect speech sounds like a very good idea. VoxForge currently offers submission via telephone (does anyone use this feature?), the speech submission application (which should support many more prompts for each language), and the upload of zipped speech files (I prefer this method). Ekiga could be an additional way.

--- (Edited on 2008-03-12 12:53 am [GMT-0500] by ralfherzog) ---

Re: Ekiga
User: kmaclean
Date: 3/12/2008 2:29 pm
Views: 233
Rating: 27

> Feature request: use Ekiga for collecting/contributing speech.

From Ekiga site:

Ekiga uses both the H.323 and SIP protocols. It supports many audio and video codecs, and is interoperable with other SIP compliant software [...]

So we should be able to use Ekiga as a SIP client front-end for the VoxForgeIVR application (Asterisk-based app) that trevarthen created. 

The main problem is that our current server resources would likely not be able to support the bandwidth required by a VoIP solution. Where the web front-end server is located, there is not enough bandwidth to support VoIP (I've tried direct connections with people, and have been dropped many times). On the VoxForge repository server (which is just a regular web-hosting service), we don't have root access, which (I think) is required for an Asterisk server. If there is a way around this, please let me know.



--- (Edited on 3/12/2008 3:29 pm [GMT-0400] by kmaclean) ---

Re: Ekiga
User: nsh
Date: 3/12/2008 3:03 pm
Views: 233
Rating: 29

> If there is a way around this, please let me know.

Of course you can build Asterisk (*) from source in a custom prefix. The only difference is that you should configure it with all the directories set explicitly:

./configure --prefix=... --sysconfdir=... --datadir=...

and probably some more options. There may be various problems with firewalls, but it should work fine, I suppose.

And SIP is a very nice idea of course.

--- (Edited on 3/12/2008 3:04 pm [GMT-0500] by nsh) ---

Re: Ekiga
User: kmaclean
Date: 3/14/2008 7:56 pm
Views: 188
Rating: 16

Hi nsh,

That seems sooo obvious now ... never even thought of doing that!

Might be a good SoC project.


--- (Edited on 3/14/2008 8:56 pm [GMT-0400] by kmaclean) ---

Re: Ekiga
User: colbec
Date: 3/15/2008 3:59 pm
Views: 433
Rating: 23
I frequently use Zoiper with an Asterisk server over a dialup connection (56K theoretical), and I know there are a lot of blips, burps, and silences which could affect the quality of the recording. VoIP uses UDP, which opens up the possibility that even over a good line there may be occasional silences in the middle of a recording. So one thing to think about is the quality of a 'patched' recording compared to one obtained under more constant conditions.
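
One rough way to check a recording for this kind of dropout is to look for runs of near-silent frames. The following is only a sketch under stated assumptions: the 20 ms frame length, the amplitude threshold, and the function name `find_dropouts` are all hypothetical choices for illustration, not part of Asterisk, Zoiper, or VoxForge. It also cannot tell a lost packet from a natural pause in speech, so flagged spans would still need a listen.

```python
import math
from typing import List, Tuple

def find_dropouts(samples: List[int], rate: int = 8000,
                  frame_ms: int = 20, threshold: int = 50,
                  min_frames: int = 2) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) spans of consecutive near-silent frames."""
    frame_len = rate * frame_ms // 1000   # samples per frame (assumed 20 ms)
    n_frames = len(samples) // frame_len  # any trailing partial frame is ignored
    spans = []
    run_start = None                      # index of first frame in current quiet run
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        quiet = max(abs(s) for s in frame) < threshold
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            if i - run_start >= min_frames:
                spans.append((run_start * frame_ms / 1000, i * frame_ms / 1000))
            run_start = None
    # Close a quiet run that lasts until the end of the recording.
    if run_start is not None and n_frames - run_start >= min_frames:
        spans.append((run_start * frame_ms / 1000, n_frames * frame_ms / 1000))
    return spans

# Synthetic test signal: one second of a 440 Hz tone at 8 kHz,
# with 60 ms (three frames) zeroed out to imitate lost packets.
tone = [int(10000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(8000)]
patched = tone[:3200] + [0] * 480 + tone[3680:]
print(find_dropouts(patched))  # [(0.4, 0.46)]
```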

--- (Edited on 3/15/2008 3:59 pm [GMT-0500] by colbec) ---

Voice sample/model via VoIP
User: freakalad
Date: 3/2/2009 7:26 pm
Views: 159
Rating: 6

I'd be happy to contribute my voice and my (willing) clients' voices to this project via VoIP. The easiest way would be to implement a plug-in on either the VoIP gateway (Asterisk, FreeSWITCH) or the client (Ekiga).

Maybe talk to the respective developers to have it implemented as a feature, since they would be able to benefit directly from the project (e.g. live voice translation, verbal commands to dial from a list or for emergencies as opposed to DDIs, access control, etc.).

A wide range of samples could be gathered from call-centre implementations.

--- (Edited on 3/2/2009 7:26 pm [GMT-0600] by Visitor) ---

Re: Voice sample/model via VoIP
User: kmaclean
Date: 3/5/2009 8:55 pm
Views: 1110
Rating: 5

Hi freakalad,

> I'd be happy to lend my & my (willing) clients voices to contributing to this project via VoIP

Our current approach is to collect high quality speech using the Java applet (or an audio editor like Audacity) and then downsample it to the target rate used by the application (telephony uses 8kHz:8bit audio, ...).
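
The downsampling step can be sketched roughly as below. This is a minimal illustration, not the tooling VoxForge actually uses: it assumes the source audio is at 48 kHz (so an integer factor of 6 gives 8 kHz), uses a plain Hamming-windowed sinc low-pass filter, and the names `lowpass_taps` and `downsample` are made up for this example. In practice a tool such as sox or Audacity would normally do this job.

```python
import math
from typing import List

def lowpass_taps(cutoff: float, num_taps: int = 101) -> List[float]:
    """Hamming-windowed sinc low-pass filter.

    `cutoff` is a fraction of the input sample rate (between 0 and 0.5).
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC

def downsample(samples: List[float], factor: int) -> List[float]:
    """Low-pass below the new Nyquist frequency, then keep every factor-th sample."""
    # The new Nyquist is 0.5 / factor of the input rate; 0.4 leaves some margin.
    taps = lowpass_taps(0.4 / factor)
    half = len(taps) // 2
    out = []
    for centre in range(0, len(samples), factor):
        acc = 0.0
        for k, tap in enumerate(taps):
            idx = centre + k - half
            if 0 <= idx < len(samples):  # treat samples outside the signal as zero
                acc += tap * samples[idx]
        out.append(acc)
    return out

# Assumed example: a quarter second of a 440 Hz tone at 48 kHz, reduced to 8 kHz.
rate_in = 48000
speech_like = [math.sin(2 * math.pi * 440 * i / rate_in) for i in range(12000)]
telephone_rate = downsample(speech_like, 6)
```

Filtering before discarding samples matters here: without the low-pass stage, anything above 4 kHz in the original would fold back into the 8 kHz signal as aliasing. Reducing the sample width to 8 bits is a separate step that this sketch does not cover.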

For certain applications, like VoIP, there would be an additional step that would likely involve passing the speech audio through the target VoIP codec, and then training acoustic models using this transformed speech audio (for more info on this approach see my post in this thread, and David Gelbart's post in the same thread).
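
To make the "pass the audio through the codec" idea concrete, here is a hedged sketch of a round trip through mu-law companding, the scheme behind 8-bit telephony audio. The choice of mu-law is an assumption made for illustration (the post does not name a codec), the continuous formula below is not bit-exact G.711 (which uses a piecewise-linear approximation), and the function names are hypothetical. Other VoIP codecs would need their real encoder and decoder.

```python
import math
from typing import List

MU = 255  # companding constant for 8-bit mu-law

def mulaw_encode(x: float) -> int:
    """Compress a sample in [-1.0, 1.0] to an 8-bit code in 0..255."""
    x = max(-1.0, min(1.0, x))  # clip out-of-range input
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1) / 2 * 255))

def mulaw_decode(code: int) -> float:
    """Expand an 8-bit code back to a sample in [-1.0, 1.0]."""
    y = code / 255 * 2 - 1
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def through_codec(samples: List[float]) -> List[float]:
    """Encode then decode, so the result carries the codec's quantisation error."""
    return [mulaw_decode(mulaw_encode(s)) for s in samples]

# The transformed audio, not the clean original, is what would be fed
# to acoustic model training in the approach described above.
clean = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
degraded = through_codec(clean)
```

The point of the round trip is that the training data ends up with the same distortion the recogniser will see at run time, while the original high quality recording stays available for other target applications.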


--- (Edited on 3/5/2009 9:55 pm [GMT-0500] by kmaclean) ---