VoxForge
Sphinx4 can use a sphinx3 acoustic model. As far as I understand, in order to use one it is necessary to repack the acoustic model with a new directory layout. That is what build.xml does: it moves the files to their new places, compiles the necessary accessor classes, and creates a jar file containing the result.
If someone wants to use sphinx4 as an ASR engine, they will have to convert AcousticModels.tgz to the sphinx4 jar format somehow.
I have both: the ant script build.xml and the result of the conversion, msu_ru_nsh_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar.
I think these can be placed in one of the following ways:
> it is necessary to repack the acoustic model with a new directory layout
It isn't. You can just point to the model data in the config file:
<component name="model" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.TiedStateAcousticModel">
    <property name="loader" value="sphinx3Loader"/>
    <property name="unitManager" value="unitManager"/>
</component>
<component name="sphinx3Loader" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader">
    <property name="logMath" value="logMath"/>
    <property name="unitManager" value="unitManager"/>
    <property name="dataLocation" value="file:model"/>
    <property name="modelDefinition" value="file:model/mdef"/>
</component>
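Since this config reads the model straight from a directory, it can help to check that the directory holds the expected files before starting the recognizer. Below is a minimal sketch in plain Java (no sphinx4 classes involved). The class name ModelDirCheck is hypothetical, and the file list is an assumption based on the usual SphinxTrain output; only mdef is actually named in the config above, so adjust the list to your model.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of sphinx4.
final class ModelDirCheck {
    // Assumed file names (typical SphinxTrain output); only "mdef" appears in the config.
    static final String[] EXPECTED = {
        "mdef", "means", "variances", "mixture_weights", "transition_matrices"
    };

    // Returns the names from EXPECTED that are not regular files inside modelDir.
    static List<String> missingFiles(Path modelDir) {
        List<String> missing = new ArrayList<>();
        for (String name : EXPECTED) {
            if (!Files.isRegularFile(modelDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Defaults to "model", the directory that file:model points at
        // when resolved against the working directory.
        Path modelDir = Paths.get(args.length > 0 ? args[0] : "model");
        List<String> missing = missingFiles(modelDir);
        if (missing.isEmpty()) {
            System.out.println("All expected files found in " + modelDir);
        } else {
            System.out.println("Missing from " + modelDir + ": " + missing);
        }
    }
}
```

Run it from the same working directory as the recognizer, so that the relative path resolves the same way as the file:model URL in the config.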
> I think these can be placed in one of the following ways
Instead, the sphinx4 documentation should be updated.
Well, that is really surprising. I found nothing in the sphinx4 documentation describing this method.
However, I guess this method requires the user to unpack the acoustic model to some folder and use the files directly. Some may find the suggested sphinx4 jar format more convenient, because it offers better modularity and maintainability.
The official CMUSphinx documentation shows how to do this: How to Use Models from SphinxTrain in Sphinx-4