VoxForge
My sample.grammar is:
S : NS_B SENT NS_E
SENT: COM VAR
My sample.voca is:
% NS_B
<s> sil
% NS_E
</s> sil
% COM
INPUT in pu t
OUTPUT au t pu t
% VAR
A a
B b
Running the script gives:
$ ./HTK_Compile_Model.sh
init
==============================================================
Step 1 - Task Grammar
==============================================================
already completed manually
Step 2 - Pronunciation Dictionnary
==============================================================
sorting:./interim_files/wlist to:./interim_files/wlist1
Found voxforge_lexicon
***Please review the following HDMan output***:
WARNING: no script file ./input_files/voxforge_lexicon.ded
Missing Words
-------------
*/sample1
5.41.15.1507;}viewkind4uc1pardf0fs20
A
ABALON
ABDOMINALS
ABOLISHpar
ABOUNDING
ABOUT
ACCOUNT
ACHIEVE
ACTUAL
ACUPUNCTURE
ADVENTUREpar
AEROBIC
AFGHAN
AGAINST
AGATHApar
AHEAD
AIRPLANE
ALGORITHM
ALLENTOWNpar
ALRIGHT
ALTHOUGH
ALTOGETHER
AMNESIA
AND
ANOTHERpar
ARTHRITIS
ASIA
AVERSION
Apar
B
BATTLE
BEATLE
BEIGE
BEIJINGpar
BELOW
BELT
BEND
BITTEN
BLATANT
BODY
BOOKENDS
BRIGHT
BRIGHTEN
BRITAINpar
BROOKHAVEN
BROUHAHA
BULLHEADSpar
BUSBOYS
Bpar
CALL
CHAMPLAIN
CHOICE
COILS
COINpar
COLLECTION
COLORATION
COMBINATION
COMMERCIALpar
COMPLAIN
COPYRIGHT
COUPLE
CRADLE
CRITERIA
CRUMBLEpar
CUBA
CUBE
CUMULATIVEpar
CURING
CURLING
CYCLINGpar
CYNTHIA
Courier
DANFORTH
DASHBOARD
DATABASEpar
DEPTHpar
DIGEST
DIGITAL
DILIGENTpar
ELAINE
EXPLAINpar
GATEWAY
GATORADE
GAZEBO
HELLO
HELMET
HELP
HELPLESS
HELPpar
HOOD
INPUT
INPUTpar
KENNEL
KENNETH
KENYA
LEISURE
LITTLE
MEASURE
MEASUREpar
METALpar
MIDDLE
Msftedit
NEEDLE
New;}}
OUTPUT
OUTPUTpar
PHONE
POODLE
RIGHTpar
SADDLEpar
SENT-END
SENT-START
STEVE
WEEKENDpar
YOUNG
YOUNGpar
par
Dictionary Usage Statistics
---------------------------
Dictionary TotalWords WordsUsed TotalProns PronsUsed
voxforge_lex 0 0 0 0
dict 0 0 0 0
119 words required, 119 missing
Dictionary ./interim_files/dict created
Step 3 - Recording the Data
==============================================================
already completed manually
Step 4 - Creating Transcription Files
==============================================================
writing to mlf file ./interim_files/words.mlf
writing to ./interim_files/words.mlf file done
ERROR [+1232] NumParts: Cannot find word Courier in dictionary
FATAL ERROR - Terminating program C:\my\cygwin\HTK\htk-3.3-windows-binary\htk\HLEd.exe
ERROR [+1232] NumParts: Cannot find word Courier in dictionary
FATAL ERROR - Terminating program C:\my\cygwin\HTK\htk-3.3-windows-binary\htk\HLEd.exe
HellRazorr@HellRazorrr ~/new/voxforge/auto/scripts
$
If you add words to the grammar, make sure they are in the pronunciation dictionary (see Step 2: voxforge_lexicon; if they are not there, add them) and that they are all in caps: "Courier" is not in the dictionary, but "COURIER" is.
What is in your prompts file? Why is there "5.41.15.1507;}viewkind4uc1pardf0fs20" in your prompts file?
Basically the script is stopping at the point where it creates a wlist file based on your prompts file and then looks up each word from the wlist file in the VoxForge_lexicon pronunciation dictionary.
Since the HDMan command is saying that *all* the words in your prompts file are missing from the pronunciation dictionary, it seems like you are not using the VoxForge_lexicon file.
It might be best for you to follow the Howto example exactly and get that to work, and then deviate from it in small increments, to catch any errors.
Ken
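The lookup described above (build a word list from the prompts file, then check each word against the pronunciation dictionary) can be illustrated with a short Python sketch. This is not the logic of HTK_Compile_Model.sh or HDMan; it only assumes that each prompts line starts with a recording ID followed by words, and that each lexicon line starts with the headword. The function names and the sample pronunciations are made up for illustration.

```python
def words_in_prompts(prompt_lines):
    """Collect the unique words from prompts lines, skipping the leading recording ID."""
    words = set()
    for line in prompt_lines:
        fields = line.split()
        words.update(fields[1:])  # fields[0] is the ID, e.g. */sample1
    return sorted(words)


def lexicon_headwords(lexicon_lines):
    """Return the first whitespace-separated field of every non-empty lexicon line."""
    return {line.split()[0] for line in lexicon_lines if line.strip()}


def missing_words(prompt_lines, lexicon_lines):
    """List prompt words with no exact (case-sensitive) match in the lexicon."""
    lexicon = lexicon_headwords(lexicon_lines)
    return [w for w in words_in_prompts(prompt_lines) if w not in lexicon]


# Hypothetical data: the pronunciations below are placeholders, not real lexicon entries.
prompts = ["*/sample1 INPUT OUTPUT INPUT OUTPUT", "*/sample2 INPUT Courier"]
lexicon = [
    "COURIER [COURIER] k er iy er",
    "INPUT [INPUT] ih n p uh t",
    "OUTPUT [OUTPUT] aw t p uh t",
]
print(missing_words(prompts, lexicon))
```

Because the comparison is case-sensitive, a mixed-case token such as "Courier" is reported as missing even when "COURIER" is present, which matches the behaviour described in the reply.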
First of all, thank you so much for replying again.
I forgot to include the prompts file in the query. Here it is:
*/sample1 INPUT OUTPUT INPUT OUTPUT
*/sample2 INPUT A B A B A B OUTPUT A B
*/sample3 INPUT A A A A B B B B B
*/sample4 OUTPUT OUTPUT B B B B A A
*/sample5 INPUT OUTPUT OUTPUT INPUT INPUT INPUT
*/sample6 PHONE STEVE YOUNG CALL STEVE YOUNG
*/sample7 PHONE STEVE CALL STEVE PHONE YOUNG CALL YOUNG
*/sample8 PHONE PHONE STEVE STEVE CALL CALL YOUNG YOUNG
*/sample9 MEASURE LEISURE AND LEISURE MEASURE
*/sample10 COMPLAIN CHAMPLAIN AIRPLANE ELAINE EXPLAIN
*/sample11 BOOKENDS KENNEL KENNETH KENYA WEEKEND
*/sample12 BELT BELOW BEND AEROBIC DASHBOARD DATABASE
*/sample13 GATEWAY GATORADE GAZEBO AFGHAN AGAINST AGATHA
*/sample14 ABALON ABDOMINALS BODY ABOLISH
*/sample15 ABOUNDING ABOUT ACCOUNT ALLENTOWN
*/sample16 ACHIEVE ACTUAL ACUPUNCTURE ADVENTURE
*/sample17 ALGORITHM ALTHOUGH ALTOGETHER ANOTHER
*/sample18 BATTLE BEATLE LITTLE METAL
*/sample19 BITTEN BLATANT BRIGHTEN BRITAIN
*/sample20 BROOKHAVEN HOOD BROUHAHA BULLHEADS
*/sample21 BUSBOYS CHOICE COILS COIN
*/sample22 COLLECTION COLORATION COMBINATION COMMERCIAL
*/sample23 MIDDLE NEEDLE POODLE SADDLE
*/sample24 ALRIGHT ARTHRITIS BRIGHT COPYRIGHT CRITERIA RIGHT
*/sample25 COUPLE CRADLE CRUMBLE
*/sample26 CUBA CUBE CUMULATIVE
*/sample27 CURING CURLING CYCLING
*/sample28 CYNTHIA DANFORTH DEPTH
*/sample29 DIGEST DIGITAL DILIGENT
*/sample30 AMNESIA ASIA AVERSION BEIGE BEIJING
*/sample31 HELP HELLO HELMET HELPLESS AHEAD HELP
Secondly, regarding the word 'Courier': I have not used it anywhere in my sample.voca, so I don't know why it appears when I run HTK_Compile_Model.sh.
Please advise.
Thanks again, really appreciated.
Yes, I am using Cygwin on Windows to run the script, and I am following this link:
http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/how-to/script
Please guide me.
I suppose the steps remain essentially the same on both platforms. As for the font: when I create codetrain.scp, it automatically comes up in the Courier font.
The rest of the files are not recognized by Windows, but codetrain.scp gets the .txt icon!
You want: http://www.voxforge.org/home/dev/acousticmodels/windows/create/htkjulius/how-to.
Even though you are running Cygwin, Windows does some weird things with line endings that can cause problems.
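One way to see whether line endings are involved is to count them directly. The Python sketch below is only an illustration (the function names are hypothetical, and it works on raw bytes rather than on any particular HTK file): it counts Windows-style CRLF endings versus Unix-style LF endings and converts CRLF to LF.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF (Windows) and bare LF (Unix) line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # every CRLF also contains one LF
    return {"crlf": crlf, "lf": lf}


def to_unix_line_endings(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical prompts content saved with Windows line endings.
sample = b"*/sample1 INPUT OUTPUT\r\n*/sample2 INPUT A B\r\n"
print(count_line_endings(sample))
print(count_line_endings(to_unix_line_endings(sample)))
```

To check a real file, read it in binary mode, for example `open(path, "rb").read()`, so that Python does not translate the line endings before they are counted.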
With respect to Windows, what should I change? I tried working with HTK_Compile_Model.sh from the Windows link, but it was not a success: still the same error with the word Courier.
Is it that the scripts only run correctly on Ubuntu?
Should I use dos2unix.exe *.*? Will it help? I read about it in the same How-To section; it removes some lexical errors.
>Is it like the scripts only run in ubuntu correctly ??
It should run under Windows, but I only tested with Win XP, and that was long ago... have not tried it with Vista.
>should I use dos2unix.exe
It seems like you are having character encoding problems in your prompts file. Did you end up trying dos2unix or unix2dos?
Ken
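The stray tokens in the Missing Words list above (Msftedit, viewkind4uc1pardf0fs20, Courier, New;}} and the trailing "par" on many words) resemble Rich Text Format control words with their backslashes stripped. One possible explanation, offered here as an assumption rather than a confirmed diagnosis, is that a file was saved from a word processor as RTF instead of plain text. The Python sketch below checks raw bytes for such markers; the function names and the sample content are hypothetical.

```python
# Control words and generator name that commonly appear in RTF files.
RTF_MARKERS = (b"\\rtf", b"\\par", b"\\viewkind", b"Msftedit")


def looks_like_rtf(data: bytes) -> bool:
    """True if the content starts with the RTF header '{\\rtf'."""
    return data.lstrip().startswith(b"{\\rtf")


def rtf_markers_found(data: bytes) -> list:
    """Return the known RTF markers that occur anywhere in the content."""
    return [m.decode("ascii") for m in RTF_MARKERS if m in data]


# Hypothetical example of a prompts line wrapped in RTF markup.
rtf_sample = (
    b"{\\rtf1\\ansi{\\fonttbl{\\f0 Courier New;}}\n"
    b"{\\*\\generator Msftedit 5.41.15.1507;}\\viewkind4\\uc1\\pard\\f0\\fs20 "
    b"*/sample1 INPUT OUTPUT\\par\n}"
)
plain_sample = b"*/sample1 INPUT OUTPUT\n"

print(looks_like_rtf(rtf_sample), rtf_markers_found(rtf_sample))
print(looks_like_rtf(plain_sample), rtf_markers_found(plain_sample))
```

If a check like this flags a file, re-saving it as plain text in a plain-text editor would be the thing to try before running dos2unix, since converting line endings alone does not remove RTF markup.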