How to fix with the unicode text while finding the sample.dict file
User: prithviraj
Date: 3/13/2018 4:23 am
Views: 2073
Rating: 0

Hi,

I am doing this for Odia language recognition. In Step 1, I took the sample.grammar file and the sample.voca file (which contains some Odia words with their phones) and ran mkdfa.jl on them. It generated the corresponding sample.dfa, sample.term and sample.dict files, but I cannot find anything readable in sample.dict (there are only some "?" symbols). I think it might be a problem with the Unicode system.

So, how can I fix this problem?

Thanks in advance.

Prithviraj

 

Re: How to fix with the unicode text while finding the sample.dict file
User: kmaclean
Date: 3/13/2018 8:23 am
Views: 0
Rating: 0

>I think it might be the problem of unicode system.

Julius was originally written for Japanese speech recognition, so it should work...
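
One way to narrow this down is to check whether the "?" symbols are really stored in sample.dict or only appear that way in the viewer. The following is a minimal sketch in Python, assuming the file is expected to be UTF-8 (the thread does not say which encoding is in use). The function name `diagnose_dict_bytes` is made up for this illustration and is not part of Julius or mkdfa.

```python
from pathlib import Path


def diagnose_dict_bytes(data: bytes) -> str:
    """Classify the raw bytes of a generated .dict file (hypothetical helper)."""
    try:
        # Assumption: the file is meant to be UTF-8.
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return "not valid UTF-8"
    if any(ord(ch) > 127 for ch in text):
        # Non-ASCII characters (such as Odia letters) are present in the file.
        return "contains non-ASCII text"
    if "?" in text:
        # Only ASCII remains, including real 0x3F question marks.
        return "literal question marks only"
    return "plain ASCII"


if __name__ == "__main__":
    path = Path("sample.dict")  # file name taken from the thread
    if path.exists():
        print(diagnose_dict_bytes(path.read_bytes()))
```

If this reports non-ASCII text, the Odia words probably survived and the "?" symbols are likely a display issue in the editor or terminal. If it reports literal question marks only, the characters were most likely lost when the file was written.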

Try building the dictionary with the Julius version of the same program (written in Perl): mkdfa.pl

 

Re: How to fix with the unicode text while finding the sample.dict file
User: prithviraj
Date: 3/14/2018 6:06 am
Views: 2
Rating: 0

It (mkdfa.pl) works fine for the sample.grammar and sample.voca given in the tutorial, but it does not work for our modified sample.grammar and sample.voca files.

sample.grammar

S : NS_B SENT NS_E

SENT: DIGIT

sample.voca
% NS_B
<s>        sil
% NS_E
</s>        sil
% DIGIT
 ଆଠ         ଆ ଠ୍ ଅ
 ଏକ        ଏ କ୍ ଅ
 ଚାରି         ଚ ଆ ର୍ ଇ
 ଛଅ        ଛ୍ ଅ ଅ
 ତିନି       ତ୍ ଇ ନ୍ ଇ
 ଦୁଇ       ଦ୍ ଉ ଇ
 ନଅ       ନ୍ ଅ ଅ
 ପାଞ୍ଚ      ପ୍ ଆ ଞ ଚ୍ ଅ
 ଶୂନ      ଶ୍ ଊ ନ୍ ଅ
 ସାତ      ସ୍ ଆ ତ୍ ଅ
sample1.grammar has 2 rules
sample1.voca    has 2 categories and 13 words
---
Warning: dfa_minimize not found in the same place as mkdfa.p
Warning: no minimization performed
Now parsing grammar file
Now modifying grammar to minimize states[-1]
Now parsing vocabulary file
Now making nondeterministic finite automaton[1/1]
Error:       undefined class "NS_B"
---
no .dfa or .dict file generated
How can I fix this problem?
Thanks in advance.
Prithviraj
                           
Re: How to fix with the unicode text while finding the sample.dict file
User: kmaclean
Date: 3/14/2018 7:23 am
Views: 1
Rating: 0

>How to fix this problem.

You might try compiling Julius from source to see whether the mkfa program (written in C and called by mkdfa.pl) in the gramtools folder can handle your system's default character encoding.

Re: How to fix with the unicode text while finding the sample.dict file
User: prithviraj
Date: 3/15/2018 12:42 am
Views: 1
Rating: 0

Thanks Kmaclean.

I edited the grammar and voca files in Notepad++, which supports Unicode. With those files, I am able to generate the corresponding dict, dfa and term files using mkdfa.pl/mkdfa.jl.
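
For anyone who prefers to do this from a script instead of an editor, here is a minimal sketch in Python. It assumes the goal is plain UTF-8 without a byte-order mark (BOM); the thread does not state which encoding Notepad++ actually saved the files in, so treat that as a guess. One hint in that direction: the log earlier in the thread reports 2 categories and 13 words, while the listed sample.voca has 3 category lines and 12 word lines, which would be consistent with the first "% NS_B" line not being read as a category (for example because of invisible bytes in front of it). The helper name `to_utf8_without_bom` and the ".utf8" output suffix are made up for this illustration.

```python
import codecs
from pathlib import Path


def to_utf8_without_bom(data: bytes, source_encoding: str = "utf-8") -> bytes:
    """Decode raw bytes, drop a leading BOM if present, and re-encode as UTF-8."""
    if data.startswith(codecs.BOM_UTF8):
        # Strip the three UTF-8 BOM bytes, then decode the rest.
        text = data[len(codecs.BOM_UTF8):].decode("utf-8")
    elif data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order and drops it.
        # UTF-32 input is not handled in this sketch.
        text = data.decode("utf-16")
    else:
        # Assumption: without a BOM, the caller knows the source encoding.
        text = data.decode(source_encoding)
    return text.encode("utf-8")


if __name__ == "__main__":
    for name in ("sample.grammar", "sample.voca"):  # file names from the thread
        path = Path(name)
        if path.exists():
            # Write a converted copy next to the original instead of overwriting it.
            Path(name + ".utf8").write_bytes(to_utf8_without_bom(path.read_bytes()))
```

The converted copies can then be renamed and passed to mkdfa.pl or mkdfa.jl in place of the originals.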

Prithviraj
