User:
prithviraj
Date: 3/29/2018 1:17 am
Hi,
I am working on the Odia language. I followed Step 2 of the VoxForge tutorial to generate the dict file and the monophones1 file, using a lexicon file I created myself. When I run the command given in the tutorial, it does not show any error, but the content of the dict file and the monophones1 file is not in Odia.
voxforge_lexicon.txt
SENT-END [] sil
SENT-START [] sil
ଆଠ [ଆଠ] ଆ ଠ୍ ଅ
ଏକ [ଏକ] ଏ କ୍ ଅ
ଚାରି [ଚାରି] ଚ ଆ ର୍ ଇ
ଛଅ [ଛଅ] ଛ୍ ଅ ଅ
ତିନି [ତିନି] ତ୍ ଇ ନ୍ ଇ
ଦୁଇ [ଦୁଇ] ଦ୍ ଉ ଇ
ନଅ [ନଅ] ନ୍ ଅ ଅ
ପାଞ୍ଚ [ପାଞ୍ଚ] ପ୍ ଆ ଞ ଚ୍ ଅ
ଶୂନ [ଶୂନ] ଶ୍ ଊ ନ୍ ଅ
ସାତ [ସାତ] ସ୍ ଆ ତ୍ ଅ
wlist
SENT-END
SENT-START
ଆଠ
ଏକ
ଚାରି
ଛଅ
ତିନି
ଦୁଇ
ନଅ
ପାଞ୍ଚ
ଶୂନ
ସାତ
I used the command
HDMan -A -D -T 1 -m -w wlist -n monophones1 -i -l dlog dict ../lexicon/voxforge_lexicon.txt
The content of the dict file that I obtained is:
SENT-END [] sil
SENT-START [] sil
\340\254\206\340\254\240 [\340\254\206\340\254\240] \340\254\206 \340\254\240\340\255\215 \340\254\205 sp
\340\254\217\340\254\225 [\340\254\217\340\254\225] \340\254\217 \340\254\225\340\255\215 \340\254\205 sp
\340\254\232\340\254\276\340\254\260\340\254\277 [\340\254\232\340\254\276\340\254\260\340\254\277] \340\254\232 \340\254\206 \340\254\260\340\255\215 \340\254\207 sp
\340\254\233\340\254\205 [\340\254\233\340\254\205] \340\254\233\340\255\215 \340\254\205 \340\254\205 sp
\340\254\244\340\254\277\340\254\250\340\254\277 [\340\254\244\340\254\277\340\254\250\340\254\277] \340\254\244\340\255\215 \340\254\207 \340\254\250\340\255\215 \340\254\207 sp
\340\254\246\340\255\201\340\254\207 [\340\254\246\340\255\201\340\254\207] \340\254\246\340\255\215 \340\254\211 \340\254\207 sp
\340\254\250\340\254\205 [\340\254\250\340\254\205] \340\254\250\340\255\215 \340\254\205 \340\254\205 sp
\340\254\252\340\254\276\340\254\236\340\255\215\340\254\232 [\340\254\252\340\254\276\340\254\236\340\255\215\340\254\232] \340\254\252\340\255\215 \340\254\206 \340\254\236 \340\254\232\340\255\215 \340\254\205 sp
\340\254\266\340\255\202\340\254\250 [\340\254\266\340\255\202\340\254\250] \340\254\266\340\255\215 \340\254\212 \340\254\250\340\255\215 \340\254\205 sp
\340\254\270\340\254\276\340\254\244 [\340\254\270\340\254\276\340\254\244] \340\254\270\340\255\215 \340\254\206 \340\254\244\340\255\215 \340\254\205 sp
and the monophones1 file is:
sil
\340\254\206
\340\254\240\340\255\215
\340\254\205
sp
\340\254\217
\340\254\225\340\255\215
\340\254\232
\340\254\260\340\255\215
\340\254\207
\340\254\233\340\255\215
\340\254\244\340\255\215
\340\254\250\340\255\215
\340\254\246\340\255\215
\340\254\211
\340\254\252\340\255\215
\340\254\236
\340\254\232\340\255\215
\340\254\266\340\255\215
\340\254\212
\340\254\270\340\255\215
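The backslash sequences in the two listings above look like three-digit octal escapes of the UTF-8 bytes of the original words, which would mean the text is intact and only written in escaped form. Below is a minimal Python sketch for checking this, assuming that interpretation holds; the helper name `decode_octal_escapes` is hypothetical and not part of HTK.

```python
import re

def decode_octal_escapes(text: str) -> str:
    """Turn backslash-octal escapes such as \\340\\254\\217 back into UTF-8 text.

    Assumption: every escape is exactly three octal digits and stands
    for one byte of a UTF-8 encoded character.
    """
    raw = re.sub(
        rb"\\([0-3][0-7]{2})",                   # one escaped byte, \000 to \377
        lambda m: bytes([int(m.group(1), 8)]),   # octal digits -> single byte
        text.encode("utf-8"),
    )
    return raw.decode("utf-8")

# One entry copied from the dict file shown above.
line = r"\340\254\217\340\254\225 [\340\254\217\340\254\225] \340\254\217 \340\254\225\340\255\215 \340\254\205 sp"
print(decode_octal_escapes(line))  # prints: ଏକ [ଏକ] ଏ କ୍ ଅ sp
```

If the decoded line matches the corresponding lexicon entry plus the trailing `sp`, the escaped files carry the same Odia text as the input.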
The content of the dlog file is:
WARNING: no script file ../lexicon/voxforge_lexicon.txt.ded
Dictionary Usage Statistics
---------------------------
Dictionary    TotalWords WordsUsed TotalProns PronsUsed
voxforge_lex          12        12         12        12
dict                  12        12         12        12
12 words required, 0 missing
New Phone Usage Counts
---------------------
1. sil : 2
2. ଆ : 4
3. ଠ୍ : 1
4. ଅ : 9
5. sp : 10
6. ଏ : 1
7. କ୍ : 1
8. ଚ : 1
9. ର୍ : 1
10. ଇ : 4
11. ଛ୍ : 1
12. ତ୍ : 2
13. ନ୍ : 3
14. ଦ୍ : 1
15. ଉ : 1
16. ପ୍ : 1
17. ଞ : 1
18. ଚ୍ : 1
19. ଶ୍ : 1
20. ଊ : 1
21. ସ୍ : 1
Dictionary dict created
I have set up the global.ded file exactly as it is in the tutorial.
I am using Notepad++, which supports Unicode, to write these Odia words. I think the problem may be with Unicode support. Please suggest a solution.
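To narrow down whether the editor's encoding setting plays a role, it may help to confirm how the input files are stored on disk. The following is a minimal Python sketch, assuming the files are meant to be UTF-8; the function name `describe_encoding` is hypothetical, and the file names are the ones used in the HDMan command above.

```python
import codecs
from pathlib import Path

def describe_encoding(data: bytes) -> str:
    """Roughly classify the encoding of a file's raw bytes."""
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 with BOM"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not valid UTF-8"
    # Pure ASCII content also ends up here, since ASCII is a subset of UTF-8.
    return "UTF-8 without BOM"

# Report on the two input files if they exist in the working directory.
for name in ("wlist", "../lexicon/voxforge_lexicon.txt"):
    path = Path(name)
    if path.exists():
        print(name, "->", describe_encoding(path.read_bytes()))
```

A result other than "UTF-8 without BOM" would suggest re-saving the file from the editor with a different encoding before running HDMan again.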
Prithviraj