Comments

Flat
Warning: Too long input(>320000 samples)
User: Sudipto
Date: 10/1/2016 2:00 pm
Views: 3004
Rating: 1

when it is <<please speak>> I try to record "phone steve".I don't know what is happening.it just shows the warning "Too long input(>320000 samples)" in the output.Then again <<please speak>> message are shown.

 

 

STAT: jconf successfully finalized

STAT: *** loading AM00 _default

Stat: init_phmm: Reading in HMM definition

Stat: rdhmmdef: ascii format HMM definition

Stat: rdhmmdef: limit check passed

Stat: check_hmm_restriction: an HMM with several arcs from initial state found: "sp"

Stat: rdhmmdef: this HMM requires multipath handling at decoding

Stat: rdhmmdef: no <SID> embedded

Stat: rdhmmdef: assign SID by the order of appearance

Stat: init_phmm: defined HMMs:   810

Stat: init_phmm: loading ascii hmmlist

Stat: init_phmm: logical names: 24402 in HMMList

Stat: init_phmm: base phones:    41 used in logical

Stat: init_phmm: finished reading HMM definitions

STAT: m_fusion: force multipath HMM handling by user request

STAT: making pseudo bi/mono-phone for IW-triphone

Stat: hmm_lookup: 799 pseudo phones are added to logical HMM list

STAT: *** AM00 _default loaded

STAT: *** loading LM00 _default

STAT: reading [sample.dfa] and [sample.dict]...

Stat: init_voca: read 18 words

STAT: done

STAT: Gram #0 sample registered

STAT: Gram #0 sample: new grammar loaded, now mash it up for recognition

STAT: Gram #0 sample: extracting category-pair constraint for the 1st pass

STAT: Gram #0 sample: installed

STAT: Gram #0 sample: turn on active

STAT: grammar update completed

STAT: *** LM00 _default loaded

STAT: ------

STAT: All models are ready, go for final fusion

STAT: [1] create MFCC extraction instance(s)

STAT: *** create MFCC calculation modules from AM

STAT: AM 0 _default: create a new module MFCC01

STAT: 1 MFCC modules created

STAT: [2] create recognition processing instance(s) with AM and LM

STAT: composing recognizer instance SR00 _default (AM00 _default, LM00 _default)

STAT: Building HMM lexicon tree

STAT: lexicon size: 207 nodes

STAT: coordination check passed

STAT: multi-gram: beam width set to 200 (guess) by lexicon change

STAT: wchmm (re)build completed

STAT: SR00 _default composed

STAT: [3] initialize for acoustic HMM calculation

Stat: outprob_init: state-level mixture PDFs, use calc_mix()

Stat: addlog: generating addlog table (size = 1953 kB)

Stat: addlog: addlog table generated

STAT: [4] prepare MFCC storage(s)

STAT: [5] prepare for real-time decoding

STAT: All init successfully done

 

STAT: ###### initialize input device

----------------------- System Information begin ---------------------

JuliusLib rev.4.3.1 (fast)

 

Engine specification:

 -  Base setup   : fast

 -  Supported LM : DFA, N-gram, Word

 -  Extension    :

 -  Compiled by  : gcc -g -O2

 

------------------------------------------------------------

Configuration of Modules

 

 Number of defined modules: AM=1, LM=1, SR=1

 

 Acoustic Model (with input parameter spec.):

 - AM00 "_default"

hmmfilename=hmm15/hmmdefs

hmmmapfilename=tiedlist

 

 Language Model:

 - LM00 "_default"

grammar #1:

   dfa  = sample.dfa

   dict = sample.dict

 

 Recognizer:

 - SR00 "_default" (AM00, LM00)

 

------------------------------------------------------------

Speech Analysis Module(s)

 

[MFCC01]  for [AM00 _default]

 

 Acoustic analysis condition:

      parameter = MFCC_0_D_N_Z (25 dim. from 12 cepstrum + c0, abs energy supressed with CMN)

sample frequency = 16000 Hz

  sample period =  625  (1 = 100ns)

    window size =  400 samples (25.0 ms)

    frame shift =  160 samples (10.0 ms)

   pre-emphasis = 0.97

   # filterbank = 24

  cepst. lifter = 22

     raw energy = False

energy normalize = False

   delta window = 2 frames (20.0 ms) around

    hi freq cut = OFF

    lo freq cut = OFF

zero mean frame = OFF

      use power = OFF

            CVN = OFF

           VTLN = OFF

 

    spectral subtraction = off

 

 cep. mean normalization = yes, real-time MAP-CMN, updating mean with last 0.0 sec. input

  initial mean from file = N/A

   beginning data weight = 100.00

 cep. var. normalization = no

 

base setup from = Julius defaults

 

------------------------------------------------------------

Acoustic Model(s)

 

[AM00 "_default"]

 

 HMM Info:

    810 models, 126 states, 126 mpdfs, 126 Gaussians are defined

     model type = context dependency handling ON

      training parameter = MFCC_N_D_Z_0

  vector length = 25

number of stream = 1

    stream info = [0-24]

cov. matrix type = DIAGC

  duration type = NULLD

max mixture size = 1 Gaussians

     max length of model = 5 states

     logical base phones = 41

       model skip trans. = exist, require multi-path handling

      skippable models = sp (1 model(s))

 

 AM Parameters:

        Gaussian pruning = safe  (-gprune)

  top N mixtures to calc = 2 / 0  (-tmix)

    short pause HMM name = "sp" specified, "sp" applied (physical)  (-sp)

  cross-word CD on pass1 = handle by approx. (use max. prob. of same LC)

   sp transition penalty = -70.0

 

------------------------------------------------------------

Language Model(s)

 

[LM00 "_default"] type=grammar

 

 DFA grammar info:

      6 nodes, 6 arcs, 6 terminal(category) symbols

      category-pair matrix: 32 bytes (712 bytes allocated)

 

 Vocabulary Info:

        vocabulary size  = 18 words, 51 models

        average word len = 2.8 models, 8.5 states

       maximum state num = 15 nodes per word

       transparent words = not exist

       words under class = not exist

 

 Parameters:

   found sp category IDs =

 

------------------------------------------------------------

Recognizer(s)

 

[SR00 "_default"]  AM00 "_default"  +  LM00 "_default"

 

 Lexicon tree:

total node num =    207

 root node num =     18

 leaf node num =     18

 

(-penalty1) IW penalty1 = +5.0

(-penalty2) IW penalty2 = +20.0

(-cmalpha)CM alpha coef = 0.050000

 

inter-word short pause = on (append "sp" for each word tail)

 sp transition penalty = -70.0

 Search parameters: 

   multi-path handling = yes, multi-path mode enabled

(-b) trellis beam width = 200 (-1 or not specified - guessed)

(-bs)score pruning thres= disabled

(-n)search candidate num= 1

(-s)  search stack size = 500

(-m)    search overflow = after 2000 hypothesis poped

       2nd pass method = searching sentence, generating N-best

(-b2)  pass2 beam width = 200

(-lookuprange)lookup range= 5  (tm-5 <= t <tm+5)

(-sb)2nd scan beamthres = 200.0 (in logscore)

(-n)        search till = 1 candidates found

(-output)    and output = 1 candidates out of above

IWCD handling:

  1st pass: approximation (use max. prob. of same LC)

  2nd pass: loose (apply when hypo. is popped and scanned)

all possible words will be expanded in 2nd pass

build_wchmm2() used

lcdset limited by word-pair constraint

short pause segmentation = off

fall back on search fail = off, returns search failure

 

------------------------------------------------------------

Decoding algorithm:

 

1st pass input processing = real time, on-the-fly

1st pass method = 1-best approx. generating indexed trellis

output word confidence measure based on search-time scores

 

------------------------------------------------------------

FrontEnd:

 

 Input stream:

            input type = waveform

          input source = microphone

   device API          = default

         sampling freq. = 16000 Hz

        threaded A/D-in = supported, on

  zero frames stripping = on

        silence cutting = on

            level thres = 4000 / 32767

        zerocross thres = 60 / sec.

            head margin = 300 msec.

            tail margin = 400 msec.

             chunk size = 1000 samples

   long-term DC removal = off

   long-term DC removal = off

   level scaling factor = 1.00 (disabled)

     reject short input = off

     reject  long input = off

 

----------------------- System Information end -----------------------

 

Notice for feature extraction (01),

*************************************************************

* Cepstral mean normalization for real-time decoding:       *

* NOTICE: The first input may not be recognized, since      *

*         no initial mean is available on startup.          *

*************************************************************

Stat: adin_oss: device name = /dev/dsp1 (from AUDIODEV)

Stat: adin_oss: sampling rate = 16000Hz

Stat: adin_oss: going to set latency to 50 msec

Stat: adin_oss: audio I/O Latency = 32 msec (fragment size = 512 samples)

STAT: AD-in thread created

WARNING: adin_thread_process: too long input (> 320000 samples), segmented now

Warning: input buffer overflow: some input may be dropped, so disgard the input

Re: Warning: Too long input(>320000 samples)
User: colbec
Date: 10/2/2016 6:47 am
Views: 3
Rating: 0
PreviousNext