ralfherzog-20070815_vf14.zip

English Speech Files


User: ralfherzog
Date: 8/14/2007 11:16 pm

Views: 4180
Rating: 20

Hello everyone! I checked my wav audio recordings by transcribing them into the DragonPad of the speech recognition software Dragon Naturally Speaking 9 Preferred. Most of the words were recognized correctly.

Speaker Characteristics:

Gender: male;
Age range: adult;
Pronunciation dialect: General American.

Recording Information:

Microphone make: Sennheiser PC131;
Microphone type: noise canceling headset;
Audio card make: OnBoard - Mainboard: Asrock 939Dual-SATA2;
Audio card type: OnBoard - Realtek 850;
Audio Recording Software: Audacity 1.2.6;
O/S: Windows XP Professional.

File Info:

File type: wav;
Sampling rate: 48kHz;
Sample rate format: 16bit;
Number of channels: 1;
Audio Processing: n
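
A minimal sketch of how the File Info above could be checked against a recording, using Python's standard `wave` module. The function name `wav_matches` and its default values are assumptions chosen to mirror the listed format (48 kHz, 16-bit, mono); this is not part of the original submission.

```python
import sys
import wave


def wav_matches(path, rate=48000, sample_bytes=2, channels=1):
    """Return True if the WAV header matches the expected format.

    Defaults mirror the File Info listed above: 48 kHz, 16 bit (2 bytes), mono.
    """
    with wave.open(path, "rb") as w:
        return (
            w.getframerate() == rate
            and w.getsampwidth() == sample_bytes
            and w.getnchannels() == channels
        )


if __name__ == "__main__":
    # Check every file named on the command line.
    for name in sys.argv[1:]:
        print(name, "OK" if wav_matches(name) else "format mismatch")
```

Run it with the unpacked wav files as arguments; it only reads the header and does not judge audio quality.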

vf14-01 Without them he could not run his empire
vf14-02 For such countries nothing remained but reorganization
vf14-03 They could not continue their method of producing surpluses
vf14-04 At once would be instituted a dozen cooperative commonwealth states
vf14-05 The Oligarchy wanted violence, and it set its agents provocateurs to work
vf14-06 Nowhere did the raw earth appear
vf14-07 The lush vegetation of that sheltered spot make a natural shield
vf14-08 Men who endure it, call it living death
vf14-09 As I say, he had tapped the message very rapidly
vf14-10 Ask him, I laughed, then turned to Pasquini
vf14-11 In what bucolic school of fence he had been taught was beyond imagining
vf14-12 May drought destroy your crops
vf14-13 Dunham, can your boy go along with Jesse
vf14-14 But Johannes could, and did
vf14-15 A new preacher and a new doctrine come to Jerusalem
vf14-16 He would destroy all things that are fixed
vf14-17 He was an enthusiast and a desert dweller
vf14-18 What Pascal glimpsed with the vision of a seer, I have lived
vf14-19 I should like to engage just for one whole life in that
vf14-20 Yea, so are all the lesser animals of today clean
vf14-21 The Warden with a quart of champagne
vf14-22 Without a doubt, some of them have dinner engagements
vf14-23 I had been born with no organic, chemical predisposition toward alcohol
vf14-24 He may anticipate the day of his death
vf14-25 The Italian rancho was a bachelor establishment
vf14-26 I lost my balance and pitched head foremost into the ooze
vf14-27 Men like Joe Goose dated existence from drunk to drunk
vf14-28 Also, churches and preachers I had never known
vf14-29 Do you know that we weigh every pound of coal we burn
vf14-30 This also became part of the daily schedule
vf14-31 All an appearance can know is mirage
vf14-32 Yet he dreams he is immortal, I argue feebly
vf14-33 I am writing these lines in Honolulu, Hawaii
vf14-34 Jack London, Waikiki Beach, Honolulu, Oahu
vf14-35 Jerry was so secure in his nook that he did not roll away
vf14-36 Why, he's bought forty pounds of goods from you already
vf14-37 The last refugee had passed
vf14-38 And the foundation stone of service, in his case, was obedience
vf14-39 Peace be unto you and grace before the Lord
vf14-40 His mouth opened; words shaped vainly on his lips

Copyright (C) 2007 Ralf Herzog

These files are free software; you can redistribute them and/or
modify them under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

These files are distributed in the hope that they will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

--- (Edited on 8/14/2007 11:16 pm [GMT-0500] by ralfherzog) ---


Notice: many prompts in "English Speech Files" were adapted from the prompt files contained in the CMU_ARCTIC speech synthesis database, which were in turn derived from out-of-copyright texts from Project Gutenberg, by the FestVox project at the Language Technologies Institute at Carnegie Mellon University.

Re: ralfherzog-20070815_vf14.zip

User: kmaclean
Date: 8/16/2007 6:47 pm

Views: 197
Rating: 19

Hi Ralf,

You're really dedicated! Thanks again.

Very interesting approach to validate your audio. Did you notice anything particular about the words that were not recognized properly by DNS? Do you think it was related to the audio (too loud, too soft, or too much background noise, etc.), or were the words a bit odd/archaic and not something that comes up too often in current conversation?

thanks,

Ken

--- (Edited on 8/16/2007 7:47 pm [GMT-0400] by kmaclean) ---

Quality of onboard sound, noise of CPU cooler fan

User: ralfherzog
Date: 8/17/2007 4:33 pm

Views: 120
Rating: 15

Hello Ken!

Not all words are recognized correctly by DNS 9 Preferred. Why not? Let me explain. I have a lot of experience with DNS 9. I have dictated for many hours in both German and English, let's say more than 50 hours in each language. I would say that about 95% of the words are recognized correctly. It is not 99% as advertised; it is less.

When I checked the audio files "vf14" that I recorded for voxforge.org, I found words that are not part of my active vocabulary. That means that DNS 9 doesn't expect me to say those words and misrecognizes them.

Some words that are not part of my active vocabulary: "Pasquini, Dunham, Goose, Oahu". Those words weren't recognized correctly by DNS 9, which is normal, because I have never used them before.

I think that I need a better sound card to get better results. If I had a better sound card, the recognition rate should improve from 95% to 96%. That sounds like little, but it is a lot.
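
To show why one percentage point is a lot, here is a small sketch that expresses the change as a relative reduction in errors. The function name is hypothetical, and the 95% and 96% figures are the estimates from this post, not measurements.

```python
def relative_error_reduction(acc_before, acc_after):
    """Fraction of recognition errors removed when accuracy improves."""
    err_before = 1.0 - acc_before  # e.g. 5 errors per 100 words
    err_after = 1.0 - acc_after    # e.g. 4 errors per 100 words
    return (err_before - err_after) / err_before


# Going from 95% to 96% accuracy removes one error in five.
print(f"{relative_error_reduction(0.95, 0.96):.0%}")
```

Seen this way, the step from 95% to 96% means about a fifth fewer words to correct by hand.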

My computer isn't very loud (it could be a little quieter, and my next computer will be). There is nearly zero background noise except the noise of the CPU cooler fan (the fan of the power supply is very quiet).

The biggest problem at the moment for me is the onboard sound card. This is the weakest part of my system. The weakest part is not the headset, and it is not the background noise. The second problem that I have to solve is the noise of the CPU cooler fan.

So, I have two problems:
1. Quality of the onboard soundcard.
2. Noise of the CPU cooler fan.

I can't reach a recognition rate of 100%. But I do know that my personal speech model of DNS 9 is able to transcribe my submitted audio files into written language with a recognition rate of about 95%.
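
A recognition rate like this can be estimated by comparing each prompt with the text the recognizer produced. The sketch below counts word-level edits (substitutions, insertions, deletions) with a standard edit-distance table. The function names are hypothetical, and the transcript in the example is invented for illustration; it is not actual DNS 9 output.

```python
def word_errors(reference, hypothesis):
    """Return (edit_count, reference_word_count) at the word level."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] = edits needed to turn the reference words seen so far into hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # reference word missing from hypothesis
                cur[j - 1] + 1,          # extra word in hypothesis
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = cur
    return prev[-1], len(ref)


def word_accuracy(reference, hypothesis):
    """Word accuracy as 1 minus errors divided by reference length."""
    errors, total = word_errors(reference, hypothesis)
    return 1.0 - errors / total


# Invented transcript, for illustration only.
print(word_accuracy("the last refugee had passed", "the last refuge had past"))
```

Punctuation is not stripped here, so prompts and transcripts should be normalized the same way before comparing.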

I think that for the development of an open source speech recognition software project, it would be a good decision to buy a new sound card. My current onboard soundcard is OK for work with DNS 9. But I am sure that it would be better if I submitted speech recorded with a PCI or USB soundcard. Not a cheap one, though; it should be a better one. I have a cheap PCI soundcard and a cheap USB soundcard, but neither of them is better than my onboard soundcard. So for the moment I have to stick with my onboard soundcard until I have a better PCI or USB soundcard.

Greetings, Ralf

--- (Edited on 8/17/2007 4:33 pm [GMT-0500] by ralfherzog) ---

--- (Edited on 8/17/2007 4:36 pm [GMT-0500] by ralfherzog) ---

Re: Quality of onboard sound, noise of CPU cooler fan

User: kmaclean
Date: 8/18/2007 2:03 pm

Views: 481
Rating: 23

Hi Ralf,

Thanks for the reply.

One recommendation ... if you are going to purchase a new sound card, make sure you can return it if it doesn't perform as well as expected.

I have always assumed that onboard sound processors tend not to be as good as separate cards, simply because they pick up interference from all the other electronics jammed nearby. A PCI card (or USB mic) isolates the audio electronics from that interference, so you tend to get better playback and recording. By the same reasoning, even a cheap PCI sound card should do better than your onboard sound... but I guess it depends on how "cheap" the PCI sound card is.

Ken

--- (Edited on 8/18/2007 3:03 pm [GMT-0400] by kmaclean) ---
