English Speech Files

ralfherzog-20070831-en1
User: ralfherzog
Date: 8/31/2007 10:46 am
Views: 3944
Rating: 15

Speaker Characteristics:

Gender: male;
Age range: adult;
Pronunciation dialect: General American English.

Recording Information:

Microphone make: Sennheiser PC 131;
Microphone type: noise canceling headset;
Audio card make: Andrea USB adapter;
Audio card type: USB;
Audio Recording Software: Audacity 1.2.6;
O/S: Windows XP Professional.

File Info:

File type: FLAC;
Sampling rate: 48kHz;
Sample rate format: 16bit;
Number of channels: 1;
Audio Processing: no

en1-001 This is a different approach.
en1-002 Now all I need is a little help.  He is very good.
en1-003 What do you want to do first?
en1-004 We haven't had a vacation in a while.
en1-005 He doesn't know where we are going.
en1-006 I know you are here. What exactly do you want?
en1-007 We have been a little concerned about you.
en1-008 You really don't have to worry about her.
en1-009 She is faster than she looks.
en1-010 It is impossible to move that fast.
en1-011 He is at home.  This is what I realize.
en1-012 She needs you no longer.  I need her.
en1-013 I know it is going to be all right.
en1-014 I want to talk to you in person.
en1-015 I am not sure about that.  Get me out of here.
en1-016 Do you think he found him? I wanted to do that.
en1-017 They are fighting against each other.
en1-018 Are you here or not? Get rid of him.
en1-019 There has got to be something.
en1-020 I had no choice.  That was just a joke.
en1-021 I have to get out of here.  I am not slow.
en1-022 I don't want to hurt you.  I can fix that.
en1-023 I am going to ask about that.
en1-024 You have to be quiet now.
en1-025 You left me all alone.
en1-026 I thought I would be able to breathe again.
en1-027 It could have been too late.
en1-028 I don't remember.  We have been through this.
en1-029 It is assumed that these values have been ordered.
en1-030 The lowest and the highest quarter of values have been removed.
en1-031 It is a specified example of a truncated mean.
en1-032 The numbers are interpreted according to their product.
en1-033 Those numbers are defined
en1-034 in relation to the distance per unit of time.
en1-035 A lot of distributions are best described by their mean.
en1-036 The mean is not to be confused with the median.
en1-037 Those means aren't of much use in statistics.
en1-038 Means are often used in analysis and geometry.
en1-039 The mean describes the central location of the data.
en1-040 The mean is the sum of observations.
en1-041 That has to be divided by the number of observations.
en1-042 Not every probability distribution has a defined mean.
en1-043 The arithmetic mean is distinguished from the geometric mean.
en1-044 The checksum should be added.
en1-045 The output will be saved in compressed format.
en1-046 The signal will be using coefficient of 10.
en1-047 The variable should be set to true.
en1-048 It is unnecessary to create coded data.
en1-049 Those are the relevant tools.
en1-050 What is the corresponding output file?
en1-051 Those are the first few lines.
en1-052 We have to avoid several thousand arguments.
en1-053 They have several options.
en1-054 A different procedure is used to code the speech data.
en1-055 Multiple pronunciations are possible.
en1-056 What is the length of the vector?
en1-057 This is really disgusting.  Where are we?
en1-058 Can you be more specific?  I am sure about that.
en1-059 The wheel is already spinning.  Let him play.
en1-060 He is using him to destroy people's lives.
en1-061 I wasn't away. You needed something.
en1-062 You don't know the hell I have been through.
en1-063 That would have been really embarrassing.
en1-064 That is what I have been saying.
en1-065 I would know I was right.  She is going to be fine.
en1-066 I am really worried about her.  Let's find out.
en1-067 You are feeling really bad about this.
en1-068 We have to get a good coverage of phonemes.
en1-069 I will see what I can do.  Thank you very much.
en1-070 It will change your life.  Thank you.
en1-071 I don't believe that.  Congratulations.  I don't understand.
en1-072 Where are you sitting?  At your desk?
en1-073 Tomorrow might not be the best time.
en1-074 I am sorry to disappoint you.  I need it to work.
en1-075 I think now is a good time.
en1-076 Can we just do this later?  Later would be much better.
en1-077 I have to tell you that you were brilliant.
en1-078 I don't want to disturb you.
en1-079 Where are you going?  I am going to Paris.
en1-080 This is a highly effective form of exercise.
en1-081 I wish I could remember.  Me too.
en1-082 It was pretty special.
en1-083 You really don't have to be afraid.
en1-084 Shouldn't someone answer that?
en1-085 I was just so happy to see her.  Anything is possible.
en1-086 That could have been a disaster.
en1-087 He thinks she might have some kind of amnesia.
en1-088 You have been gone for a while, but you are safe now.
en1-089 What do you remember?
en1-090 Do we know who we are?
en1-091 So we know each other?  Really well.
en1-092 Who are you people?  You are back.
en1-093 This is a very new adventure.
en1-094 Is this really happening?
en1-095 That is not true.  I want to go home now.
en1-096 Is this your family?
en1-097 You guys are just in time.  I will jump.
en1-098 My father is trying to get us home.
en1-099 I believe you are looking for this.  Are you OK?
en1-100 Are you in trouble? Who are you people?
en1-101 The people you love are part of my destiny.
en1-102 I don't get why or how I was able to fight.
en1-103 I was just thinking it doesn't make sense.
en1-104 Are you playing games with me?  I play to win.
en1-105 What do you say?
en1-106 Take them out in the desert.  We have a winner.
en1-107 What am I supposed to do?  How do I find out about that?
en1-108 I am so relieved. You don't get it back.
en1-109 You better leave the engine running.
en1-110 What happens to them?
en1-111 Nothing happens to them or will.
en1-112 You really believe I would do that?

Copyright (C) 2007  Ralf Herzog

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, see <http://www.gnu.org/licenses/>.

--- (Edited on 8/31/2007 10:46 am [GMT-0500] by ralfherzog) ---

ralfherzog-20070831-en1.zip

Notice: many prompts in "English Speech Files" were adapted from the prompt files contained in the CMU_ARCTIC speech synthesis database, which were in turn derived from out-of-copyright texts from Project Gutenberg, by the FestVox project at the Language Technologies Institute at Carnegie Mellon University.

Re: ralfherzog-20070831-en1
User: kmaclean
Date: 8/31/2007 9:27 pm
Views: 169
Rating: 27

Hi Ralf,

Thanks for this and your last submission!

In addition, thanks for the new set of prompts ... are you an English major? :)  You could be starting a new form of "speech recognition training" poetry!

One thing, please convert all numbers to their word equivalents in your prompts (e.g. "10" should be "ten"). It's easier (for now ...) to process.  I do have some code to convert these automatically for audio book processing, but have not incorporated it yet into the regular prompt submissions (never needed it thus far ...).
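For anyone who wants to do this conversion before submitting, here is a minimal sketch in Python. It is not the conversion code mentioned above; the function names (number_to_words, spell_out_numbers, convert_prompt_line) are made up for illustration, and it assumes each prompt line is "ID text" with plain non-negative integers below one million.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out 0 <= n < 1,000,000 in English words (no hyphens, no 'and')."""
    if n < 0 or n >= 1_000_000:
        raise ValueError("only 0..999999 is handled in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = "" if rest == 0 else " " + number_to_words(rest)
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = "" if rest == 0 else " " + number_to_words(rest)
    return number_to_words(thousands) + " thousand" + tail


def spell_out_numbers(text: str) -> str:
    """Replace every standalone run of digits in text with its word form."""
    return re.sub(r"\b\d+\b", lambda m: number_to_words(int(m.group())), text)


def convert_prompt_line(line: str) -> str:
    """Convert the text of an 'ID text' prompt line, leaving the ID untouched."""
    prompt_id, _, text = line.partition(" ")
    return f"{prompt_id} {spell_out_numbers(text)}"


print(convert_prompt_line("en1-046 The signal will be using coefficient of 10."))
```

The prompt ID is split off first so that digits in names like en1-046 are left alone. Decimals, ordinals, years and currency are not handled here and would still need to be rewritten by hand.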

This one will be processed this evening. 

Ken 

--- (Edited on 8/31/2007 10:27 pm [GMT-0400] by kmaclean) ---



Re: ralfherzog-20070831-en1
User: ralfherzog
Date: 9/1/2007 3:39 am
Views: 195
Rating: 19
Hello Ken,

I would like to submit more "speech recognition training poetry."  I prefer creating my own prompts.

OK, I will convert numbers into word equivalents.

What is better?  If I submitted audio books?  Or should I continue to submit my own texts here in this section of VoxForge.org?

Thanks for processing my submitted speech. 8)

Greetings, Ralf

--- (Edited on 9/1/2007 3:39 am [GMT-0500] by ralfherzog) ---



Re: ralfherzog-20070831-en1
User: kmaclean
Date: 9/2/2007 12:56 pm
Views: 159
Rating: 23

Hi Ralf,

> What is better?  If I submitted audio books?  Or should I continue to submit my own texts here in this section of VoxForge.org?

For now, the best for the VoxForge project is the submission of pre-segmented speech audio files with corresponding prompts.  Audio books are great, but they need to be segmented to be processed by the acoustic model training scripts.  This process is currently only semi-automated.
So yes, you can "tap" your creative side and submit your own prompts. Or, you can also take an audiobook text (from Project Gutenberg perhaps ...), segment it into 10-15 word prompts, and record these.
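The segmenting step could be sketched roughly like this in Python. This is only an illustration, not the VoxForge segmentation scripts; the function names and the "en1" ID prefix are assumptions, and the rule used (break at the end of a sentence once a prompt has at least 10 words, or force a break at 15) is just one simple choice.

```python
def segment_into_prompts(text, min_words=10, max_words=15):
    """Split text into prompts of roughly min_words..max_words words each.

    A prompt is closed at a sentence end once it holds at least min_words
    words, and is always closed when it reaches max_words. The final prompt
    may be shorter than min_words.
    """
    prompts = []
    current = []
    for word in text.split():
        current.append(word)
        ends_sentence = word[-1] in ".?!"
        if len(current) >= max_words or (len(current) >= min_words and ends_sentence):
            prompts.append(" ".join(current))
            current = []
    if current:
        prompts.append(" ".join(current))
    return prompts


def label_prompts(prompts, prefix="en1"):
    """Prefix each prompt with an ID such as en1-001 (prefix is hypothetical)."""
    return [f"{prefix}-{i:03d} {p}" for i, p in enumerate(prompts, start=1)]


if __name__ == "__main__":
    sample = (
        "Audio books are great, but they need to be segmented to be processed "
        "by the acoustic model training scripts. This process is currently "
        "only semi-automated."
    )
    for line in label_prompts(segment_into_prompts(sample)):
        print(line)
```

A forced break at 15 words can land in the middle of a phrase, so it is worth reading the prompts through and adjusting the awkward ones by hand before recording.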

thanks, 

Ken 

--- (Edited on 9/2/2007 1:56 pm [GMT-0400] by kmaclean) ---



submitting segmented speech
User: ralfherzog
Date: 9/3/2007 9:17 am
Views: 447
Rating: 25
OK, I will continue to submit segmented speech.  Greetings, Ralf

--- (Edited on 9/3/2007 9:17 am [GMT-0500] by ralfherzog) ---


