VOICES


Voice Files Provided

A number of Voice files are provided in the speak-data/voices directory. You can select one of these with the -v <voice name> parameter to the speak command.

default
This voice is used if none is specified in the speak command.

en
is the standard default English voice.

en-b
en-c
en-d

are different English voices. These can be considered caricatures of various British accents: Northern, Upper-Class, West Midlands respectively.

en-f, en-fb, en-fc, en-fd
Female versions of the above. Not genuine female voices, just variants with different pitch and formant parameters.

en1
A variation of "en" with Echo.
Adding a little Echo can give a clearer or more interesting sound.

en2 to en8
Variations of "en" with different tonal quality.

esperanto
An illustration of a different language. Esperanto has simple pronunciation rules, and a different stress pattern from English. I don't know how Esperanto is supposed to sound, other than what I've read in an introduction. There are some Esperanto texts on www.gutenberg.org. Text can be either in the Latin3 alphabet, or else use the Latin1 convention of using two-letter combinations (cx, gx, etc).

german
A very cursory attempt at German. This is not a serious implementation: it has only very simple and inadequate pronunciation rules, which give many wrong pronunciations and wrong stress placement. I have only a small knowledge of German, from school many years ago, but I can at least tell that the prosody needs considerable adjustment. Also very noticeable is that the post-vocalic R sound (an R which is not followed by a vowel, which doesn't exist in my own speech) sounds very odd, so that will need some work.

Contents of Voice Files

(subject to change)

language  <name>
This parameter should appear first. It selects default behaviour and characteristics for the language, and sets default values for "phonemes", "dictionary" and other parameters. If omitted, "english" is assumed.

phonemes  <name>
Specifies which set of phonemes to use from those contained in the phontab, phonindex, and phondata data files.

Different voices of the same language can use different phoneme sets, to give different accents. A default "phonemes" value is set by the "language" parameter.

dictionary  <name>
Specifies which pair of dictionary files to use. eg. "english" indicates that speak-data/english_1 and speak-data/english_2 should be used to translate from words to phonemes. This parameter is usually not needed as it is set by default by the "language" parameter.

pitch  <base> <range>
Two integer values. The first gives a base pitch to the voice (value in Hz). The second controls the range of pitches used by the voice. Setting it equal to the base pitch will give a monotone.
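
As a sketch of how this might look in a voice file (the numbers are illustrative only and are not taken from any of the supplied voices):
      pitch  82 117     // base pitch 82 Hz, with a wider range value
      pitch  82 82      // range equal to the base pitch: a monotone
Only one "pitch" line would normally appear in a voice file; the two lines are shown together for comparison.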

formant  <number> <frequency> <strength> <width>
Systematically adjusts the frequency, strength, and width of the resonance peaks of the voice. Values are percentages of the default values. Changing these affects the tone/quality of the voice.
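
A hypothetical example, assuming that <number> identifies which resonance peak is adjusted and that 100 leaves a value at its default (which follows from the values being percentages):
      formant  1  110  100  100     // peak 1: frequency at 110% of default,
                                    // strength and width unchanged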

echo  <delay> <amplitude>
Parameter 1 gives the delay in ms (0 to 250 ms).
Parameter 2 gives the echo amplitude (0 to 100).
Adding some echo can give a clearer or more interesting sound, especially when listening through a domestic stereo sound system, rather than small computer speakers.
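
For instance, a line such as the following (values chosen only for illustration, within the ranges given above) would ask for an echo delayed by 100 ms at amplitude 40:
      echo  100 40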

flutter  <value>
Default value: 2.
Adds pitch fluctuations to give a wavering or older-sounding voice. A large value (eg. 20) makes the voice sound "croaky".

roughness  <value>
Default value: 2. Range: 0 to 7.
Reduces the amplitude of alternate waveform cycles to make the voice sound creaky.

words  <value>
Indicates to what extent words are separated from each other, or run together. A default value is set by "language".

replace  <phoneme> <replacement phoneme>
Replace a phoneme by another whenever it occurs. eg.
      replace  h  NULL      // drops h's
      replace  V  U         // replaces vowel in 'strut' by that in 'foot'
                            // as occurs in northern British English
The phoneme mnemonics are listed in phonemes.html

replaceWE  <phoneme> <replacement phoneme>
Replace a phoneme only when it occurs at the end of a word, eg.
      replaceWE  N  n         // change 'fishing' to 'fishin' etc.

stressLength  <8 values>
Eight integer parameters. These control the relative lengths of stressed and unstressed syllables.

intonation  <param1> <param2>
(for further development)
param1 is currently not used
param2 can be 0, 1, 2 (default=0). It affects how often the pitch rises and falls during a clause.
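
Putting several of these parameters together, a complete voice file might look something like the sketch below. This is a hypothetical file, not one of the supplied voices. The numeric values are illustrative guesses, and it assumes that "//" comments are accepted on any line, as in the "replace" examples above.
      language  english     // should appear first
      pitch  90 130         // illustrative values
      echo  60 30           // 60 ms delay, amplitude 30
      flutter  4
      roughness  3
      replace  V  U         // vowel in 'strut' becomes that in 'foot'
      replaceWE  N  n       // 'fishing' becomes 'fishin'
Parameters which are left out, such as "phonemes" and "dictionary" here, take the defaults set by "language".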