			
			   DECtalk USB Commands


Access Solutions
4536 Edison Ave.
Sacramento, Ca. 95821
Phone: (916) 481-3559
Fax: (916) 482-2250
E-Mail: support@axsol.com






		      TABLE of Contents
------------------------------------------------------------------------------
	Chapter 1       -  Overview
	
	Chapter 2       -  DECtalk In-Line Commands    
		In-Line Commands: Introduction      2-1
		Comma Pause [:comma]    2-2
		Design Voice [:dv]      2-3
		Index Mark [:index mark]        2-4
		Mode [:mode]    2-5
		Name [:name]    2-6
		Period Pause [:period]  2-7
		Phoneme Interpretation [:phoneme]       2-8
		Pitch [:pitch]  2-9
		Pronounce [:pronounce]  2-10
		Punctuation [:punct]    2-11
		Rate Selection [:rate]  2-12
		Say [:say]      2-13
		Sync [:sync]    2-14
		Tone [:tone]    2-15
		Volume [:vol]        2-16

	Chapter 3       -  Using In-Line Commands       
		Introduction                            3-1
		Changing Rhythm, Stress, and Intonation 3-2
		Optimizing the Quality of Spoken Text   3-3
		Index Marks for Speech Status   3-4
		Speaking Rate   3-5
		Adjusting Period and Comma Pause Durations      3-6
		Text-Tuning Example     3-7
		Original Version        3-8
		Revised Version 3-9
		Avoiding Common Errors  3-10

	Chapter 4       -  DECtalk TTS Reference Tables    
		Phonemic Symbols Listed By Language     4-1
		Stress and Syntactic Symbols    4-2
		Phonemes Listed in Unicode Sequence     4-3
		Pitch and Duration of Tones     4-4
		Homographs      4-5

	Chapter 5       -  Customizing a DECtalk Voice
		Design Voice [:dv]      5-1
		Definitions of DECtalk Voices  5-2
		Changing Gender and Head Size   5-3
		Sex, sx 5-4
		Head Size, hs   5-5
		Higher Formants, f4, f5, b4, and b5     5-6
		Changing Voice Quality  5-7
		Breathiness, br 5-8
		Lax Breathiness, lx     5-9
		Smoothness, sm  5-10
		Richness, ri    5-11
		Nopen Fixed, nf 5-12
		Laryngealization, la    5-13
		Changing Pitch and Intonation   5-14
		Baseline Fall, bf       5-15
		Hat Rise, hr    5-16
		Stress Rise, sr 5-17
		Assertiveness, as       5-18
		Quickness, qu   5-19
		Average Pitch, ap, and Pitch Range, pr  5-20
		Changing Relative Gains and Avoiding Overloads  5-21
		Loudness, g5    5-22
		Sound Source Gains, gv, gh, gf, and gn  5-23
		Cascade Vocal Tract Gains, g1, g2, g3, and g4   5-24
		Saving Changes as Val's Voice   5-25
		Save, save      5-26
		Summary of Design Voice Options 5-27

	Chapter 6       -  Preprocessor Rules for Parsing       6-1
		Punctuation Parsing Rules       6-2
		Interpreting Punctuation Marks as Words 6-3
		Interpreting Punctuation Marks as Punctuation   6-4
		General Parsing Rules   6-5
		German  6-6
		Spanish (Castilian and Latin American)  6-7
		English (UK)    6-8
		English (US, UK)        6-9
	
	Glossary


Table 2-1 DECtalk In-Line Commands   
Table 2-2 DECtalk Interpretation of Special Characters 
Table 4-1  Phonemic Symbols - U.S. English     
Table 4-2  Phonemic Symbols - U.K. English      
Table 4-3  Phonemic Symbols - Castilian Spanish 
Table 4-4  Phonemic Symbols - Latin American Spanish   
Table 4-5  Phonemic Symbols - German    
Table 4-6  Phonemic Symbols - French    
Table 4-7 Stress Symbols        
Table 4-8 Syntactic Symbols     
Table 4-9 U.S. English Phonemes in Unicode Sequence     
Table 4-10 Phoneme Syntax for Singing   
Table 4-11 Tone Table   
Table 4-12 Homograph Phonetics - (A)    
Table 4-13 Homograph Phonetics - (B-C)  
Table 4-14 Homograph Phonetics - (D-G)  
Table 4-15 Homograph Phonetics - (I-L)  
Table 4-16 Homograph Phonetics - (M-P)  
Table 4-17 Homograph Phonetics - (R)    
Table 4-18 Homograph Phonetics - (S-W)  
Table 5-1 [:dv] Command Options 
Table 5-2 Speaker Definitions for All DECtalk Software Voices   
Table 5-3 Head Size and Shape Options   
Table 5-4 Voice Quality Options 
Table 5-5 Fundamental Frequency Contour Options 
Table 5-6 Internal Resonator Options    


				  CONVENTIONS

The following conventions are used in this guide:

Convention: XX and YY 
Meaning: In DECtalk in-line command syntax, XX and YY indicate options and 
	 parameters. When more than one choice of options or parameters is 
	 allowed, the symbol XXn or YYn with n replaced by a numeral 
	 indicates each option or parameter in the symbolic 
	 representations, such as [:phoneme XX1 XX2 YY]. Note that the 
	 number of characters in the symbolic representation does NOT 
	 represent the number of characters allowed in the actual option 
	 or parameter name.

Convention: DD and DDn
Meaning: In DECtalk Software in-line command syntax, DD indicates a decimal 
	 (base 10) value. When more than one decimal value is allowed, the 
	 symbol DDn with n replaced by a numeral represents each allowed 
	 value, such as [:volume XX DD1 DD2]. Note that the number of 
	 characters in the symbolic representation does NOT represent the 
	 number of characters allowed in the actual decimal value.



			   Chapter 1  -  Overview

     The DECtalk USB is a serial text-to-speech synthesizer that 
     may be connected to a computer or other equipment via a standard USB 
     or RS232 serial port.  It can speak in six different languages, is 
     fully software configurable and updatable, is lightweight and 
     portable, and delivers high-quality speech.  Unlike most serial 
     synthesizers, the DECtalk USB takes full advantage of the high-speed 
     transfer rates of the USB bus, so start and stop speech commands 
     sent to the synthesizer appear to be instantaneous.

     As an added bonus, the DECtalk USB comes equipped with a standard,
     battery or externally powered, RS232 serial port for use with older 
     computers or equipment that does not offer USB support.  The on-board 
     RS232 serial port is fully backwards compatible with the older DECtalk 
     Express.  In other words, the DECtalk USB may be used as a direct 
     replacement for the DECtalk Express.

     The DECtalk USB has 9 predefined voices, multiple volume levels,
     and software adjustable speed, pitch, treble, and bass, and is fully 
     configurable to meet your listening needs.  In addition,
     parameters such as articulation, punctuation filtering, and 
     intonation are also adjustable.  It is important to note that these 
     settings may be limited by certain software packages such as screen 
     access programs.

     All data sent to the DECtalk USB is first passed through an embedded  
     command interpreter before reaching its internal text-to-speech 
     engine.  The internal text-to-speech engine of the DECtalk USB
     is based on DECtalk text-to-speech (TTS) synthesis technology from 
     Fonix Corporation, Inc. and is the primary focus of this manual.

     The purpose of the embedded command interpreter is to provide  
     additional control and functionality to the DECtalk TTS engine.  
     Commands that affect the command interpreter are referred to as
     "Extended Commands".  Features such as pausing speech output,
     changing languages, and loading a user dictionary are all
     handled via the extended commands.  For more information, view
     the DECtalk_USB_Extended_Commands.txt file.
     
     The DECtalk TTS engine converts standard ASCII or Unicode
     text into highly intelligible and natural sounding speech.  Command
     strings that pertain only to the DECtalk TTS engine are referred to as 
     "DECtalk commands".  DECtalk command strings always begin with a '['
     and end with a ']'.  Parameters such as pitch, rate, intonation, 
     and voice are changed via DECtalk commands.  
		  
		  

		  Chapter 2  -  DECtalk In-Line Commands


2.1

Introduction -

In this documentation, in-line commands (also called DECtalk 
commands) are referred to simply as commands.  You can use these commands 
to perform simple operations, such as changing the speaking rate 
or speaking voice while DECtalk is speaking. Commands are inserted directly 
into the ASCII text string that is sent to the DECtalk USB 
and ultimately processed by the DECtalk TTS engine.
Table 2-1 lists the DECtalk in-line commands and their associated functions.
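
As a minimal sketch of how an application might assemble such command 
strings before sending them, the Python helper below builds the bracketed 
form [:name arguments]. The function name dectalk_command is hypothetical 
and is not part of any DECtalk software; it only formats text and does not 
communicate with the synthesizer.

```python
def dectalk_command(name, *args):
    """Return an in-line command string such as [:rate 400].

    Hypothetical helper: it only formats text and never talks to the device.
    """
    parts = [":" + name] + [str(a) for a in args]
    return "[" + " ".join(parts) + "]"


# In-line commands are simply embedded in the text that is to be spoken.
text = dectalk_command("rate", 400) + " This sentence is spoken quickly."
```

The resulting string would then be written to the USB or serial port by 
whatever means the application already uses.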

With phoneme interpretation, it is possible to control intonation and stress 
and to create special effects, such as singing. These symbols and special 
effects can be added into the ASCII text stream. See the 
description of the Phoneme Interpretation command for more information.
When you use several commands together, they may interact with each other 
and affect the output. If incorrect syntax is used in a command, the 
right bracket ( ] ) is ignored, because it might be considered part of 
the illegal string. To avoid this situation, insert an extra right bracket ( ] ) 
in the command and use the Error command to enable the speaking of errors.

Note:
     Unique abbreviations of command names and option names work reliably.  
     In addition to the commands fully described in this chapter, DECtalk 
     has a Design Voice command that allows you to modify the 
     characteristics of a voice. For complete information on how to use the 
     Design Voice command to change a voice, see Chapter 5. 
   
Table 2-1 DECtalk In-Line Commands
--------------------------------------------------------------------
Command: Comma Pause
Syntax: [:comma DD] or [:cp DD]
Function: Inserts a comma pause into spoken text
Range: -40 to 30000  (time in ms)
Default: 160  (time in ms)

Command: Design Voice
Syntax: [:dv XX YY]
Function: Customizes a DECtalk Software voice by selecting and setting 
	  speaker-definition options

Command: Error 
Syntax: [:error XX]
Function: Sets the error mode for a module

Command: Index Mark 
Syntax: [:index mark DD]
Function: Inserts marks, which are recognized by the application, into text
Range: 0 - 99

Command: Index Reply 
Syntax: [:index reply DD] or [:i r DD]
Function: Inserts marks, which are recognized by the application, 
	  into the text stream and are automatically returned when spoken.
Range: 0 - 99

Command: Mode
Syntax: [:mode XX YY]
Function: Allows words and symbols to be interpreted for special use.

Command: Name
Syntax: [:name XX] or [:nXX]
Function: Selects the name of the DECtalk voice.

Command: Period Pause 
Syntax: [:period DD] or [:pp DD]
Function: Inserts a pause equivalent to a period in a sentence 
	  into spoken text.
Range: -380 to 30000 (time in ms)
Default: 640 (time in ms)

Command: Phoneme Interpretation
Syntax: [:phoneme XX1 XX2 YY]
Function: Allows everything within brackets to be interpreted as phonemic text.

Command: Pitch
Syntax: [:pitch DD]
Function: Raises by the value specified the frequency of uppercase 
	  letters spoken in typing mode.
Default: 35 (frequency in hertz)

Command: Pronounce
Syntax: [:pronounce XX]
Function: Speaks alternate, primary, or proper noun pronunciation of a word.

Command: Punctuation
Syntax: [:punct XX]
Function: Turns punctuation on and off.

Command: Rate Selection
Syntax: [:rate DD]
Function: Selects speed at which text is spoken.
Range: 75 - 600 (words per minute)
Default: 200 (words per minute)

Command: Say 
Syntax: [:say XX]
Function: Allows DECtalk to speak clauses, words, or letters as they are queued.

Command: Sync
Syntax: [:sync]
Function: Synchronizes activity within DECtalk TTS engine.

Command: Tone
Syntax: [:tone DD, DD]
Function: Creates tones of a specified length and frequency.

Command: Volume
Syntax: [:vol att DD] or [:volume att DD]
Function: Sets the volume.
Range: 0 - 140
Default: 100

Note:
     Commands are not synchronous unless otherwise stated. To make a command 
     synchronous, use the [:sync] command. See the 
     Sync command for more information.


2.2
 
Command: Comma Pause [:comma]

The Comma Pause command increases or decreases the length of the comma pause 
from the current value by the delta value specified, in milliseconds. 
The [:cp 0] command resets the comma pause to its default 
state (approximately 160 ms). Comma pauses can be increased 
by up to 30,000 ms (30000) and decreased by up to 40 ms (-40). Values 
outside the legal range are set to the nearest legal value. This command 
is asynchronous.

SYNTAX: [:comma DD] 
ABBREVIATION: [:comm DD]
ALTERNATE COMMAND: [:cp DD] and [:cp 0]
OPTIONS: none
PARAMETERS: Pause time in milliseconds.
DEFAULT: 160 ms

EXAMPLES: [:comma 250]
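
The range handling described above can be sketched in Python as follows. 
This is an illustration only: the function name comma_pause_command is 
hypothetical, and the clamping simply mirrors on the application side 
what the engine is described as doing with out-of-range values.

```python
COMMA_MIN_MS = -40      # largest decrease described in this section
COMMA_MAX_MS = 30000    # largest increase described in this section


def comma_pause_command(delta_ms):
    """Build a [:comma DD] command, clamping DD to the legal range."""
    clamped = max(COMMA_MIN_MS, min(COMMA_MAX_MS, int(delta_ms)))
    return f"[:comma {clamped}]"
```

For instance, comma_pause_command(250) produces the example shown above, 
while a request of -500 is reduced to [:comma -40].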
 

2.3

Command: Design Voice [:dv]

The Design Voice command customizes a DECtalk voice by selecting 
and setting speaker-definition options. This command is asynchronous. 
DECtalk voices provide an adequate selection for most applications. 
However, if you have a special application requiring a monotone or 
unusual voice, you can use the Design Voice command to modify any 
DECtalk voice. The speaker-definition options and parameters can 
be entered as a string or one at a time.
The Design Voice command options and parameters are documented and 
explained in Chapter 5.


2.4

Command: Index Mark [:index mark]

Index Mark commands report the progress of the text as it is spoken. 
Index marks are position markers; they do not modify 
heuristics or word pronunciations in any way. The index mark sequence inserts 
a flag into the text stream. When the DECtalk TTS engine 
encounters an Index Mark command, a value between 0 and 99 is returned via the 
selected serial port.  Index marks cannot be put in the middle of a word. This 
command is synchronous.  For more information on using index marks, refer 
to Index Marks for Speech Status. 

SYNTAX: [:index mark DD]
ABBREVIATION: [:inde DD]
ALTERNATE COMMAND: none
OPTIONS: none
PARAMETERS: Numeric index mark value.
DEFAULT: none

EXAMPLES: [:index mark 01]
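
One possible way to use index marks is to place one in front of each 
sentence so that the application can tell how far speech has progressed. 
The Python sketch below does this; the function name add_index_marks is 
hypothetical, the two-digit form follows the example above, and reading 
the returned values from the serial port is left to the application.

```python
def add_index_marks(sentences):
    """Prefix each sentence with an index mark (00, 01, ...).

    Hypothetical helper. Index values must stay within 0-99, and marks are
    placed between sentences, never in the middle of a word.
    """
    if len(sentences) > 100:
        raise ValueError("only index values 0-99 are available")
    marked = [f"[:index mark {i:02d}] {s}" for i, s in enumerate(sentences)]
    return " ".join(marked)
```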
 

2.5

Command: Mode [:mode]

The Mode command changes the mode for all text processed after this command.  It 
remains in effect until the next Mode command is encountered. This 
is an asynchronous command. Refer to the description of the Sync command 
for information on how to make this command synchronous.

SYNTAX: [:mode XX YY]
ABBREVIATION: None
ALTERNATE COMMAND: None

OPTIONS:

     math - Change interpretation of selected symbols.

     europe - Select European cardinal pronunciation.

     spell - Spell all words

     name - Pronounce all uppercase words as proper nouns. 
	    (see also [:pronounce name] command)

PARAMETERS: 
	    on - Turns on the specified mode option.
	    off - Turns off the specified mode option.
	    set - Turns on the specified mode option while turning off 
		  all other mode options.

DEFAULT: All of the mode options are turned off

EXAMPLE: [:mode spell on]

Europe Mode Example:
     When Mode is set to Europe, [:mode europe on], a comma (,) is the 
     separator between the integer and fraction part of a number. A period (.) 
     is the separator between 3-digit blocks.
     1.255 (United States) = 1,255 (Europe) 
     125,873 (United States) = 125.873 (Europe)
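
     If an application holds numbers in United States notation and intends 
     to speak them with Europe mode on, it could swap the two separators 
     before sending the text. The Python sketch below shows one way to do 
     this; the function name us_to_european is hypothetical.

```python
def us_to_european(number_text):
    """Swap ',' and '.' in a numeric string, e.g. '125,873' -> '125.873'."""
    return number_text.translate(str.maketrans(",.", ".,"))
```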

Math Mode Example:
     When Mode is set to Math, [:mode math on], special symbols and characters 
     are pronounced with mathematical meanings. Specifically, the characters 
     in Table 2-2 are treated differently: 


Table 2-2 DECtalk Interpretation of Special Characters
--------------------------------------------------------------------
Symbol: +
Name: plus
DECtalk Says: plus (no change from normal speech)

Symbol: -
Name: hyphen
DECtalk Says: minus

Symbol: *
Name: asterisk
DECtalk Says: multiplied by

Symbol: /
Name: slash
DECtalk Says: divided by

Symbol: ^
Name: circumflex
DECtalk Says: to the power of

Symbol: <
Name: less than
DECtalk Says: less than

Symbol: >
Name: greater than
DECtalk Says: greater than

Symbol: =
Name: equal sign
DECtalk Says: equals

Symbol: %
Name: percent sign
DECtalk Says: percent

Symbol: .
Name: period
DECtalk Says: decimal point
--------------------------------------------------------------------


Name Mode Example:
     When Mode is set to Name, [:mode name on], uppercase words that occur 
     in locations other than the beginning of a sentence are 
     interpreted as special cases and pronounced as proper names.

Note: 
     Do not enable the [:mode name] command except when pronouncing lists of 
     names. This command interprets any uppercase word as a name. When 
     finished, make sure that this mode is set to off. For the occasional 
     use of this utility, use the [:pronounce name] command.


2.6

Command: Name [:name]

The Name command allows the current speaking voice to be changed to 
one of ten built-in DECtalk voices. XX represents the speaker name 
or letter variable for each voice. The letter variable is 
the first letter of the speaker name. This command is synchronous.

SYNTAX: [:name XX] 
ABBREVIATION: none
ALTERNATE COMMAND: [:nXX]

OPTIONS:

	Speaker Variable Description
	----------------------------
	PAUL    p       Default male voice
	HARRY   h       Full male voice
	FRANK   f       Aged male voice
	DENNIS  d       Male voice
	BETTY   b       Full female voice
	URSULA  u       Aged female voice
	WENDY   w       Whispering female voice
	RITA    r       Female voice
	KIT     k       Child's voice
	VAL     v       Val's voice


PARAMETERS: none
DEFAULT: PAUL

EXAMPLES: [:name KIT] or [:nk]
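
The relationship between the speaker name and its letter variable can be 
sketched in Python as follows. The function name name_command is 
hypothetical; the speaker list is taken from the table above.

```python
SPEAKERS = ["PAUL", "HARRY", "FRANK", "DENNIS", "BETTY",
            "URSULA", "WENDY", "RITA", "KIT", "VAL"]


def name_command(speaker, short=True):
    """Return [:nX] (letter variable) or [:name XX] for a built-in voice."""
    speaker = speaker.upper()
    if speaker not in SPEAKERS:
        raise ValueError(f"unknown speaker: {speaker}")
    if short:
        # The letter variable is the first letter of the speaker name.
        return f"[:n{speaker[0].lower()}]"
    return f"[:name {speaker}]"
```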

Notes:
     A user can change any of the voice characteristics of the current speaker 
     by using the Design Voice [:dv] command. These changes are active only 
     while the current speaker remains current. To save the voice changes, 
     use the save option of the Design Voice command, which saves the changes 
     as the voice of Val. For information on the individual characteristics of 
     a speaker or details on how to change a voice using the Design Voice 
     command, see Chapter 5.
	
	The following is a list of languages and their associated
	speaker names.

	Language                Speaker Name
	------------------------------------
	ENGLISH
				Paul
				Harry
				Frank
				Dennis
				Betty
				Ursula
				Wendy
				Rita
				Kit

	SPANISH
				Pablo
				Humberto
				Francisco
				Domingo
				Berta
				Arsula
				Wendy
				Rita
				Juanito

	GERMAN
				Paul
				Hans
				Frank
				Dieter
				Beate
				Ursula
				Wendy
				Rita
				Karsten

	FRENCH
				Oliver
				Michel
				François
				Joël
				Marjolaine
				Angèle
				Nadia
				Jacqueline
				Sébastien
	-----------------------------------


2.7

Command: Period Pause [:period]

The Period Pause command increases or decreases the length of the period 
pause from the current value by the delta value specified, 
in milliseconds. The [:pp 0] command resets the period pause to its 
default state (approximately 640 ms). Period pauses can be increased 
by up to 30,000 ms (30000) and decreased by up to 380 ms (-380). Values 
outside the legal range are set to the nearest legal value. This command is 
asynchronous.

SYNTAX: [:period DD] 
ABBREVIATION: [:peri DD]
ALTERNATE COMMAND: [:pp DD] and [:pp 0] 
OPTIONS: none
PARAMETERS: Pause time in milliseconds.
DEFAULT: 640 ms

EXAMPLE: [:period 250]


2.8
       
Command: Phoneme Interpretation [:phoneme]

When phoneme interpretation is set, the Phoneme Interpretation command allows 
everything within brackets to be interpreted as phonemic text. All 
phoneme interpretation of text can be silenced by using the 
[:phoneme silent on] command. By default, the text is spoken without 
phoneme interpretation. This command is asynchronous.

When you phonemicize text, put valid phoneme strings in brackets. 
A list of valid phonemic symbols can be found in Table 4-1 through Table 4-6.

Phoneme interpretation allows you to specify the preferred pronunciation 
of a word or phrase. Note that this command sets the left bracket ( [ ) 
and right bracket ( ] ) characters as phoneme delimiters. When phoneme 
interpretation is turned on with [:phoneme on], all text and characters 
that appear between brackets are interpreted as phonemic text and are 
pronounced as such. For example, to say the word associate, embed the 
phonemic string [axs ' owshiyeyt ] in the text string. The pronunciation 
of the phonemic string differs depending on whether phoneme 
interpretation is on or off. 

When phoneme interpretation is on, additional attributes can be associated 
with the phoneme text. For information on how to code a phoneme 
sequence to produce musical sounds, refer to Chapter 4. For a complete list of 
stress and syntactic symbols that can be used with phoneme text, 
see Table 4-7 and Table 4-8.

Note:
     Arpabet mode is a 2-character system. All single character symbols must 
     be followed by a space so that faulty translations do not occur. Consider 
     the phonemic representation of "whitehorse,"  
     [* w 'ayt hxowr s ]. The letter "t" in this phonemic representation must 
     be followed by a space, so that it is not interpreted as part of the 
     phonemic symbol [th] in the representation of "whitehorse."

SYNTAX: [:phoneme XX1 XX2 YY] or [:phoneme arpabet speak on]
ABBREVIATION: [:phon XX1 XX2 YY]
ALTERNATE COMMAND: None

OPTIONS:
	 arpabet - Set phonetic interpretation to arpabet alphabet. 
		   (Currently, this option is the only alphabet allowed.)

	 speak - If phoneme interpretation is on, speak encountered phonemes. 
		 The speak option is ignored if phoneme interpretation is off.

	 silent - If phoneme interpretation is on, do not speak encountered 
		  phonemes. The silent option is ignored if phoneme 
		  interpretation is off.

PARAMETERS:
	    On - Set phoneme interpretation on
	    Off - Set phoneme interpretation off

DEFAULT: Phonetic interpretation is off

EXAMPLES:
	  [:phoneme arpabet speak on] [axs 'owshiyeyt] associate
	  [:phoneme speak on] [axs 'owshiyeyt] associate
	  [:phoneme on] [axs 'owshiyeyt] associate
	  [:phoneme speak off] [axs 'owshiyeyt] pronounced as axsociate
	  [:phoneme off] [axs 'owshiyeyt] pronounced as axsociate
	  [:phoneme silent off] [axs 'owshiyeyt] pronounced as axsociate
	  [:phoneme silent on] [axs 'owshiyeyt] associate not spoken

Note: 
     Make sure that you use a right bracket ( ] ) to end the phonemic symbols. 
     If you do not, any normal text appearing after the phonemic symbols 
     sounds garbled. One right bracket is sufficient to close phonemic mode. 
     It is sometimes useful to begin a text file with a right bracket ( ] ) 
     to ensure that text is not interpreted phonemically. A command sequence 
     consisting of a left bracket followed by a colon ( [: ) is always 
     interpreted as the beginning of a command.
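
An application can guard against the unterminated bracket described in 
the note above by checking its text before sending it. The Python sketch 
below is one simple way to do so; the function name ends_inside_brackets 
is hypothetical, and the check relies only on the statement that a single 
right bracket is sufficient to close phonemic mode.

```python
def ends_inside_brackets(text):
    """Return True if text ends with a '[' that was never closed by ']'."""
    inside = False
    for ch in text:
        if ch == "[":
            inside = True
        elif ch == "]":
            # One right bracket is enough to close an open bracket.
            inside = False
    return inside
```

If the function returns True, appending a right bracket ( ] ) before the 
next piece of normal text avoids garbled speech.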
 

2.9

Command: Pitch [:pitch]

The Pitch command raises, by the value specified, the frequency of 
uppercase letters spoken in typing mode using the typing table 
(spoken one letter at a time). The default frequency difference 
between spoken lowercase and uppercase letters is 35 Hz. 
The frequency difference enables users to distinguish between uppercase 
and lowercase letters. You can return the pitch increment for 
uppercase letters to the default value by specifying the command [:pitch 35] 
or by restarting Speak.

DECtalk adds the value of the argument, DD (in hertz), as a pitch 
increment to the uppercase letters in the next phoneme string it processes. 
Because the Pitch command is asynchronous, place a Sync command 
in the character stream after the Pitch command to ensure that the 
Pitch command is processed before the letters that follow it 
in the buffer you are using.

SYNTAX: [:pitch DD]
ABBREVIATION: none
ALTERNATE COMMAND: none
OPTIONS: none
PARAMETERS: frequency in hertz
DEFAULT VALUE: 35

EXAMPLE: [:pitch 60] bBcCdD [:pitch 35] eEfFgGhH
 

2.10

Command: Pronounce [:pronounce]

The Pronounce command determines the type of pronunciation for the 
word immediately following this command. This command is synchronous.

Use the [:pronounce alternate] command to obtain an alternative 
pronunciation for a word. See the Homograph tables in Chapter 4 
for examples of primary and alternate pronunciations of words. Using the 
word wind as an example, the primary pronunciation is w ' ihn d, 
as in 'the wind is blowing'. The alternate pronunciation, denoted 
by [:pronounce alternate] wind, is w ' ayn d, as in 'wind up the 
top'.

Use the [:pronounce name] command to pronounce a word as a proper name. 
First names, last names, street names, and place names are all examples 
of proper names.

SYNTAX: [:pronounce XX]
ABBREVIATION: [:pron XX]
ALTERNATE COMMAND: none

OPTIONS:
	 alternate - Uses the alternate pronunciation.
	 primary - Uses the primary pronunciation.
	 name - Uses the proper name pronunciation.

PARAMETERS: none
DEFAULT: Uses the primary pronunciation

EXAMPLE: Terry [:pronounce name] Doucette played [:pronounce primary] bass in the band.


2.11

Command: Punctuation [:punct]

The Punctuation command lets you specify how the DECtalk TTS engine 
treats punctuation marks when it encounters them in text. This command 
is synchronous. The four options of the Punctuation command are:

* none - No punctuation is spoken.
* some - Text is read normally, and punctuation marks are used to mark 
  pauses, changes in pitch, and so on.
* all - All punctuation is spoken; for example, "," is spoken as "comma."
* pass - Turns off all special punctuation processing. For 
  example, periods as part of file names are not spoken.

The pass option is useful in proofreading, as well as in applications 
where special characters are encountered, such as in a computer program. 
See Chapter 6 for more information on preprocessor parsing for treatment 
of punctuation.

Note: 
When the [:punct none] command is used, no punctuation is pronounced, 
although dollar amounts and percentages still are processed.

SYNTAX: [:punct XX]
ABBREVIATION: [:punc XX]
ALTERNATE COMMAND: none
OPTIONS: 
	 none - Punctuation symbols  and some other symbols are not spoken 
		as words; all punctuation is treated as text breaks

	 some - Text is read normally; clause boundary punctuation is not 
		spoken, but all symbols such as $ are spoken as words.

	 all - All punctuation symbols and other symbols are spoken as words.
	 pass - All special punctuation processing is turned off.

PARAMETERS: none
DEFAULT: [:punct some]

EXAMPLE: [:punct none]


2.12

Command: Rate Selection [:rate]

The Rate Selection command sets the speaking rate in the DECtalk TTS engine. 
The rate can range from 75 to 600 words per minute. All values 
outside the range of 75 to 600 default to the nearest legal value. 
For example, if you select a speaking rate of [:rate 880] or 880 words 
per minute, the DECtalk TTS engine defaults to 600 words per minute. The 
DECtalk engine starts at a rate of 200 words per minute by default. 
This command is asynchronous.

SYNTAX: [:rate DD]
ABBREVIATION: none
ALTERNATE COMMAND: none
OPTIONS: none
PARAMETERS: Rate in words per minute
DEFAULT: 200 words per minute

EXAMPLE: [:rate 400]
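
The range behavior described above can be sketched in Python as follows. 
The function name rate_command is hypothetical; the clamping mirrors, on 
the application side, what the engine is described as doing with values 
outside 75 to 600.

```python
RATE_MIN_WPM = 75
RATE_MAX_WPM = 600


def rate_command(words_per_minute):
    """Build a [:rate DD] command, clamping DD to the legal range."""
    clamped = max(RATE_MIN_WPM, min(RATE_MAX_WPM, int(words_per_minute)))
    return f"[:rate {clamped}]"
```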


2.13

Command: Say [:say]

The Say command specifies when speaking begins. The Say command options 
are speak on end of clause (clause), speak on end of word (word), speak on end 
of letter (letter), speak on end of letter with control characters ignored 
(filtered letter), and speak on end of line (line). This command 
is synchronous. 

In the DECtalk TTS engine, each clause, word, or letter is spoken as it is 
queued. In word and letter mode, DECtalk does not need to wait for a 
clause terminator to begin speaking. Word mode is similar to letter mode 
except that text is spoken a word at a time: a space after a character or 
string of characters causes that string to be spoken. This mode interacts 
with the Rate Selection command, so you can increase or decrease the rate 
at which the text is spoken.

In clause mode, speaking starts when the DECtalk TTS engine is sent a 
clause terminator (period, comma, exclamation point, or question mark) 
followed by a space. There is no time-out limit. This is the normal mode, 
in which text is spoken a phrase, clause, or sentence at a time. Clause 
mode is the default mode.

SYNTAX: [:say XX]
ABBREVIATION: none
ALTERNATE COMMAND: none

OPTIONS:
	 clause - Speak on end of clause.
	 word - Speak on end of word.
	 letter - Speak on end of letter.
	 
	 filtered letter - Speak on end of letter, ignoring control 
			   characters, such as "vertical tab" 
			   and "line feed"

	 line - Speak on end of line.

PARAMETERS: none
DEFAULT: [:say clause]

EXAMPLE: [:say word]

Note 
     In letter mode, the left bracket is spoken only after the next character 
     is entered because DECtalk needs to know if this is the beginning of 
     a new command.


2.14

Command: Sync [:sync]

The Sync command provides coordination between an application program 
and the DECtalk TTS engine. This command is synchronous.

When the Sync command is sent to DECtalk, the DECtalk TTS engine finishes 
speaking any pending text before processing the next text command. 
This command also acts as a clause boundary, in the same way as a comma, 
period, exclamation point, question mark, semicolon, or colon followed by a 
space.

Some DECtalk in-line commands are asynchronous. To ensure that 
these commands are processed before the text following them, place a 
Sync command after an asynchronous command that you want to synchronize. In 
the case of the Pause command, you need to place a Sync command before the 
Pause command to guarantee that all text preceding the Pause command is 
processed before the pause occurs.

SYNTAX: [:sync]
ABBREVIATION: none
ALTERNATE COMMAND: none
OPTIONS: none
PARAMETERS: none
DEFAULT: N/A

EXAMPLE: My name is Bill S [:sync]
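
As an illustration of placing a Sync command after an asynchronous 
command, the Python sketch below appends [:sync] to a command string. The 
function name synchronized is hypothetical, and the Pitch command is used 
only because it is described as asynchronous in section 2.9.

```python
def synchronized(command):
    """Append [:sync] so the command is processed before the text after it."""
    return command + "[:sync]"


# The pitch increment takes effect before the letters that follow it.
text = synchronized("[:pitch 60]") + " bBcCdD"
```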


2.15

Command: Tone [:tone]

The Tone command is a synchronous command that generates sounds of different 
frequencies and lengths based on the parameters you set. This command 
allows you to make a wide variety of sounds for purposes such as notification 
or warnings. Regular tones can also be used for other purposes, 
such as indicating a margin bell.

SYNTAX: [:tone DD, DD]
ABBREVIATION: none
ALTERNATE COMMAND: none
OPTIONS: none

PARAMETERS: 
	    Frequency - Sets the frequency of the tone (first value)
	    Duration - Tone duration in milliseconds (second value)

DEFAULT: none

EXAMPLE: [:tone 500,500]
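
A notification sound made of several tones can be built by sending Tone 
commands one after another. The Python sketch below assumes the first 
value is the frequency and the second is the duration in milliseconds, 
following the order of the PARAMETERS list; the function names 
tone_command and alert are hypothetical.

```python
def tone_command(frequency, duration_ms):
    """Build a [:tone DD,DD] command (frequency first, then duration)."""
    return f"[:tone {int(frequency)},{int(duration_ms)}]"


def alert(pairs):
    """Concatenate one Tone command per (frequency, duration_ms) pair."""
    return "".join(tone_command(f, d) for f, d in pairs)


# A short two-tone warning: a low tone followed by a higher one.
two_tone = alert([(500, 200), (800, 200)])
```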


2.16

Command: Volume [:volume]

The Volume command is a synchronous command that changes the volume 
settings. The DECtalk TTS engine changes the audio system gain, in 
decibels (dB), over a range from 0 to 140. Increments or decrements 
of 10 to 20 provide a perceptible increase or decrease in volume. 

The XX option selects which volume source is adjusted, that is, 
ATT or TONE. The DD parameter sets the volume level and must be in the 
range from 0 to 140 for the ATT option and in the range from 
0 to 100 for the TONE option.

The ATT option sets the overall volume level of speech output. 
The TONE option sets the volume level of tones generated by the 
[:tone] command. For more information on the [:tone] command, refer 
to section 2.15.

SYNTAX: [:volume XX DD]
ABBREVIATION: [:vol XX DD]
ALTERNATE COMMAND: none
OPTIONS: 
  ATT - Sets the volume of speech.
  TONE - Sets volume level of tones generated by the [:tone] command.

DEFAULT: 100
EXAMPLE: [:volume att 30]
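
Because the legal range depends on the volume source, an application may 
want to check the level before sending the command. The Python sketch 
below keeps the level inside the ranges stated in this section; the 
function name volume_command is hypothetical, and the clamping is done by 
the sketch itself, not by any documented engine behavior.

```python
VOLUME_LIMITS = {"att": (0, 140), "tone": (0, 100)}


def volume_command(source, level):
    """Build a [:volume XX DD] command with DD kept inside the stated range."""
    source = source.lower()
    if source not in VOLUME_LIMITS:
        raise ValueError(f"unknown volume source: {source}")
    low, high = VOLUME_LIMITS[source]
    clamped = max(low, min(high, int(level)))
    return f"[:volume {source} {clamped}]"
```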



		    Chapter 3  -  Using In-Line Commands


3.1

This chapter provides an in-depth look at the DECtalk in-line 
commands, that is, the commands sent to the DECtalk USB that are 
parsed by the DECtalk TTS engine.

Topics include:
  Changing Rhythm, Stress, and Intonation
  Optimizing the Quality of Spoken Text
  Index Marks for Speech Status
  Speaking Rate
  Adjusting Period and Comma Pause Duration
  Text-Tuning Example
  Avoiding Common Errors


3.2

Changing Rhythm, Stress, and Intonation

The DECtalk TTS engine uses stress and syntactic symbols to control aspects 
of rhythm, stress, and intonation patterns within a spoken text buffer. 
These symbols include punctuation marks such as commas, periods, and 
parentheses. Punctuation marks are recognized by the DECtalk TTS engine as 
indicating special phrasing requirements.

Table 4-8 lists these symbols.

   In many applications, the listener might want to write down number 
   strings (such as prices or telephone numbers). Your application can 
   scan the text for strings of numbers and, when they are found, send 
   them to the DECtalk TTS engine in a way that includes pauses at critical 
   locations. For example:

      The number is, 1 (800) 5 5 5, 1 2 3 4. [:rate 120]

      That is, [_<300>] 1 (800), [_<500>] 5 5 5,
      [_<900>] 1 2 3 4. [:rate 180].

   Refer to Table 4-1 through Table 4-6 for a complete list of phoneme 
   symbols, including the silent underscore ( _ ) symbol. See Chapter 
   4 for the syntax to add duration and pitch to phoneme text.

   The spaces between the numbers ensure that five five five is 
   spoken rather than five hundred fifty five. You can also use the 
   [:mode spell on] command to produce the same results. The slower 
   speaking rate, [:rate 120], and the silence phonemes, [_<300>], 
   [_<500>], [_<900>], of specified duration, were carefully selected 
   to allow enough time for the listener to write down the entire
   number. Silence phonemes were positioned after the commas (that is, 
   [_<300>] 1 (800), [_<500>]), to maintain appropriate intonation.

   As another example, if your application is required to speak sums of 
   money (such as bank balances or item costs), you might code the text 
   to say:

      Your balance is $244.05. That is, 2 4 4, [_<400>] point 0
      5, [_<400>] dollars.

   When spelling an item, your application might need to distinguish 
   the case of letters. Consider using the Pitch command (see Chapter 
   2) or different voices to distinguish between uppercase and 
   lowercase letters. For example:

      [:nf]Maynard [:nf]M[:nb]a y n a r d [:nf]Maynard.
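
The number-string example above can be sketched in code. This is an 
illustration only: the function names are hypothetical, the pause lengths 
and rates are copied from the example, and the sketch assumes the number 
has already been split into its parts and that phoneme interpretation is 
on so the silence phonemes are recognized.

```python
def spaced(digits: str) -> str:
    """Separate digits with spaces so each one is spoken individually."""
    return " ".join(digits)


def phone_number_text(country: str, area: str, exchange: str, line: str,
                      slow_rate: int = 120, normal_rate: int = 180) -> str:
    """Build a slowed-down repetition of a phone number with silence phonemes."""
    return (
        f"[:rate {slow_rate}] That is, "
        f"[_<300>] {country} ({area}), "
        f"[_<500>] {spaced(exchange)}, "
        f"[_<900>] {spaced(line)}. "
        f"[:rate {normal_rate}]"
    )


print(phone_number_text("1", "800", "555", "1234"))
# [:rate 120] That is, [_<300>] 1 (800), [_<500>] 5 5 5, [_<900>] 1 2 3 4. [:rate 180]
```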


3.3

Optimizing the Quality of Spoken Text

DECtalk can generally choose correct pronunciations by itself. 
For example, consider the following sentences:

   He produced a lot of REFUSE. He REFUSEd the produce.

   He INSERTS 5 INSERTS per minute. He DELIBERATEd DELIBERATEly for a 
   long time.

Generally, DECtalk Software correctly selects the proper homograph. 
However, in certain unique contexts, the following user intervention 
may be needed:

  Replace the correct spelling of the word with a clever misspelling.
      I red yesterday that . . .

  Spell the word phonetically.
      I [r 'ehd ] yesterday that . . .

Note:
     For words that have two pronunciations (homographs), see the Homograph 
     tables in Chapter 4.

Additionally, use the following steps to optimize spoken text.

1. If a word is a compound, use a hyphenated spelling to help DECtalk 
    see the two parts of the compound.

      The slide-show host . . .

2. Replace the text version by a phonemic string. Use the commands and 
   phonemic symbols, but make sure to place the lexical stress pattern 
   correctly.

3. Now that each word has been pronounced in the best possible way, 
   listen to the total sentence rhythm and accent pattern. If it is not 
   right, follow these steps.

   (a) If it sounds as if there should be a short pause in a specific 
       sentence location, but DECtalk says the sentence without 
       a pause, insert a comma between the words in question.

   (b) If the wrong word is emphasized in the sentence, emphasize the 
       word that is supposed to take the emphasis with the correct 
       stress symbols.

       The ["] younger man is the trouble-maker, not the older one.

   (c) Use the stress symbols slash [/], backslash [\], and slash and 
       backslash [/ \] to make final adjustments. Refer to Table 4-7 
       for a complete list of stress symbols.


3.4

Index Marks for Speech Status

By embedding an Index Mark command in text, you can provide non-blocking
synchronization. The DECtalk TTS engine can use index marks to track exactly 
when the text was spoken. An index mark binds itself to the next 
speech sound, so you MUST always include a sound after the Index Mark 
command. For example, if you send "Hello. [:index mark 5]", the DECtalk 
TTS engine waits until the next sound before sending the mark to the 
application. Index marks cannot be placed in the middle of a word.
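
One way to respect the rule that every index mark needs a following 
sound is to place each mark in front of the sentence it tracks. The 
sketch below assumes the text is already split into sentences; the 
function name with_index_marks is hypothetical, and the valid range of 
mark numbers is not checked here.

```python
from typing import Iterable


def with_index_marks(sentences: Iterable[str], start: int = 1) -> str:
    """Prefix each sentence with an Index Mark command.

    Each mark is immediately followed by speakable text, and marks are
    only placed between sentences, never inside a word.
    """
    parts = []
    for number, sentence in enumerate(sentences, start):
        text = sentence.strip()
        if not text:
            raise ValueError("an index mark must be followed by speakable text")
        parts.append(f"[:index mark {number}] {text}")
    return " ".join(parts)


print(with_index_marks(["Hello.", "Goodbye."]))
# [:index mark 1] Hello. [:index mark 2] Goodbye.
```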


3.5

Speaking Rate -

The default speaking rate is 200 words per minute (WPM). Valid 
speaking rates for the Rate command range from 75 to 600 WPM. Rates 
specified outside this range are limited to the nearest legal value. 
The following commands illustrate speaking rates from very slow to 
very fast:

  [:rate 120]

   Although the slowest possible rate is 75 WPM, 120 WPM is ideal for
   information such as phone numbers, which need to be copied down by a
   listener. Unless the listener is actually copying down each numeral, 
   it might be frustrating to listen to extended speech at slow rates.

  [:rate 160]

   This rate is moderate (160 WPM). It sounds a little slow, but is 
   sometimes preferred when DECtalk Software is speaking math equations 
   or long lists of acronyms.

  [:rate 200]

   This is the default rate for DECtalk Software (200 WPM). This rate 
   is ideal for listening to continuous text under optimal conditions.

  [:rate 240]

   Experienced listeners may prefer to skim material at this rate 
   (240 WPM). Inexperienced listeners may not understand every word at 
   this rate.

  [:rate 350]

   This rate (350 WPM) is too fast to follow, but can be used to 
   quickly scan sections of text.

  [:rate 550]

   This rate (550 WPM) is the fastest usable rate. It is too fast for 
   most people to follow, but can be used to scan text very quickly.

Changes in the speaking rate influence the duration and the number of 
pauses in text, as well as the duration of individual phonemes. At 
rates below 140 WPM, the DECtalk TTS engine inserts pauses at all phrase 
boundaries and lengthens phonemes near the ends of phrases. 
At rates faster than 240 WPM, the DECtalk TTS engine deletes all pauses and 
shortens phonemes.
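
As a small sketch, an application can apply the same limiting rule before 
building a Rate command, so the value it logs or displays matches what 
the engine will use. The function name rate_command is hypothetical.

```python
MIN_RATE_WPM = 75   # slowest valid rate per this section
MAX_RATE_WPM = 600  # fastest valid rate per this section


def rate_command(wpm: int) -> str:
    """Return a Rate command, limiting wpm to the nearest legal value."""
    clamped = max(MIN_RATE_WPM, min(MAX_RATE_WPM, wpm))
    return f"[:rate {clamped}]"


print(rate_command(120))  # [:rate 120]
print(rate_command(50))   # [:rate 75]
print(rate_command(900))  # [:rate 600]
```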


3.6

Adjusting Period and Comma Pause Durations -

At the default speaking rate of 200 WPM, DECtalk pauses about 
half a second after a period in the text and about a sixth of a second 
after a comma. When you change the speaking rate, the pause durations 
are automatically adjusted.

In some situations, you might prefer to change the pause after a period 
or a comma without changing the speaking rate. For example, to get 
the DECtalk TTS engine to read a list of words with a longer pause after 
each (to allow the listener to write them down), use the Period Pause 
command or the Comma Pause command.

  [:period 4500] apple. banana. strawberry.

   This command adds a period pause of 4,500 ms (4.5 seconds) to the 
   standard half-second pause that occurs after a period in text. The 
   total pause between words is about five seconds. The accepted range 
   for the period pause parameter is -380 to 30,000 ms. A negative value 
   for this parameter shortens the standard period pause.

  [:comma 4800] apple, banana, strawberry,

   This command adds a comma pause of 4,800 ms (4.8 seconds) to the 
   standard sixth of a second pause that occurs after a comma in the 
   text at normal speaking rate. The total pause between words 
   separated by a comma is about five seconds. The accepted range for 
   the comma pause parameter is -40 to 30,000 ms. Values specified 
   outside this range are limited to the nearest legal value.

  [:pp 0 :cp 0]

   This command resets the period pause and comma pause to their 
   normal default values.
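
The arithmetic above (added pause plus standard pause equals total pause) 
can be sketched as follows. The helper names are hypothetical. The base 
values assume the default 200 WPM rate: about 500 ms after a period and 
about 167 ms (a sixth of a second) after a comma. Limiting the period 
value to its range is an assumption modeled on the comma behavior 
described above.

```python
PERIOD_BASE_MS = 500          # standard period pause at 200 WPM (approximate)
COMMA_BASE_MS = 167           # standard comma pause at 200 WPM (approximate)
PERIOD_RANGE = (-380, 30000)  # accepted [:period] parameter range
COMMA_RANGE = (-40, 30000)    # accepted [:comma] parameter range


def _limit(value: int, low: int, high: int) -> int:
    """Limit value to the nearest legal value in low..high."""
    return max(low, min(high, value))


def period_pause_command(total_ms: int) -> str:
    """Return a Period Pause command giving roughly total_ms after a period."""
    return f"[:period {_limit(total_ms - PERIOD_BASE_MS, *PERIOD_RANGE)}]"


def comma_pause_command(total_ms: int) -> str:
    """Return a Comma Pause command giving roughly total_ms after a comma."""
    return f"[:comma {_limit(total_ms - COMMA_BASE_MS, *COMMA_RANGE)}]"


print(period_pause_command(5000))  # [:period 4500]
print(comma_pause_command(5000))   # [:comma 4833]
```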


3.7

Text-Tuning Example -

Although the DECtalk TTS engine allows for natural text-to-speech synthesis, 
the quality of speech can often be enhanced by giving it a more natural 
flow. Much of this tuning involves strategic placement of commas and 
periods, which tell the engine to pause. Spoken language and 
written text are different, because written text generally does not 
contain information about pausing.

The text that follows is presented twice: the first time as originally 
written, and the second time after phonemic and textual fixes were 
applied. For a complete list of stress and syntactic symbols, refer to 
Table 4-7 and Table 4-8.

Original Version

[:np] A California Shaggy Bear Tale for Seven DECtalk Voices
by Dennis Klatt
[:np] Once upon a time, there were three bears.
They lived in the great forest and tried to adjust to modern times.
[:nh] I'm papa bear. I love my family but I love honey best.
[:nb] I'm mama bear. Being a mama bear is a drag.
[:nk] I'm baby bear and I have trouble relating to all of the demands 
of older bears.
[:np] One day, the three bears left their condominium to search for 
honey. While they were gone, a beautiful young lady snuck into the 
bedroom through an open window.
[:nw] My name is Wendy. My purpose in entering this building should be
clear. I am planning to steal the family jewels.
[:np] Hot on her trail was the famous police detective, Frank.
[:nf] Have you seen a lady carrying a laundry bag over her shoulder?
[:np] A woman kneeling with her left ear firmly placed against a large 
rock responded.
[:nu] No. No one passed this way. I've been listening for earthquakes 
all morning, but have only spotted three bears searching for honey.


3.8

Revised Version -

In this section, text from the original example is enhanced with 
DECtalk embedded commands. Phoneme interpretation is turned on 
to allow the stress and syntactic symbols to be translated. See the 
Phoneme Interpretation command for more information.

Turn on phoneme interpretation
[:phoneme arpabet speak on]

Add periods to add brief pauses after the title and author.
[:np] A California Shaggy Bear Tale for Seven DECtalk Voices.
By Dennis Klatt.
[:np] Once upon a time, there were three bears.
They lived in the great forest and tried to adjust to modern times.

Add commas to increase pause length and quotation marks for emphatic
stress.
[:nh] I'm papa bear. I love my family, but I love ["]honey best.
[:nb] I'm mama bear. Being a mama bear is a drag.
[:nk] I'm baby bear and I have trouble relating to all of the demands 
of older bears.
[:np] One day, the three bears left their condominium to search for 
honey. While they were gone, a beautiful young lady snuck into the 
bedroom through an open window.
[:nw] My name is Wendy. My purpose in entering this building should be
clear. I am planning to steal the family jewels.
[:np] Hot on her trail was the famous police detective, Frank.
[:nf] Have you seen a lady carrying a laundry bag over her shoulder?

Add commas to increase pause length and phrasing.
[:np] A woman, kneeling with her left ear firmly placed against a 
large rock, responded.

If the selected language supports pitch rise and fall symbols [/ \] 
and emphatic stress symbols ["], use them to add pitch control and 
emphatic stress.
[:nu] ["]No. No [/]one passed this [/ \]way. I've been listening for
["]earthquakes all morning, but have only spotted three bears searching 
for honey.


3.9

Avoiding Common Errors -

When sending text and commands to the DECtalk TTS engine, try to avoid 
making common errors by doing the following:

  When you make voice-selection changes, always return to the default 
   voice you have chosen. If you forget to return DECtalk to 
   the default voice after using one of the other voices, all future 
   text uses the currently selected voice.

  Enter a right bracket ( ] ) at the beginning of your text if you use 
   the Phoneme Interpretation command.

  If the [:phoneme arpabet speak on] command is entered to allow 
   phonemic input, it is possible for the DECtalk TTS engine to enter 
   phonemic mode unintentionally.

   -  If the text being spoken contains an unexpected left bracket 
      ( [ ), all text after the left bracket ( [ ) is interpreted as 
      phoneme text. In the following example, apple, banana, 
      strawberry is interpreted as phoneme text.

      [:phoneme arpabet speak on] Here is the list [apple, banana, 
      strawberry].

   -  If you forget to enter a right ( ] ) bracket after a phonemic 
      entry, all text after the missing right bracket ( ] ) is 
      interpreted as phoneme text. In the following example, Ladies 
      and Gentlemen is interpreted as phoneme text.

      [:phoneme arpabet speak on Ladies and Gentlemen
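
Both bracket problems described above can be caught before text is sent. 
The following is a minimal sketch, not a DECtalk feature: the function 
name bracket_report is hypothetical, and it assumes that any bracketed 
span not starting with a colon would be read as phoneme text.

```python
from typing import List, Optional, Tuple


def bracket_report(text: str) -> Tuple[List[str], Optional[int]]:
    """Scan text for bracket problems.

    Returns (phoneme_spans, unclosed_index):
      phoneme_spans  - contents of closed [...] spans that are not commands
                       (commands start with ':'), so they can be reviewed
      unclosed_index - position of a '[' that is never closed, or None
    """
    spans: List[str] = []
    open_at: Optional[int] = None
    for i, ch in enumerate(text):
        if ch == "[" and open_at is None:
            open_at = i
        elif ch == "]" and open_at is not None:
            body = text[open_at + 1:i]
            if not body.startswith(":"):
                spans.append(body)
            open_at = None
    return spans, open_at


print(bracket_report("[:phoneme arpabet speak on] Here is the list "
                     "[apple, banana, strawberry]."))
# (['apple, banana, strawberry'], None)
print(bracket_report("[:phoneme arpabet speak on Ladies and Gentlemen"))
# ([], 0)
```

A right bracket with no open left bracket, such as the one recommended at 
the beginning of the text, is ignored by this check.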



		Chapter 4 - DECtalk TTS Reference Tables

DECtalk TTS reference tables include:
  Phonemic Symbols Listed By Language
  Stress and Syntactic Symbols
  Phonemes Listed in Unicode Sequence
  Pitch and Duration of Tones
  Homographs


4.1

Phonemic Symbols Listed By Language -

Phonemic symbols can be used to replace words that are spoken 
incorrectly. See the Phoneme Interpretation command in Chapter 2 for 
information on how to use phonemic symbols.

The DECtalk TTS engine provides a unified phoneme set for all supported 
languages, allowing you to specify phonemes from different languages 
within the context of your current language.

This section lists the phonemic symbols DECtalk uses for each 
supported language, as follows:

  Table 4-1 Phonemic Symbols - U.S. English
  Table 4-2 Phonemic Symbols - U.K. English
  Table 4-3 Phonemic Symbols - Castilian Spanish
  Table 4-4 Phonemic Symbols - Latin American Spanish
  Table 4-5 Phonemic Symbols - German
  Table 4-6 Phonemic Symbols - French

Some dictionaries put the stress symbol after the vowel nucleus or at 
the start of the syllable. The DECtalk TTS engine requires that the stress 
symbol appear immediately before a syllable nucleus. Table 4-7 lists 
the supported stress symbols.

Phonemes can also be given duration and pitch attributes to create 
special effects, such as singing. See Table 4-10 for additional 
information.

Note:
     Arpabet mode is a 2-character system. All single character symbols must 
     be followed by a space so that faulty translations do not occur. Consider 
     the phonemic representation of whitehorse, [* w ayt hxowr s ]. The 
     letter t in this phonemic representation must be followed by a space, 
     so that it is not interpreted as part of the phonemic symbol [th] in the
     representation of whitehorse.
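
The spacing rule in this note can be applied mechanically when phoneme 
strings are assembled from a list of symbols. The sketch below is an 
illustration only; the function name join_arpabet is hypothetical, and 
it does not check that each symbol exists in the phoneme tables.

```python
from typing import Iterable


def join_arpabet(symbols: Iterable[str]) -> str:
    """Join Arpabet symbols into a bracketed phoneme string.

    Every single-character symbol is followed by a space so that it
    cannot merge with the next symbol (for example, t + hx must not
    be read as th).
    """
    pieces = []
    for symbol in symbols:
        if len(symbol) not in (1, 2):
            raise ValueError(f"expected 1 or 2 characters, got {symbol!r}")
        pieces.append(symbol + " " if len(symbol) == 1 else symbol)
    return "[" + "".join(pieces) + "]"


print(join_arpabet(["*", "w", "ay", "t", "hx", "ow", "r", "s"]))
# [* w ayt hxowr s ]
```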


Table 4-1 Phonemic Symbols - U.S. English
______________________________________________________________________
ASCKY  DT     DT        Example  Arpabet  Unicode  Unicode 
       index  internal                             Character Name
______________________________________________________________________
_      0      SIL      (silence) _        U+5F     Low line
i      1      US_IY     bEAn     iy       U+69     Latin small 
						   letter I
I      2      US_IH     pIt      ih       U+26A    Latin small letter 
						   Capital I
e      3      US_EY     bAY      ey       U+65     Latin small letter E
E      4      US_EH     pEt      eh       U+25B    Latin small letter 
						   open E
@      5      US_AE     pAt      ae       U+E6     Latin small letter 
						   AE
a      6      US_AA     pOt      aa       U+251    Latin small letter 
						   Alpha
A      7      US_AY     bUY      ay       U+61,    Latin small letter 
					  U+26A    A + Latin small 
						   capital I
W      8      US_AW     brOW     aw       U+61,    Latin small letter 
					  U+28A    A + Latin small 
						   capital Upsilon
^      9      US_AH     pUtt     ah       U+28C    Latin small letter 
						   turned V
c      10     US_AO     bOUght   ao       U+254    Latin small letter 
                                                   open O
o      11     US_OW     nO       ow       U+6F,    Latin small letter 
						   O +
					  U+28A    Latin small letter 
						   Upsilon
O      12     US_OY     bOY      oy       U+254,   Latin small letter 
					  U+26A    open O + Latin small 
						   letter capital I
U      13     US_UH     pUt      uh       U+28A    Latin small letter 
						   Upsilon                        
u      14     US_UW     bOOn     uw       U+75     Latin small letter U
R      15     US_RR     anothER  rr       U+25A    Latin small letter 
						   Schwa with hook
Y      16     US_YU     cUte     yu       U+6A,    Latin small letter 
					  U+75     J + Latin small 
						   letter U
x      17     US_AX     About    ax       U+259    Latin small letter 
						   Schwa
|      18     US_IX     kissEs   ix       U+268    Latin small letter 
						   I with stroke
I      19     US_IR     pEEr     ir       U+69,    Latin small letter 
					  U+2B4    I + modifier letter 
						   small turned R
R      20     US_ER     pAIr     er
a      21     US_AR     bARn     ar       U+251,   Latin small letter 
                                          U+2B4    Alpha + modifier 
                                                   letter small turned R
______________________________________________________________________


Table 4-1 Phonemic Symbols - U.S. English Continued...
______________________________________________________________________
ASCKY  DT     DT        Example  Arpabet  Unicode  Unicode 
       index  internal                             Character Name
______________________________________________________________________
c      22     US_OR     bOrn     or       U+254,   Latin small letter 
					  U+2B4    open O + modifier 
						   letter small turned 
						   R
U      23     US_UR     pOOr     ur       U+28A,   Latin small letter 
					  U+2B4    Upsilon + modifier 
						   letter small
						   turned R
w      24     US_W      Why      w        U+77     Latin small letter 
						   W                                      
Y      25     US_Y      Yank     yx       U+6A     Latin small letter 
						   J
r      26     US_R      Rat      r        U+52     Latin capital letter 
						   R
l      27     US_LL     Lad      l        U+6C     Latin small letter 
						   L
h      28     US_HX     Had      hx       U+68     Latin small letter 
						   H
R      29     US_RX     coRe     rx       U+279    Latin small letter 
						   turned R with hook
l      30     US_LX     untiL    lx       U+26B    Latin small letter 
                                                   L with middle tilde
m      31     US_M      Mad      m        U+6D     Latin small letter 
						   M
n      32     US_N      Nat      n        U+6E     Latin small letter 
						   N
G      33     US_NX     baNG     nx       U+14B    Latin small letter 
						   Eng
L      34     US_EL     dangLe   el       U+6C,    Latin small letter 
                                          U+329    L + combining 
                                                   vertical line below
D      35     US_DZ     wiDth    dz       U+64,    Latin small letter 
					  U+32F    D + combining 
						   inverted breve below
N      36     US_EN     burdeN   en       U+6E,    Latin small letter 
					  U+329    N + combining 
						   vertical line below
f      37     US_F      Fat      f        U+66     Latin small letter 
						   F
v      38     US_V      Vat      v        U+76     Latin small letter 
						   V
T      39     US_TH     THin     th       U+3B8    Greek small letter 
						   Theta
D      40     US_DH     THen     dh       U+F0     Latin small letter 
						   Eth
s      41     US_S      Sap      s        U+73     Latin small letter 
						   S
z      42     US_Z      Zap      z        U+7A     Latin small letter 
						   Z
______________________________________________________________________


Table 4-1 Phonemic Symbols - U.S. English Continued...
______________________________________________________________________
ASCKY  DT        DT     Example  Arpabet  Uni-   Unicode Character  
       index     internal                 code   Name
______________________________________________________________________
S      43        US_SH  SHeep    sh       U+283  Latin small letter 
						 Esh
Z      44        US_ZH  meaSure  zh       U+292  Latin small letter 
						 Ezh
p      45        US_P   Pat      p        U+70   Latin small letter P
b      46        US_B   Bad      b        U+62   Latin small letter B
t      47        US_T   Tack     t        U+74   Latin small letter T
d      48        US_D   Dad      d        U+64   Latin small letter D
k      49        US_K   Cad      k        U+6B   Latin small letter K
g      50        US_G   Game     g        U+67   Latin small letter G
&      51        US_DX  riDer    dx       Internal 
					  use only
Q      52        US_TX  baTTen   tx       U+74,  Latin small letter T  
					  U+294  + Latin letter  
						 glottal stop
q      53        US_Q   we eat   q        U+294  Latin letter glottal 
						 stop
C      54        US_CH  CHeap    ch       U+2A7  Latin small letter 
						 Tesh digraph
J      55        US_JH  Jeep     jh       U+2A4  Latin small letter 
						 Dezh digraph
F      56        US_DF  wriTer   df       Internal 
					  use only
       57        US_TZ           tz              Hebrew complement
       58        US_CZ           cz              Hebrew complement 
_____________________________________________________________________


Table 4-2 Phonemic Symbols - U.K. English
____________________________________________
ASCKY   DT      DT         Example   Arpabet
	index   internal
____________________________________________
_       0       SIL       (silence)  _
i       1       UK_IY      bEAn      iy
I       2       UK_IH      pIt       ih
e       3       UK_EY      bAY       ey
E       4       UK_EH      pEt       eh
@       5       UK_AE      pAt       ae
a       6       UK_AA      pOt       aa
A       7       UK_AY      bUY       ay
W       8       UK_AW      brOW      aw
^       9       UK_AH      pUtt      ah
c       10      UK_AO      bOUght    ao
o       11      UK_OW      nO        ow
O       12      UK_OY      bOY       oy
U       13      UK_UH      pUt       uh
u       14      UK_UW      bOOn      uw
R       15      UK_RR      anothER   rr
Y       16      UK_YU      cUte      yu
x       17      UK_AX      About     ax
|       18      UK_IX      kissEs    ix
I       19      UK_IR      pEEr      ir
R       20      UK_ER      pAIr      er
a       21      UK_AR      bARn      ar
c       22      UK_OR      bOrn      or
U       23      UK_UR      pOOr      ur
w       24      UK_W       Why       w
Y       25      UK_Y       Yank      yx
r       26      UK_R       Rat       r
l       27      UK_LL      Lad       l
h       28      UK_HX      Had       hx
___________________________________________


Table 4-2 Phonemic Symbols - U.K. English Continued...
___________________________________________
ASCKY   DT      DT         Example   Arpabet
	index   internal
___________________________________________
        29      UK_OH                oh
l       30      UK_LX      untiL     lx
m       31      UK_M       Mad       m
n       32      UK_N       Nat       n
G       33      UK_NX      baNG      nx
L       34      UK_EL      dangLe    el
D       35      UK_DZ      wiDth     dz
N       36      UK_EN      burdeN    en
f       37      UK_F       Fat       f
v       38      UK_V       Vat       v
T       39      UK_TH      THin      th
D       40      UK_DH      THen      dh
s       41      UK_S       Sap       s
z       42      UK_Z       Zap       z
S       43      UK_SH      SHeep     sh
Z       44      UK_ZH      meaSure   zh
p       45      UK_P       Pat       p
b       46      UK_B       Bad       b
t       47      UK_T       Tack      t
d       48      UK_D       Dad       d
k       49      UK_K       Cad       k
g       50      UK_G       Game      g
&       51      UK_DX      riDer     dx
Q       52      UK_TX      baTTen    tx
q       53      UK_Q       we eat    q
C       54      UK_CH      CHeap     ch
J       55      UK_JH      Jeep      jh
F       56      UK_DF      wriTer    df 
___________________________________________


Table 4-3 Phonemic Symbols - Castilian Spanish
___________________________________________
ASCKY   DT      DT         Example   Arpabet
	index   internal
___________________________________________
_       0       SIL       (silence)  _
	1       SP_A       Palabra   a
	2       SP_E       Leo       e
	3       SP_I       Hilo      i
	4       SP_O       Hola      o
	5       SP_U       Lunes     u
        6       SP_WX     (Rounded   wx
                           diphthong
                           semiv.)
        7       SP_YX     (Unround   yx
                           diphthong
                           semiv.)
        8       SP_RR      Rama      rr
        9       SP_L       Luna      l
        10      SP_LL      Calle     ll
        11      SP_M       Mamá      m
        12      SP_N       Nana      n
        13      SP_NH      Muñoz     nh
	14      SP_F       Feo       f
	15      SP_S       Casa      s
	16      SP_J       Caja      j
	17      SP_TH      Caza      th
	18      SP_BH      Haba      bh
	19      SP_DH      Hada      dh
	20      SP_GH      Haga      gh
	21      SP_YH      Yate      yh
			  (affricate)
        22      SP_P       Papá      p
	23      SP_B       Barco     b
	24      SP_T       Tela      t
___________________________________________


Table 4-3 Phonemic Symbols - Castilian Spanish Continued...  
___________________________________________
ASCKY   DT      DT         Example   Arpabet
	index   internal
___________________________________________
	25      SP_D       Dama      d
	26      SP_K       Casa      k
	27      SP_G       Gasa      g
	28      SP_CH      Charco    ch
	29      SP_Y       Haya      y
                          (fricative)
	30      SP_R       Sara      r
	31      SP_Q       ~n        q
			  (offglide)
	32      SP_Z       Desde     z
	33      SP_W       Hueso     w
	34      SP_NX      Mango     nx
	35      SP_V       Afgano    v
	36      SP_IX      ~n        ix
			  (offglide)
	37      SP_MX      Infierno  mx
			  (nf)   
	38      SP_PH      Observar  ph
___________________________________________


Table 4-4 Phonemic Symbols - Latin American Spanish
___________________________________________
ASCKY   DT      DT         Example   Arpabet
	index   internal
___________________________________________
_       0       SIL       (silence)  _
	1       LA_A       Palabra   a
	2       LA_E       Leo       e
	3       LA_I       Hilo      i
	4       LA_O       Hola      o
	5       LA_U       Lunes     u
        6       LA_WX     (Rounded   wx
                           diphthong
                           semiv.)
        7       LA_YX     (Unround   yx
                           diphthong
                           semiv.)
        8       LA_RR      Rama      rr
        9       LA_L       Luna      l
        10      LA_LL      Calle     ll
        11      LA_M       Mamá      m
        12      LA_N       Nana      n
        13      LA_NH      Muñoz     nh
	14      LA_F       Feo       f
	15      LA_S       Casa      s
	16      LA_J       Caja      j
	17      LA_TH      Caza      th
	18      LA_BH      Haba      bh
	19      LA_DH      Hada      dh
	20      LA_GH      Haga      gh
	21      LA_YH      Yate      yh
			  (affricate)
        22      LA_P       Papá      p
	23      LA_B       Barco     b
	24      LA_T       Tela      t
___________________________________________


Table 4-4 Phonemic Symbols - Latin American Spanish Continued...  
___________________________________________
ASCKY   DT      DT         Example   Arpabet
	index   internal
___________________________________________
	25      LA_D       Dama      d
	26      LA_K       Casa      k
	27      LA_G       Gasa      g
	28      LA_CH      Charco    ch
	29      LA_Y       Haya      y
                          (fricative)
	30      LA_R       Sara      r
	31      LA_Q       ~n        q
			  (offglide)
	32      LA_Z       Desde     z
	33      LA_W       Hueso     w
	34      LA_NX      Mango     nx
	35      LA_V       Afgano    v
	36      LA_IX      ~n        ix
			  (offglide)
	37      LA_MX      Infierno  mx
			  (nf)
	38      LA_PH      Observar  ph
____________________________________________


Table 4-5 Phonemic Symbols - German
___________________________________________
ASCKY   DT      DT         Example   Arpabet
	index   internal
___________________________________________
_       0       SIL       (silence)  _
	1       GR_A       mAnn      a
	2       GR_E       Englisch  e
	3       GR_AE      hAEtte    ae
	4       GR_EX      gabE      ex
	5       GR_I       mIt       i
	6       GR_O       pOst      o
	7       GR_OE      kOEnnen   oe
	8       GR_U       mUnd      u
	9       GR_UE      lUEcke    ue
	10      GR_AH      sAgen     ah
	11      GR_EH      gEben     eh
	12      GR_AZ      wAEhlen   az
	13      GR_IH      lIEb      ih
	14      GR_OH      mOnd      oh
	15      GR_OZ      mOEgen    oz
	16      GR_UH      hUt       uh
	17      GR_UZ      hUEten    uz
	18      GR_EI      klEId     ei
	19      GR_AU      hAUs      au
	20      GR_EU      hEUte     eu
	21      GR_AN      pENsion   an
	22      GR_IM      tIMbre    im
	23      GR_UM      parfUM    um
	24      GR_ON      fONdue    on
	25      GR_J       Ja        j
	26      GR_L       Luft      l
	27      GR_RR      Rund      rr
	28      GR_R       waR       r
___________________________________________


Table 4-5 Phonemic Symbols - German Continued...  
___________________________________________
ASCKY   DT      DT         Example   Arpabet
	index   internal
___________________________________________
	29      GR_H       Hut       h
	30      GR_M       Mut       m
	31      GR_N       NeiN      n
	32      GR_NG      riNG      ng
	33      GR_EL      nabEL     el
	34      GR_EM      grossEM   em
	35      GR_EN      badEN     en
	36      GR_F       Fall      f
	37      GR_V       Was       v
	38      GR_S       meSSen    s
	39      GR_Z       doSe      z
	40      GR_SH      SCHule    sh
	41      GR_ZH      Genie     zh
	42      GR_CH      niCHt     ch
	43      GR_KH      noCH      kh
	44      GR_P       Park      p
	45      GR_B       Ball      b
	46      GR_T       Turm      t
	47      GR_D       Dort      d
	48      GR_K       Kalt      k
	49      GR_G       Gast      g
	50      GR_Q       Be_amtet  q
	51      GR_PF      PFerd     pf
	52      GR_TS      Zahl      ts
	53      GR_DJ      Gin       dj
	54      GR_TJ      maTSCH    tj
	55      GR_KS      Extra     ks
___________________________________________


Table 4-6 Phonemic Symbols - French
___________________________________________
ASCKY   DT      DT         Example   Arpabet
	index   internal
___________________________________________
_       0       SIL       (silence)  _
	1       FR_A                 a
	2       FR_A3                a3
	3       FR_E2                e2
	4       FR_AU                au
	5       FR_E                 e
	6       FR_E1                e1
	7       FR_EU                eu
	8       FR_I                 i
	9       FR_O                 o
	10      FR_O6                o6
	11      FR_OU                ou
	12      FR_U                 u
	13      FR_AN                an
	14      FR_IN                in
	15      FR_ON                on
	16      FR_UN                un
	17      FR_AP                ap
	18      FR_L                 l
	19      FR_R                 r
	20      FR_W                 w
	21      FR_WU                wu
	22      FR_Y                 y
	23      FR_CH                ch
	24      FR_F                 f
	25      FR_J                 j
	26      FR_RX                rx
	27      FR_S                 s
	28      FR_V                 v
___________________________________________


Table 4-6 Phonemic Symbols - French Continued... 
___________________________________________
ASCKY   DT      DT         Example   Arpabet
	index   internal
___________________________________________
	29      FR_Z                 z
	30      FR_B                 b
	31      FR_D                 d
	32      FR_G                 g
	33      FR_K                 k
	34      FR_P                 p
	35      FR_T                 t
	36      FR_GN                gn
	37      FR_M                 m
	38      FR_N                 n
	39      FR_NG                ng
	40      FR_SG                sg
___________________________________________


4.2

Stress and Syntactic Symbols -

Table 4-7 and Table 4-8 list the stress and syntactic symbols 
supported by the DECtalk TTS engine. Phoneme interpretation must be turned 
on for the stress and syntactic symbols to work. Refer to the Phoneme 
Interpretation command description in Chapter 2 for more information.

Table 4-7 Stress Symbols
_______________________________________________________   
Symbol  Name               Indicates             Unicode
_______________________________________________________
'       Apostrophe         primary stress        U+27
`       Grave accent       secondary stress      U+60
"       Quotation mark     emphatic stress       U+22
/       Slash              pitch rise            U+2F
\       Backslash          pitch fall            U+5C
_______________________________________________________

Table 4-8 Syntactic Symbols
_______________________________________________________
Symbol  Name               Indicates             Unicode
_______________________________________________________
-       Hyphen             syllable boundary     U+2D
*       Asterisk           morpheme boundary     U+2A
#       Number sign        compound nouns        U+23
(       Open parenthesis   beginning of
			   prepositional phrase  U+28
)       Close parenthesis  beginning of a verb
			   phrase                U+29
,       Comma              clause boundaries     U+2C
.       Period             period                U+2E
?       Question mark      question mark         U+3F
!       Exclamation point  exclamation point     U+21
+       Plus sign          new paragraph         U+2B
	Space              word boundary         U+20
_______________________________________________________


4.3

Phonemes Listed in Unicode Sequence -

Table 4-9 U.S. English Phonemes in Unicode Sequence
______________________________________________________________________
Uni-   Unicode Character     ASCKY  DT     DT        Example   Arpabet
code   Name                         index  internal
______________________________________________________________________      
U+20   Space                                         Word      <space>
						     boundary
U+21   Exclamation point
U+22   Quotation mark        "                       Hello
U+23   Number sign           #
U+27   Apostrophe            '                       rehd
U+28   Left parenthesis      (
U+29   Right parenthesis     )
U+2A   Asterisk              *
U+2B   Plus sign             +
U+2C   Comma                 ,
U+2D   Hyphen                -                       Syllable  -
						     break
U+2E   Full stop             .
U+2F   Solidus               /
U+3F   Question mark         ?
U+52   Latin capital         R      26     US_R      Rat       r
       letter R
U+5C   Reverse solidus       \
U+5F   Low line              _      0      US_SIL   (silence)  _
U+61,  Latin small letter    A      7      US_AY     bUY       ay
U+26A  A + Latin small
       capital I
U+61,  Latin small letter    W      8      US_AW     brOW      aw
U+28A  A + Latin small
       letter upsilon
U+62   Latin small letter B  b      46     US_B      Bad       b
	 
U+64,  Latin small letter    D      35     US_DZ     WiDth     dz
U+32F  D + combining inverted
       breve below      
U+64   Latin small letter D  d      48     US_D      Dad       d         
U+65   Latin small letter E  e      3      US_EY     bAY       ey         
U+66   Latin small letter F  f      37     US_F      Fat       f         
______________________________________________________________________


Table 4-9 U.S. English Phonemes in Unicode Sequence Continued... 
______________________________________________________________________
Uni-   Unicode Character     ASCKY  DT     DT        Example   Arpabet
code   Name                         index  internal
______________________________________________________________________
U+67   Latin small letter G  g      50     US_G      Game      g
U+68   Latin small letter H  h      28     US_HX     Had       hx
U+69,  Latin small letter    I      19     US_IR     pEEr      ir
U+2B4  I + modifier letter
       small turned R
U+69   Latin small letter I  i      1      US_IY     bEAn      iy
U+6A,  Latin small letter J  Y      16     US_YU     cUte      yu
U+75   + Latin small 
       letter U
U+6A   Latin small letter J  y      25     US_Y      Yank      yx
U+6B   Latin small letter K  k      49     US_K      Cad       k
U+6C,  Latin small letter L  L      34     US_EL     dangLe    el
U+329  + combining vertical 
       line below
U+6C   Latin small letter L  l      27     US_LL     Lad       l
U+6D   Latin small letter M  m      31     US_M      Mad       m
U+6E,  Latin small letter N  N      36     US_EN     burdeN    en
U+329  + combining vertical 
       line below
U+6E   Latin small letter N  n      32     US_N      Nat       n
U+6F,  Latin small letter O  o      11     US_OW     nO        ow
U+28A  + Latin small letter
       upsilon
U+70   Latin small letter P  p      45     US_P      Pat       p
U+73   Latin small letter S  s      41     US_S      Sap       s
U+74   Latin small letter T  t      47     US_T      Tack      t
U+74,  Latin small letter T  Q      52     US_TX     baTTen    tx
U+294  + Latin letter 
       glottal stop
U+75   Latin small letter U  u      14     US_UW     bOOn      uw
U+76   Latin small letter V  v      38     US_V      Vat       v
U+77   Latin small letter W  w      24     US_W      Why       w
U+7A   Latin small letter Z  z      42     US_Z      Zap       z
U+E6   Latin small           @      5      US_AE     pAt       ae
       letter AE 
U+F0   Latin small           D      40     US_DH     THen      dh
       letter Eth
______________________________________________________________________


Table 4-9 U.S. English Phonemes in Unicode Sequence Continued... 
______________________________________________________________________
Uni-   Unicode Character     ASCKY  DT     DT        Example   Arpabet
code   Name                         index  internal
______________________________________________________________________
U+14B  Latin small letter    G      33     US_NX     baNG      nx
       Eng
U+251, Latin small letter    a      21     US_AR     bARn      ar
U+2B4  Alpha + modifier 
       letter small turned R
U+251  Latin small letter    a      6      US_AA     pOt       aa
       Alpha
U+254, Latin small letter    O      12     US_OY     bOY       oy
U+26A  open O + Latin small 
       letter capital I
U+254, Latin small letter    c      22     US_OR     bOrn      or  
U+2B4  open O + modifier 
       letter small turned R
U+254  Latin small letter    c      10     US_AO     bOUght    ao
       open O
U+259  Latin small letter    x      17     US_AX     About     ax
       Schwa
U+25A  Latin small letter    R      15     US_RR     anothER   rr
       Schwa with hook
U+25B  Latin small letter    E      4      US_EH     pEt       eh
       open E
U+268  Latin small letter I  |      18     US_IX     kissEs    ix
       with stroke
U+26A  Latin small letter    I      2      US_IH     pIt       ih
       Capital I
U+26B  Latin small letter L  l      30     US_LX     untiL     lx
       with middle tilde
U+279  Latin small letter    R      29     US_RX     coRe      rx
       turned R with hook
U+283  Latin small letter    S      43     US_SH     SHeep     sh
       Esh
U+28A, Latin small letter    U      23     US_UR     pOOr      ur
U+2B4  Upsilon + modifier 
       letter small turned R
U+28A  Latin small letter    U      13     US_UH     pUt       uh
       Upsilon
U+28C  Latin small letter    ^      9      US_AH     pUtt      ah
       turned V
U+292  Latin small letter    Z      44     US_ZH     meaSure   zh
       Ezh
U+294  Latin letter glottal  q      53     US_Q      we eat    q
       stop
_____________________________________________________________________


Table 4-9 U.S. English Phonemes in Unicode Sequence Continued...
______________________________________________________________________
Uni-   Unicode Character     ASCKY  DT     DT        Example   Arpabet
code   Name                         index  internal
______________________________________________________________________
U+2A4  Latin small letter    J      55     US_JH     Jeep      jh
       Dezh digraph
U+2A7  Latin small letter    C      54     US_CH     CHeap     ch
       Tesh digraph
U+2C8  Modifier letter       '
       vertical line
U+2CC  Modifier letter low   `
       vertical line
U+3B8  Greek small letter    T      39     US_TH     THin      th
       Theta
_____________________________________________________________________


4.4

Pitch and Duration of Tones -

The DECtalk TTS engine can be used to sing songs or make various sounds 
associated with singing and musical tones. Table 4-11 provides the 
pitch numbers, associated notes, and frequencies you need to code a 
phonemic sequence to produce musical sounds.

Figure 4-1 is the code for the song, Happy Birthday. The command 
syntax for coding musical sequences is found in Table 4-10. You can use 
the phonemic table for your language (see Table 4-1 through Table 4-6) 
to decode the phoneme symbols.

Table 4-10 Phoneme Syntax for Singing
--------------------------------------------------------------------
  SYNTAX:       [ phoneme <duration, pitch number>]
  OPTIONS:      none
  PARAMETERS:   duration - Tone duration in milliseconds.
		pitch - Pitch number.
  
  DEFAULT:      none
  EXAMPLE:     See Figure 4-1


Figure 4-1 DECtalk TTS Engine Singing Happy Birthday
----------------------------------------------------------------------------
[:phoneme arpabet speak on]
[hxae<300,10>piy<300,10> brr<600,12>th<100>dey<600,10>
tuw<600,15> yu<1200,14>_<120>]
[hxae<300,10>piy<300,10> brr<600,12>th<100>dey<600,10>
tuw<600,17> yu<1200,15>_<120>]
[hxae<300,10>piy<300,10> brr<600,22>th<100>dey<600,19>
dih<600,15>r deh<600,14>ktao<600,12>k_<120>_<120>]
[hxae<300,20>piy<300,20> brr<600,19>th<100>dey<600,15>
tuw<600,17> yu<1200,15>]
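
The syntax in Table 4-10 is regular enough to generate from data. The 
following Python sketch builds one sung phrase from lists of 
(phoneme, duration, pitch) tuples. The function name sing and the tuple 
layout are illustrative assumptions, not part of DECtalk. Syllables of one 
word are written without spaces, words are separated by a space, and a 
pitch of None produces the duration-only form seen in th<100> above.

```python
def sing(words):
    """Build one bracketed phoneme string for a sung phrase.

    words is a list of words; each word is a list of
    (phoneme, duration_ms, pitch_number) tuples. A pitch_number of None
    emits the duration-only form, for example th<100>.
    """
    rendered_words = []
    for word in words:
        syllables = []
        for phoneme, duration_ms, pitch_number in word:
            if pitch_number is None:
                syllables.append(f"{phoneme}<{duration_ms}>")
            else:
                syllables.append(f"{phoneme}<{duration_ms},{pitch_number}>")
        # Syllables of the same word are concatenated without spaces.
        rendered_words.append("".join(syllables))
    return "[" + " ".join(rendered_words) + "]"


first_line = sing([
    [("hxae", 300, 10), ("piy", 300, 10)],
    [("brr", 600, 12), ("th", 100, None), ("dey", 600, 10)],
    [("tuw", 600, 15)],
    [("yu", 1200, 14), ("_", 120, None)],
])
print(first_line)
```

The string produced this way matches the first phrase of Figure 4-1, 
apart from the line break. It still has to be preceded by 
[:phoneme arpabet speak on] when it is sent to the synthesizer.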


Table 4-11 Tone Table
__________________________________________________
Pitch Number    Note     Pitch (Hz)  Vocal Ranges
__________________________________________________
1               C2       65
2               C#       69
3               D        73
4               D#       77
5               E        82        B
6               F        87        A
7               F#       92        S
8               G        98        S  B
9               G#       103          A
10              A        110          R
11              A#       116          I
12              B        123          T
13              C3       130          O  T
14              C#       138          N  E
15              D        146          E  N
16              D#       155             O
17              E        164             R
18              F        174                A
19              F#       185                L
20              G        196                T
21              G#       207                O
22              A        220
23              A#       233
24              B        247                   S
25              C4       261                   O
26              C#       277                   P
27              D        293                   R
28              D#       311                   A
29              E        329                   N
30              F        348                   O
31              F#       370
32              G        392
33              G#       415
34              A        440
35              A#       466
36              B        494
37              C5       523
__________________________________________________
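
The entries in the Tone Table appear to follow twelve-tone equal 
temperament, one semitone per pitch number, with pitch number 34 at 
440 Hz. Assuming that reading is correct, the frequency for any pitch 
number can be estimated as in the Python sketch below. The function name 
pitch_to_hz is an illustrative assumption. The table lists whole numbers 
of Hz, so a computed value can differ from the printed one by about 1 Hz.

```python
def pitch_to_hz(pitch_number):
    """Estimate the frequency in Hz for a Tone Table pitch number.

    Assumes equal temperament with pitch number 34 (A) at 440 Hz and one
    semitone per step, which is an inference from the table values.
    """
    if not 1 <= pitch_number <= 37:
        raise ValueError("the Tone Table defines pitch numbers 1 through 37")
    return 440.0 * 2.0 ** ((pitch_number - 34) / 12)


for number in (1, 10, 25, 34, 37):
    print(number, round(pitch_to_hz(number), 2))
```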


4.5

Homographs -

Homographs are two or more words that have the same spelling but are 
pronounced differently. Homographs often differ in which syllable is 
accented. For example, if permit is a noun, the accent is on the first 
syllable (PERmit); if, however, the word is used as a verb, the accent 
is on the second syllable (perMIT). This distinction often makes a 
great deal of difference in understanding DECtalk when it is speaking 
such words in connected discourse.

The default pronunciation is the more frequent form. In the event the 
alternate pronunciation is needed, you can insert the correct phonetics 
from the homograph index below.

Use the [:pronounce alternate] command before a word to obtain an 
alternative pronunciation for the word. For example, the primary 
pronunciation of the word bass is beys, as in bass guitar, while the 
alternate pronunciation, denoted by [:pronounce alternate], is baes, 
as in the fish, bass.
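
When text is prepared by a program, the command can be inserted 
automatically in front of the words that need it. The Python sketch below 
is one way to do that. The function name mark_alternate is an 
illustrative assumption, and the caller is assumed to know which 
occurrences need the alternate form; the sketch marks every occurrence of 
each listed word.

```python
import re


def mark_alternate(text, words):
    """Insert [:pronounce alternate] before each listed word in text."""
    if not words:
        return text
    # Match whole words only, so "bass" does not match inside "bassoon".
    pattern = r"\b(" + "|".join(re.escape(word) for word in words) + r")\b"
    return re.sub(pattern, r"[:pronounce alternate] \1", text,
                  flags=re.IGNORECASE)


print(mark_alternate("He caught a bass.", ["bass"]))
```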

This section lists the homograph phonetics in alphabetical groups, as 
follows:

Table 4-12 Homograph Phonetics - (A)
____________________________________________________________
Spelling      Primary               Alternate
____________________________________________________________
abstract      ' aeb s t r aek t     ae b s t r ' aek t
abuse         axb y ' u z           axb y ' u s
addict        ax d ' ihk t          ' ae d ihk t
advocate      ' aed v axk eyt       ' aed v ax k axt
affix         ' aef ihk s           axf ' ihk s
ally          ' ael ay              axl ' ay
alternate     ' aol t rrn ax t      ' ao l t rrn ey t
animate       ' aen ihm eyt         ' aen ih m ax t
annex         ' aen ehk s           axn ' ehk s
appropriate   axp r ' owp r iyaxt   axp r ' owp r iy eyt
arithmetic    axr ' ihthm axt ixk   aer ixthm ' eht ixk
articulate    aar t ' ihk yxel eyt  aar t ' ih k yxel axt
associate     axs ' owshiyeyt       axs ' owshiyaxt
attribute     axt r ' ihbyut        ' aet r ixbyut
august        ' aog axs t           aog ' ahs t
____________________________________________________________


Table 4-13 Homograph Phonetics - (B-C)
____________________________________________________________
Spelling      Primary               Alternate
____________________________________________________________
bass          b ' eys               b ' aes
baton         b axt ' aon           b ' aet ax n
close         k l ' owz             k l ' ows
combat        k axm b ' aet         k ' aam b ae t
combine       k axm b ' ayn         k ' aam b ayn
compact       k axm p ' aek t       k ' aam pae k t
complex       k ' aam p l ehk s     k axm p l ' ehk s
compound      k ' aam paw n d       k axm p ' aw n d
compress      k ax m p r ' ehs      k ' aam p r ehs
concert       k ' aan s rrt         k axn s ' rrt
conduct       k axn d ' ahk t       k ' aa n d ahk t
confederate   k axn f ' ehd rrixt   k axn f ' ehd rriht
	      rreyt
confine       k axn f ' ayn         k ' aan f ayn
conflict      k ' aan f l ixk t     k axn f l ' eyk t
conglomerate  k axnxg l ' aam rixt  k axnxg l ' aam rreyt
console       k ' aan s owl         k axn s ' owl
construct     k axn s t r ' ahk t   k ' aan s t r axk t
content       k ' aan t ehn t       k axn t ' ehn t
contest       k ' aan t ehs t       k axn t ' ehs t
contract      k ' aan t rae k t     k axn t r ' aek t
contrast      k ' aan t r aes t     k axn t r ' aes t
converse      k ' aan v rrs         k axn v ' rrs
convert       k axn v ' rrt         k ' aan v rrt
convict       kax n v ' ihk t       k ' aan vih k t
coordinate    k ow aor d en eyt     kow aor d ixn axt
____________________________________________________________


Table 4-14 Homograph Phonetics - (D-G)
____________________________________________________________
Spelling      Primary                 Alternate
____________________________________________________________
decrease      d iyk r ' iys           d ' iyk r iys
defect        d ax f ' ehk t          d ' iyf ehk t
delegate      d ' ehl ixg axt         d ' ehl ixg  ey t
deliberate    d axl ' ihb rraxt       d axl ' ihb rreyt
desert        d ' ehz rrt             d ixz ' rrt
desolate      d ' ehs el ixt          d ' eh sel eyt
diffuse       dix f ' yuw s           d ix f ' yuw z
digest        d ' ayjhehs t           d ayjh ehs t
discharge     d ixs ch arjh           d ' ihs charjh
discount      d ' ihs kaw n t         d ihs k ' awn t
dove          d ' owv                 d ' ahv
duplicate     d ' uwp l ixk eyt       d ' uwp lixk axt
elaborate     axl ' aeb rraxt         axl ' aeb rreyt
estimate      ' ehs tix m eyt         ' ehs tix m axt
excerpt       ehksrrpt                ehksrrpt
excuse        ixk s k ' yuz           eh k s k yus
expatriate    ehk s p ' eyt riy axt   ehk s p ' ey t riieyt
exploit       ixk s p l ' oyt         ' ehk s p loy t
export        ehk s p ' ort           'ehk s por t
extract       ehk s t r ' aek t       'eh k s t raek t
ferment       frr m ' ehn t           f ' rrm eh n t
frequent      f r ' iyk wix n t       f riy k w ' eyn t
geminate      jh ' ehm ixn axt        jh ' ehm ixn eyt
graduate      g r ' aejhuweyt         g r ' aejhuwaxt
____________________________________________________________


Table 4-15 Homograph Phonetics - (I-L)
____________________________________________________________
Spelling      Primary                 Alternate
____________________________________________________________
impact        ' ihm paek t            ixm p ' aek t
implant       ihm p l ' aen t         ' ihm p l aen t
import        ' ihm p ort             ihm p ' ort
imprint       ' ihm p r ihnt          ihm p r ' ihn t
incense       ixn s ' ehn s           ' ihn s ehn s
incline       ixn k l ' ayn           ' ihn k l ayn
increase      ihn k r ' iys           ' ihn k r iys
insert        ihn s ' rrt             ' ihn s rrt
insult        ihn s ' ahl t           ' ihn s axl t
interchange   ' ihn t rr ch eyn jh    ihn t rr ch ' eyn jh
intimate      ' ihn t axm axt         ' ihn t axm eyt
invalid       ixn v ' ael ixd         ' ihn v axl ixd
just          jh ixs t                jh ' ahs t
lead          l ' iyd                 l ' ehd
live          l ' ihv                 l ' ayv
____________________________________________________________


Table 4-16 Homograph Phonetics - (M-P)
____________________________________________________________
Spelling      Primary                 Alternate
____________________________________________________________
minute        m ' ih nix t            may n ' uwt
miscount      m ' ihs kaw n t         mih s k ' awn t
misprint      m ' ihs p r in t        mih s pr ' int
misuse        mix s ' yuz             mix s ' yus
moderate      m ' aad rraxt           m ' aad rreyt
object        ' aa b jheht            ax b jh ' ehkt
overrun       ow v rr rahn            ow v r rrahn
perfect       p ' rr f ixk t          prrf ' ehk t
permit        prr m ' iht             p ' rr miht
pervert       p rrv ' rrt             p ' rrv rrt
polish        p ' aal ihsh            p ' owl ixsh
postulate     p ' aas cheleyt         p ' aas chelaxt
predicate     p r ' ehd ixk eyt       p r ' ehd ixk axt
predominate   p r ixd ' aam ixn eyt   p r ixd ' aam ixn axt
present       p riy z ' ehn t         p r ' ehz axn t
proceed       p r axs ' iyd           p r ' ows iyd
produce       p r axd ' uws           p r ' aad uws
progress      p r ' aag r ehs         p rax g r ' eh s
project       p r ' aajh ehk t        p r axjh ' ehk t
protest       p r ' owt ehs t         p r owt ' ehs t
____________________________________________________________


Table 4-17 Homograph Phonetics - (R)
____________________________________________________________
Spelling      Primary                 Alternate
____________________________________________________________
read          r ' iyd                 r ' ehd
reading       r ' iyd ixnx            r ' ehd ixnx
rebel         r ' ehb el              rix b ' ehl
recall        rix k ' aol             r ' iyk aol
recap         riy k ' aep             r ' iyk aep
recess        r ' iys ehs             r iys ' ehs
record        r ' ehk rrd             r ixk ' ord
recount       r iyk ' awn t           r ' iyk awn t
refill        r ' iyf ihl             r iyf ' ihl
refresh       r iyf r ' ehsh          r ' iyf r ehsh
refund        r iyf ' ahn d           r ' iyf ahn d
refuse        r ixf ' yuz             r ' ehf yus
reject        rixjhehkt               riyjhehkt
relapse       r ' iyl aep s           r ixl ' aep s
relay         r ' iyl ey              r ixl ' ey
remake        r ' iym eyk             r iym ' eyk
rerun         r ' iy * rahn           r iy * r ' ahn
research      r ' iys rrch            r iys ' rrch
resume        r iy | z ' uwm          r ' ehz axm ey
retake        r iyt ' eyk             r ' iyt eyk
rewrite       r iy r ' ayt            r ' iy * r ayt
____________________________________________________________


Table 4-18 Homograph Phonetics - (S-W)
____________________________________________________________
Spelling      Primary                 Alternate
____________________________________________________________
segment       s ' ehg m ixn t         s ehg m ' ehn t
separate      s ' ehp axr eyt         s ' ehp axr axt
sow           s ow                    s aw
subject       s ' ahb jhehk t         s axb jh ' ehk t
sublet        s axb l ' eht           s axb l ' eht
subordinate   s axb ' ord enaxt       s axb ' ord eneyt
survey        s ' rr vey              s rr v ' ey
suspect       s ' ahs peh k t         s ax s p ' eh k t
syndicate     s ' ihn dix kix t       s ' ihn dix key t
tear          t ' er                  t ' ir
torment       t orm ' ehn t           t ' orm ehn t
transform     t r aen s f ' orm       t r ' aen s f orm
transplant    t r aen s p l ' aen t   t r ' aen s p l aen t
transport     t r aen s p ' ort       t r ' aen s p ort
upset         axp s ' eht             ' ah p seh t
use           y ' uwz                 y ' uws
wind          w ' ihn d               w ' ayn d
wound         w ' awn d               w ' uwn d
___________________________________________________________



		   Chapter 5  -  Customizing a DECtalk Voice

The built-in voices of the DECtalk USB provide an adequate selection 
for most applications. However, if you have a special application requiring 
a monotone or unusual voice, you can use the Design Voice command and 
the options described in this chapter to design your own voice. 
For information on all other DECtalk commands, refer to the previous chapters.

Topics Include:
  Design Voice Command [:dv]
  Definitions of DECtalk Voices
  Changing Gender and Head Size
  Changing Voice Quality
  Changing Pitch and Intonation
  Changing Relative Gains and Avoiding Overloads
  Saving Changes as Val's Voice
  Summary of Design Voice Options


5.1

Design Voice [:dv]

The nine built-in voices of the DECtalk USB are distinguished from one 
another by a large set of speaker-definition options. Note that there 
is a tenth voice, called Val. Val is initialized with the same voice as 
Paul, but can be used to save voice changes. Unlike the nine built-in 
voices that can be modified but not saved, Val can be used to store 
voice changes during normal operation. Keep in mind that these
modifications are lost when a system reset occurs.

The DECtalk USB's internal text-to-speech engine supports many 
speaker-definition options that can be modified. However, please be aware 
that approximating all the variations that can characterize a speaker 
(sex, age, head size and shape, larynx size and behavior, pitch range, 
pitch and timing habits, dialect, and emotional state) can be very 
time-consuming.

The Design Voice [:dv] command introduces the speaker-definition options 
and parameters that can be entered as a string or one at a time.

The following sections discuss speech production, acoustics, and 
perception. Some of the information is relatively technical, but the 
examples should make it possible for all developers to modify any 
option effectively and listen to the results.

Table 5-1 [:dv] Command Options
SYNTAX:      [:dv XX YY]
OPTIONS:     ap          Average pitch, in Hz
	     as          Assertiveness, in %
	     b4          Fourth formant bandwidth, in Hz
	     b5          Fifth formant bandwidth, in Hz
	     bf          Baseline fall, in Hz
	     br          Breathiness, in decibels (dB)
	     f4          Fourth formant resonance frequency, in Hz
	     f5          Fifth formant resonance frequency, in Hz
	     g1          Gain of cascade formant resonator 1, in dB
	     g2          Gain of cascade formant resonator 2, in dB
	     g3          Gain of cascade formant resonator 3, in dB
	     g4          Gain of cascade formant resonator 4, in dB
	     g5          Loudness of the voice, in dB
	     gf          Gain of frication source, in dB
	     gh          Gain of aspiration source, in dB
	     gn          Gain of nasalization, in dB
	     gv          Gain of voicing source, in dB
	     hr          Hat rise, in Hz
	     hs          Head size, in %
	     la          Laryngealization, in %
	     lx          Lax breathiness, in %
	     nf          Number of fixed samples of open glottis
	     pr          Pitch range, in %
	     qu          Quickness, in %
	     ri          Richness, in %
	     sm          Smoothness, in %
	     sr          Stress rise, in Hz
	     sx          Sex 1 (male) or 0 (female)
	     save        Save the current speaker-definition options as
			 Val's voice.
PARAMETERS:  See the individual options for detailed information about
	     valid parameter values
EXAMPLES:    [:np][:dv ap 100] Change Paul's average pitch to be 100.
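
Because options can be entered as a string or one at a time, an 
application may find it convenient to assemble the command from a table 
of settings. The Python sketch below does this and rejects option names 
that are not listed in Table 5-1. The names DV_OPTIONS and dv_command are 
illustrative assumptions, and the sketch does not check value ranges or 
handle the save keyword.

```python
# Option names taken from Table 5-1 (the "save" keyword is left out).
DV_OPTIONS = {
    "ap", "as", "b4", "b5", "bf", "br", "f4", "f5", "g1", "g2", "g3",
    "g4", "g5", "gf", "gh", "gn", "gv", "hr", "hs", "la", "lx", "nf",
    "pr", "qu", "ri", "sm", "sr", "sx",
}


def dv_command(settings):
    """Return a [:dv ...] string for a dict of option names and values."""
    for name in settings:
        if name not in DV_OPTIONS:
            raise ValueError(f"unknown Design Voice option: {name}")
    body = " ".join(f"{name} {value}" for name, value in settings.items())
    return f"[:dv {body}]"


print("[:np]" + dv_command({"ap": 100}))
```

A dict is used instead of keyword arguments because the option name as 
is a reserved word in Python.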


5.2

Definitions of DECtalk Voices -

Table 5-2 Speaker Definitions for All DECtalk Software Voices
___________________________________________________________________________
Param  Paul  Harry  Frank  Dennis  Betty  Ursula  Wendy  Rita  Kit
___________________________________________________________________
ap     122   89     155    110     208    240     200    106   306
as     100   100    65     100     35     100     50     65    65
b4     260   200    280    240     260    260     400    250   2048
b5     330   240    300    280     2048   2048    2048   2048  2048
bf     18    9      9      9       0      8       0      0     0
br     0     0      50     38      0      0       55     46    47
f4     3300  3300   3650   3200    4450   4450    4500   4000  2500
f5     3650  3850   4200   3600    2500   2500    2500   2500  2500
g1     68    71     63     75      69     67      69     69    69
g2     60    60     58     60      65     65      62     72    69
g3     48    52     56     52      50     51      53     48    52
g4     64    62     66     61      56     58      55     54    50
g5     86    81     86     84      81     80      83     83    73
gf     70    70     68     68      72     70      70     72    72
gh     70    70     68     68      70     70      68     70    70
gn     74    73     75     76      72     80      75     73    71
gv     65    65     63     63      65     65      51     65    65
hr     18    20     20     20      14     20      20     20    20
hs     100   115    90     105     100    95      100    95    80
la     0     0      5      0       0      0       0      4     0
lx     0     0      50     70      80     50      80     0     75
nf     0     10     0      10      0      10      10     0     0
pr     100   80     90     135     240    135     175    80    210
qu     40    10     0      50      55     30      10     30    50
ri     70    86     40     0       40     100     0      20    40
sm     3     12     46     100     4      60      100    24    5
sr     32    30     22     22      20     32      22     32    22
sx     1     1      1      1       0      0       0      0     0
__________________________________________________________________
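
One use of Table 5-2 is to see exactly which options separate two voices. 
The Python sketch below copies the Paul and Harry columns and builds a 
[:dv] command containing only the options that differ. The names PAUL, 
HARRY, and dv_difference are illustrative assumptions, and the sketch 
does not claim that applying the command to Paul reproduces Harry 
exactly.

```python
# Values copied from the Paul and Harry columns of Table 5-2.
PAUL = {"ap": 122, "as": 100, "b4": 260, "b5": 330, "bf": 18, "br": 0,
        "f4": 3300, "f5": 3650, "g1": 68, "g2": 60, "g3": 48, "g4": 64,
        "g5": 86, "gf": 70, "gh": 70, "gn": 74, "gv": 65, "hr": 18,
        "hs": 100, "la": 0, "lx": 0, "nf": 0, "pr": 100, "qu": 40,
        "ri": 70, "sm": 3, "sr": 32, "sx": 1}
HARRY = {"ap": 89, "as": 100, "b4": 200, "b5": 240, "bf": 9, "br": 0,
         "f4": 3300, "f5": 3850, "g1": 71, "g2": 60, "g3": 52, "g4": 62,
         "g5": 81, "gf": 70, "gh": 70, "gn": 73, "gv": 65, "hr": 20,
         "hs": 115, "la": 0, "lx": 0, "nf": 10, "pr": 80, "qu": 10,
         "ri": 86, "sm": 12, "sr": 30, "sx": 1}


def dv_difference(base, target):
    """Return a [:dv ...] string with only the options that differ."""
    changed = [f"{name} {target[name]}"
               for name in base if target[name] != base[name]]
    if not changed:
        return ""
    return "[:dv " + " ".join(changed) + "]"


print(dv_difference(PAUL, HARRY))
```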


5.3

Changing Gender and Head Size -

Six speaker-definition options control the size and shape of the 
head. These options are listed in Table 5-3.

Table 5-3 Head Size and Shape Options
    sx     Sex 1 (male) or 0 (female)
    hs     Head size, in %
    f4     Fourth formant resonance frequency, in Hz
    f5     Fifth formant resonance frequency, in Hz
    b4     Fourth formant bandwidth, in Hz
    b5     Fifth formant bandwidth, in Hz


5.4

Sex, sx -

Male and female voices differ in many ways, including head size, 
pharynx length, larynx mass, and speaking habits such as degree 
of breathiness, liveliness of pitch, choice of articulatory target 
values, and speed of articulation. Some of these differences are under 
the control of a single option, sx, the sex of the speaker. Speakers 
Paul, Harry, Frank, and Dennis are male (sx = 1), while speakers Betty,
Rita, Ursula, Wendy, and Kit are female (sx = 0). Actually, Kit can be 
male or female because children of both sexes younger than 10 years old 
have similar voices.

Changing the Sex (sx) option causes the DECtalk TTS engine to access a 
different (male or female) table of target values for formant 
frequencies, bandwidths, and source amplitudes. The male and female 
tables are patterned after two individuals who were judged to have 
pleasant, intelligible voices. The built-in voices of DECtalk Software
are simply scaled transformations of Paul and Betty, the two basic 
voices.

You can change the sex of any DECtalk voice by making the 
voice current and then modifying the sx option. For example, the 
following command gives Paul some of the speaking characteristics of a 
woman. (The sx option does not change the average pitch or breathiness, 
so a peculiar combination of simultaneous male and female traits 
results from this sx change.)

    [:np][:dv sx 0] Am I a man or woman?

The sx option can also be specified as m or f with the commands 
[:dv sx m] or [:dv sx f].

Note:
     If you change the sex of the voice, some phonemes might cause the 
     DECtalk speech engine software filters to overload, producing a 
     squawk. The modification of certain options such as f4, f5, and g1 
     can help to correct this problem.


5.5

Head Size, hs -

The Head size (hs) option is specified as a percentage of the average 
size for an adult man (if sx = 1) or an adult woman (if sx = 0). A head 
size of 100% is normal or average for a given sex, but people can differ 
significantly in this characteristic. Head size has a strong influence 
on a person's voice. Large musical instruments produce low notes, and 
humans with large heads tend to have low, resonant voices. For example, 
to make Paul sound like a larger man with a 15% longer vocal tract (and 
formant frequencies that are scaled down by a factor of about 0.87, or 
100/115), use the following command:

    [:np][:dv hs 115] Do I sound more like huge Harry this way?

Head size is one of the best variables to use if you want to make 
dramatic voice changes. For example, Paul has a head size of 100, while 
Harry's deep voice is caused in part by a head-size change to 115, or 
15% greater than normal. Decreasing head size produces a higher voice, 
such as in a child or adolescent. Extreme changes in head size, as in 
the following examples, produce speech that is somewhat difficult to 
understand.

    [:nh][:dv hs 135] Do I have a swelled head?
    [:nk] I am about 10 years old.
    [:nk][:dv hs 65] Do I sound like a six year old?

Note:
     Extreme changes in head size can cause overloads, as well as 
     difficulties in understanding the speech. The modification of certain 
     options such as f4, f5, and g1 can help to correct this problem.


5.6

Higher Formants, f4, f5, b4, and b5 -

A male voice typically has five prominent resonant peaks in the 
spectrum (over the range from 0 to 5 kHz), a female voice typically has 
only four (because of a smaller head size), and a child has three. If 
fourth and fifth formant resonances exist for a specific voice, they are 
fixed in frequency and bandwidth characteristics. These characteristics 
are specified in Hz by the options f4, f5, b4, and b5.

If a higher formant does not exist, the frequency and bandwidth of the 
speaker definition are set to special values that cause the resonance 
to disappear. To make a resonance disappear, the frequency is set to 
above 5500 Hz and the bandwidth is set to 5500 Hz. (This disables the 
formant filter.) This is what has been done to the fourth and fifth 
formants for Kit.

The permitted values for the f4 and f5 options have fairly complicated 
restrictions. Violating these restrictions can cause overloads and 
squawks. The following restrictions apply to cases where a higher 
formant exists:

  The f5 option must be at least 300 Hz higher than f4.
  If sx is 1 (male), f4 must be at least 3250 Hz.
  If sx is 0 (female), f4 must be at least 3700 Hz.
  If hs is not 100, the preceding values should be multiplied by 
   (hs / 100).
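
These restrictions can be checked before a command is sent. The Python 
sketch below applies the four rules literally as written above, including 
the (hs / 100) scaling of the 300, 3250, and 3700 Hz values. The function 
name check_higher_formants is an illustrative assumption. Treat the 
result as advisory: under this literal reading, Harry's values in 
Table 5-2 (hs 115, f4 3300) fall below the scaled minimum.

```python
def check_higher_formants(sx, hs, f4, f5):
    """Return a list of restriction violations (empty if none).

    sx is 1 (male) or 0 (female), hs is the head size in percent,
    and f4 and f5 are the formant frequencies in Hz.
    """
    scale = hs / 100.0
    min_f4 = (3250 if sx == 1 else 3700) * scale
    min_gap = 300 * scale
    problems = []
    if f4 < min_f4:
        problems.append(f"f4 {f4} is below the minimum of {min_f4:.0f} Hz")
    if f5 - f4 < min_gap:
        problems.append(f"f5 must be at least {min_gap:.0f} Hz above f4")
    return problems


print(check_higher_formants(sx=1, hs=100, f4=3300, f5=3650))
print(check_higher_formants(sx=1, hs=100, f4=3000, f5=3100))
```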

These higher formants produce peaks in the spectrum that become more 
prominent if the b4 and b5 options are smaller, and if the f4 and f5 
options are closer together. The limits placed on the b4 and b5 options 
should ensure that no problems occur. However, smaller values for 
bandwidths may produce an overload in the synthesizer. You can correct 
these overloads by increasing the bandwidths or by changing the gain 
control, g1.


5.7

Changing Voice Quality -

Six speaker-definition options control aspects of the output of the 
larynx, which, in turn, control voice quality. These options are 
listed in Table 5-4.

Table 5-4 Voice Quality Options
------------------------------------------------    
    br     Breathiness, in decibels (dB)
    lx     Lax breathiness, in %
    sm     Smoothness, in %
    ri     Richness, in %
    nf     Number of fixed samples of open glottis
    la     Laryngealization, in %


5.8

Breathiness, br -

Some voices can be characterized as breathy. The vocal folds vibrate 
to generate voicing and breath noise simultaneously. Breathiness is a 
characteristic of many female voices, but it is also common under 
certain circumstances for male voices.

The range of the Breathiness (br) option is from 0 dB (no breathiness) 
to 70 dB (strong breathiness). By experimenting, you can learn what 
intermediate values sound like. For example, to turn Paul into a 
breathy, whispering speaker, use the following commands:

    [:np][:dv br 55 gv 56] Do I sound more like Dennis now? 

This voice is not as loud as the others, because of the simultaneous 
decrease in the gain of voicing, gv, but it is intelligible and human 
sounding.


5.9

Lax Breathiness, lx -

The br option creates simultaneous breathiness whenever voicing is 
turned on. Another type of breathiness occurs only at the ends of 
sentences and when going from voiced to voiceless sounds. This type of 
breathiness is controlled by the Lax breathiness (lx) option in 
percentage values.

A nonbreathy, tense voice would have the lx option set to 0, while a 
maximally breathy, lax voice would be set to 100. The difference between 
these two voices is not great, but you can hear it if you listen 
closely.


5.10

Smoothness, sm -

The Smoothness (sm) option refers to vocal fold vibrations. The vocal 
folds meet at the midline, as they do in normal voicing, but they do 
not slam together forcefully to create a very sudden cessation of 
airflow.

The software speech engine of the DECtalk USB uses a variable-cutoff, 
gradual low-pass filter to model changes to smoothness. The range of sm 
is from 0% (least smooth and most brilliant) to 100% (most smooth and 
least brilliant). The voicing source spectrum is tilted so that energy 
at higher frequencies is attenuated by as much as 30 dB when smoothness 
is set to the maximum but is not attenuated at all when smoothness is 
set to 0.

Professional singing voices that are trained to sing above an orchestra 
are usually brilliant, while anyone who talks softly becomes breathy 
and smooth. To synthesize a breathy voice, having the sm option set to 
50 or more is good. Changes to smoothness do not have a great effect on 
perceived voice quality.


5.11

Richness, ri -

The Richness (ri) option is similar to smoothness and brilliance except 
that the spectral change occurs at lower frequencies. The spectral 
change difference is because of a different physiological mechanism. 
Brilliant, rich voices carry well and are more intelligible in noisy 
environments, while smooth, soft voices sound more friendly. For 
example, the following command produces a soft, smooth version of
Paul's voice:

    [:np][:dv ri 0 sm 70] Do I sound more mellow? 

The following command produces a maximally rich and brilliant 
(forceful) voice:

    [:np][:dv ri 90 sm 0] Do I sound more forceful?

Smoothness and richness are usually negatively correlated when a 
speaker dynamically changes laryngeal output. The sm and ri options do 
not influence the speaker's identity very much.


5.12

Nopen Fixed, nf -

The number of samples in the open part of the glottal cycle is 
determined not only by the ri option, but also by a second option, nf. 
The Nopen Fixed (nf) option is the number of fixed samples in the open 
portion of the glottal cycle.

Most speakers adjust the open phase to be a certain fraction of the 
period, and this fraction is determined by the ri option. Other 
speakers keep the open phase fixed in duration when the overall period 
varies. To simulate this behavior, set the ri option to 100 and adjust 
the nf option to the desired duration of the open phase. The shortest 
possible open phase is 10 (1 ms), and the longest is three quarters of 
the period duration (about 70 for a male voice).
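
As a rough aid for choosing nf, the limits above can be computed from 
the pitch period. This Python sketch is an illustration only. The 
function name is hypothetical, and the rate of 10 samples per 
millisecond is inferred from the statement that an open phase of 10 
lasts 1 ms.

```python
SAMPLES_PER_MS = 10  # inferred from "10 (1 ms)"; an assumption


def nf_limits(f0_hz):
    """Return (shortest, longest) open phase in samples for a given f0."""
    period_samples = SAMPLES_PER_MS * 1000 / f0_hz
    longest = int(period_samples * 0.75)  # three quarters of the period
    return 10, longest


if __name__ == "__main__":
    print(nf_limits(100))  # 100-sample period, so the longest nf is 75
```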


5.13

Laryngealization, la -

Many speakers turn voicing on and off irregularly at the beginnings 
and ends of sentences, which gives a querulous tone to the voice. This 
departure from perfect periodicity is called laryngealization or creaky 
voice quality.

The Laryngealization (la) option controls the amount of laryngealization 
in the voice. A value of 0 results in no laryngealized irregularity, and 
a value of 100 (the maximum) produces laryngealization at all times. For 
example, to make Betty moderately laryngealized, type the following 
command:

    [:nb][:dv la 20]

The la option creates a noticeable difference in the voice, although it 
is not altogether a pleasant change.


5.14

Changing Pitch and Intonation -

Seven speaker-definition options control aspects of the fundamental 
frequency (f0) contour of the voice. These options are listed in 
Table 5-5.

Table 5-5 Fundamental Frequency Contour Options
--------------------------
bf Baseline fall, in Hz
hr Hat rise, in Hz
sr Stress rise, in Hz
as Assertiveness, in %
qu Quickness, in %
ap Average pitch, in Hz
pr Pitch range, in %


5.15

Baseline Fall, bf -

The Baseline fall (bf) option, in Hz, determines one aspect of the 
dynamic fundamental frequency contour for a sentence. If the bf option 
is 0, the reference baseline fundamental frequency of a sentence begins 
and ends at 115 Hz. All rule-governed dynamic swings in f0 are computed 
with respect to the reference baseline.

Some speakers begin a sentence at a higher f0 and gradually fall as the 
sentence progresses. This falling baseline behavior can be simulated by 
setting the bf option to the desired fall in Hz. For example, setting 
the bf option to 20 Hz causes the f0 pattern for a sentence to begin at 
125 Hz (115 Hz plus half of bf) and to fall at a rate of 16 Hz per 
second until it reaches 105 Hz (115 Hz minus half of bf). The baseline
remains at this lower value until it is reset automatically before the 
beginning of the next full sentence (right after a period, question 
mark, or exclamation point). The rate of fall (16 Hz per second) is 
fixed, regardless of the extent of the fall.
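
The falling baseline described above can be sketched numerically. This 
Python fragment is an illustration only, not the DECtalk 
implementation. The function and parameter names are hypothetical, and 
the 115 Hz reference and 16 Hz per second fall rate are taken from the 
text.

```python
def baseline_f0(t_seconds, bf_hz, reference_hz=115.0, fall_rate=16.0):
    """Baseline f0 in Hz at t_seconds after the start of a sentence."""
    start = reference_hz + bf_hz / 2   # sentence begins half of bf above
    floor = reference_hz - bf_hz / 2   # and stops half of bf below
    return max(floor, start - fall_rate * t_seconds)


if __name__ == "__main__":
    # With bf set to 20 Hz the baseline starts at 125 Hz, reaches 105 Hz
    # after 1.25 seconds, and then stays there.
    for t in (0.0, 0.5, 1.25, 3.0):
        print(t, baseline_f0(t, 20))
```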

Whenever you include a [ + ] syntactic symbol in the text to indicate 
the beginning of a paragraph, the baseline is automatically set to 
begin slightly higher for the first sentence of the paragraph. While 
baseline fall differs among speakers, it is not a good cue for 
differentiating among them. As long as the fall is not excessive, its
presence or absence is hardly noticeable. See Chapter 4 for a complete 
list of symbols.


5.16

Hat Rise, hr -

The Hat rise (hr) option (nominal hat rises in Hz) and sr option 
(nominal stress impulse rises in Hz) determine aspects of the dynamic 
fundamental frequency contour for a sentence. To modify these values 
selectively, you should understand how the f0 contour is computed as a 
function of lexical stress pattern and syntactic structure of the 
sentence.

A sentence is first analyzed and broken into clauses with punctuation 
and clause-introducing words to determine the locations of clause 
boundaries. Within each clause, the f0 contour rises on the first 
stressed syllable, stays at a high level for the remainder of the 
clause up to the last stressed syllable, and falls dramatically on the
last stressed syllable. This rise-at-the-beginning and fall-at-the-end 
pattern has been called the hat pattern by linguists, using the analogy 
of jumping from the brim of a hat to the top of the hat and back down 
again.

The hr option indicates the nominal height, in Hz, of a pitch rise to a 
plateau on the first stress of a phrase. A corresponding pitch fall is 
placed by rule on the last stress of the phrase. Some speakers use 
relatively large hat rises and falls, while others use a local impulse-
like rise and fall on each stressed syllable. The default hr option
value for Paul is 18 Hz, indicating that the f0 contour rises a nominal 
18 Hz when going from the brim to the top of the hat. To simulate a 
speaker who does not use hat rises and falls, use the command:

    [:dv hr 0] 

Other aspects of the hat pattern are important for natural intonation 
but are not accessible by speaker-definition commands. For example, the 
hat fall becomes a weaker fall followed by a slight continuation rise 
if the clause is to be succeeded by more clauses in the same sentence. 
Also, if unstressed syllables follow the last stressed syllable in a 
clause, part of the hat fall occurs on the very last (unstressed)
syllable of the clause. If the clause is long, DECtalk Software may 
break it into two hat patterns by finding the boundary between the noun 
phrase and the verb phrase.

If the DECtalk TTS engine is in phoneme input mode and you use the pitch rise 
[ / ] and pitch fall [ \ ] symbols, the hr option determines the actual 
rise and fall in Hz. See Chapter 4 for a complete list of symbols.


5.17

Stress Rise, sr -

The Stress rise (sr) option indicates the nominal height, in Hz, of a 
local pitch rise and fall on each stressed syllable. This rise-fall is 
added to any hat rise or fall that is also present. For example, Paul 
has the sr option set to 32 Hz, resulting in an f0 rise-fall gesture of 
32 Hz over a span of about 150 ms, which is located on the first and
succeeding stressed syllables. However, DECtalk Software rules reduce 
the actual height of successive stress rises and falls in each clause 
and cause the last stress pulse to occur early so that there is time 
for the hat fall during the vowel.

If the sr option is set too low, the speech sounds monotone within 
long phrases. Great changes to the hr and sr options from their default 
values for each speaker are not necessary or desirable, except in 
unusual circumstances.


5.18

Assertiveness, as -

The Assertiveness (as) option, in %, indicates the degree to which the 
voice tends to end statements with a conclusive final fall. Assertive 
voices have a dramatic fall in pitch at the end of utterances. Neutral 
or meek speakers often end a sentence with a slight questioning rise 
in pitch to deflect any challenges to their assertions. A value of 100
is very assertive, while a value of 0 is extremely meek.


5.19

Quickness, qu -

The Quickness (qu) option, in percentage, controls the speed of 
response to a request to change the pitch. All hat rises, hat falls, 
and stress rises can be thought of as suddenly applied commands to 
change the pitch, but the larynx is sluggish and responds only 
gradually to each command. A smaller larynx typically responds more
quickly, so while Harry has a quickness value of 10, Kit has a value 
of 50.

In engineering terms, a value of 10 implies a time constant (time to 
get to 70% of a suddenly applied step target) of about 100 ms. A value 
of 90% corresponds to a time constant of about 50 ms. Lower quickness 
values may mean that the f0 never reaches the target value before a 
new command comes along to change the target.


5.20

Average Pitch, ap, and Pitch Range, pr -

The Average pitch (ap) option (average pitch, in Hz) and the pitch 
range (pr) option (pitch ranges in % of normal range) modify the 
computed values of fundamental frequency, f0, according to the 
formula:

    f0 = ap + (((f0 - 120) * pr) / 100)

If the ap option is set to 120 Hz and the pr option to 100%, there is 
no change to the normal f0 contour that is computed for a typical male 
voice. The effect of a change in the ap option is simply to raise or 
lower the entire pitch contour independently by a constant number of 
Hz, whereas the effect of the pr option is to expand or contract the 
swings in pitch about 120 Hz.

Normally, a smaller larynx simultaneously produces f0 values that are 
higher in average pitch and higher in pitch range by about the same 
factor (the whole f0 contour is multiplied by a constant factor). 
Observing the values assigned to the ap and pr options for each of the 
voices, you can see that the voices rank in average pitch from low 
(Harry) to high (Kit).

Rankings for the pr option are similar, except that Frank has a flat, 
nonexpressive pitch range as compared with his average pitch.

The best way to determine a good pitch range for a new voice is by 
trial and error. You can create a monotone or robot-like voice by 
setting the pitch range to 0. For example, to make Harry speak in a 
monotone at exactly 90 Hz, type the following command.

    [:nh][:dv ap 90 pr 0] I am a robot.

Reducing the pitch range reduces the dynamics of the voice, producing 
emotions such as sadness in the speaker. Increasing the pitch range 
while leaving the average pitch the same or setting it slightly higher 
suggests excitement.

Due to constraints involved in pitch-synchronous updating of other 
dynamically changing options, the fundamental frequency contour that 
is computed by the preceding formula is then checked for values that 
are outside the following limits.

    f0 maximum = 500 Hz
    f0 minimum = 50 Hz

Any value outside this range is limited to fall within the range.
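
The formula and the limiting step can be combined in a few lines. This 
Python sketch is an illustration only, not the engine's code. The 
function name is hypothetical.

```python
def adjust_f0(f0_hz, ap, pr):
    """Apply the ap/pr formula, then limit the result to 50-500 Hz."""
    adjusted = ap + ((f0_hz - 120) * pr) / 100
    return min(500.0, max(50.0, adjusted))


if __name__ == "__main__":
    print(adjust_f0(130, ap=120, pr=100))  # unchanged: 130.0
    print(adjust_f0(150, ap=90, pr=0))     # monotone at ap: 90.0
    print(adjust_f0(200, ap=300, pr=400))  # limited to 500.0
```

With pr set to 0 the result is always ap, which is why the robot 
example above speaks at exactly 90 Hz.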

To keep you from exceeding reasonable limits on the options that 
control pitch, certain constraints apply to the values selected. If 
the Design Voice command specifies values outside these limits, the 
value is limited to the nearest listed value before execution.


5.21

Changing Relative Gains and Avoiding Overloads -

Nine speaker-definition options control the output levels of various 
internal resonators. These options are listed in Table 5-6.

Table 5-6 Internal Resonator Options
------------------------------------------------
    gv    Gain of voicing source, in dB
    gh    Gain of aspiration source, in dB
    gf    Gain of frication source, in dB
    gn    Gain of nasalization, in dB
    g1    Gain of cascade formant resonator 1, in dB
    g2    Gain of cascade formant resonator 2, in dB
    g3    Gain of cascade formant resonator 3, in dB
    g4    Gain of cascade formant resonator 4, in dB
    g5    Loudness of the voice, in dB


5.22

Loudness, g5 -

The Loudness of the voice (g5) option is set to about the same 
perceived loudness for each of the predefined voices. The values 
chosen are optimal for telephone conversation and are near the 
maximum value beyond which some phonemes would probably cause an 
overload squawk. A near-maximum value was selected for each predefined 
voice to maximize the signal-to-noise level of the DECtalk software engine.

If you want to decrease the loudness of a voice or temporarily 
increase a phrase that is known not to overload, determine the g5 
option value in dB for the voice in question. Then adjust the voice 
by using the following command:

    [:np][:dv g5 76] I am speaking at about half my normal level.

Because the g5 option value for Paul is 86, this command reduces 
loudness by 10 dB. Perceived loudness approximately doubles (or halves) 
for each 10 dB increment (or decrement) in the g5 option.
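
The rule of thumb above (a doubling or halving per 10 dB) can be 
written as a small calculation. This Python sketch is an illustration 
only. The function name is hypothetical, and the result is an 
approximation of perceived loudness, not a measured level.

```python
def relative_loudness(g5_new, g5_default):
    """Approximate perceived loudness relative to the default g5."""
    return 2 ** ((g5_new - g5_default) / 10)


if __name__ == "__main__":
    print(relative_loudness(76, 86))  # Paul at g5 76: 0.5
    print(relative_loudness(96, 86))  # 10 dB louder: 2.0
```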

Software control over loudness is useful in a loudspeaker application 
where the background noise level in the room might change. For 
example, a vocally handicapped, wheelchair-bound person does not want 
to appear to be shouting in a quiet interpersonal conversation, but he 
or she may want to be able to converse in a noisy room as well.

Note:
     The DECtalk USB comes with volume control so that modification of the 
     g5 option should not be necessary. Using the Volume command 
     is recommended.


5.23

Sound Source Gains, gv, gh, gf, and gn -

Several types of sound sources are activated during speech production: 
voicing, aspiration, frication, and nasalization. The relative output 
levels of these sounds, in dB, are determined by the Gain of voicing 
source (gv) option, the Gain of aspiration source (gh) option, the Gain 
of frication source (gf) option, and the Gain of nasalization (gn) 
option, respectively. The default settings for these options are 
factory preset to maximize the intelligibility of each voice. However, 
changing the settings can be useful in debugging the system or in 
demonstrating aspects of the acoustic theory of speech production. You 
can change the level of one sound source globally. For example, turn 
off frication to hear just the output of the larynx. You might need to 
reduce these options to overcome certain kinds of overloads, but try 
the procedure described in the next section first.


5.24

Cascade Vocal Tract Gains, g1, g2, g3, and g4 -

Changes in head size or other options can sometimes produce overloads 
in the synthesizer circuits. If this occurs, make sure that the f4 and 
f5 options are set to reasonable values. If the squawk remains, you can 
adjust several gain controls in the cascade of formant resonators of 
the synthesizer to attenuate the signal at critical points. These gain 
controls are the Gain of cascade formant resonator (g1 through g4) 
options. These gains can then be amplified back to desired output 
levels later in the synthesis.

Use the following procedure to correct an overload (typically 
indicated by a squawk during part of a word):

1. Synthesize the word or phrase several times to make sure the squawk 
   occurs consistently. Use the same test word each time a change to a 
   gain is made.

2. Determine the default values for the g1 through g4 options for the 
   speaker that overloads.

3. Reduce the g1 option in steps of 3 dB until the squawk goes 
   away. When the squawk goes away, note the reduction that was needed. 
   If more than a 10 dB decrement is required, some other option has 
   probably been changed too much. If the squawk does not go away at 
   all, then you may need to reduce the gv option instead of the g1 
   option.

4. Increase the g5 option to return the output to its original level. 
   For example, if the g1 option was reduced by 6 dB, add 6 dB to the 
   g5 option (or to the g4 option if the g5 option is already at a 
   maximum). If incrementing the g5 option causes the squawk to return, 
   then decrease the g5 option slowly until the squawk goes away.
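
Steps 3 and 4 can be prepared in advance as a list of commands to try 
by ear. This Python sketch is an illustration only. The function names 
are hypothetical, and the g1 default of 68 used in the example is a 
placeholder, not a documented value. Look up the real default for the 
voice as described in step 2.

```python
def g1_trial_commands(voice, g1_default, step_db=3, max_cut_db=10):
    """List (reduction, command) pairs for step 3, smallest cut first."""
    trials = []
    cut = step_db
    while cut <= max_cut_db:  # more than 10 dB suggests another cause
        command = "[%s][:dv g1 %d]" % (voice, g1_default - cut)
        trials.append((cut, command))
        cut += step_db
    return trials


def g5_compensation_command(g5_default, cut_db):
    """Build the step 4 command that restores the output level."""
    return "[:dv g5 %d]" % (g5_default + cut_db)


if __name__ == "__main__":
    # 68 is a placeholder g1 default; 86 is Paul's g5 value.
    for cut, command in g1_trial_commands(":np", 68):
        print(cut, command, g5_compensation_command(86, cut))
```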

This procedure works in most cases, but using the g2 option rather than 
the g1 option can work better. If you can return the g1 option to its 
factory-preset value and reduce the g2 option instead to make the 
squawk go away, then the signal-to-quantization-noise level in the g1 
option remains maximized. If you can eliminate the squawk by using the 
g3 or g4 option rather than the g2 option, more of the cascaded 
resonator system can be made immune to quantization noise accumulation.


5.25

Saving Changes as Val's Voice -

A user can change any of the voice characteristics of the current 
speaker by using the options available in the Design Voice command. 
These changes are active only while the current speaker remains 
current. You can save a modified speaker definition in a buffer while 
synthesizing speech with other voices. To save voice changes for use 
after the current speaker has changed, use the save option of the
Design Voice command. These voice changes are saved as the voice of 
Val. The Val voice [:nv] is either male or female, depending on what 
values are stored in the buffer. If you call Val before storing any 
values in the buffer, the DECtalk software speech engine initializes Val's voice 
to be the same as that of Paul.


5.26

Save, save -

The Save (save) option of the Design Voice command lets you save 
speaker-definition options as Val's voice. You can modify any of the 
predefined voices, but you can save the modifications only as Val's 
voice. The following commands store a modified Betty voice in Val and 
then recall the modified voice:

    [:nb][:dv sx m save ] Betty now sounds like a man. Val now has 
    this voice.

    [:nb] Betty's voice is back to normal.

    [:nv] Val's voice sounds like Betty as a man.

Val's voice characteristics are retained until the 
DECtalk USB is reset or a new save is done. You 
must reenter new voice characteristics for Val after performing
a system reset.

Note:
     If you want to use the save option, leave a space between the command 
     option and the trailing bracket; for example, [:dv save ].


5.27

Summary of Design Voice Options -

Of the 28 options, only a few cause dramatic changes in the voice. The 
greatest effects are obtained with changes to the hs, ap, pr, and sx 
options, while moderate changes occur when modifying the la and br 
options. To some extent, the nine predefined speakers of the DECtalk 
software speech engine cover most of the possible voices. However, you 
might easily find ways to slightly improve one of the standard voices.

 

		 Chapter 6  -  Preprocessor Rules for Parsing


6.1

The preprocessor parses text to ensure that the DECtalk TTS engine 
pronounces it correctly and efficiently with respect to its context. Users can 
modify DECtalk preprocessing with the Punctuation inline command. Two sets 
of rules apply to the parsing process:

* Punctuation parsing rules
* General parsing rules


6.2 

Punctuation Parsing Rules -

When the preprocessor encounters punctuation, it interprets each punctuation 
mark (by default) as a guide to speaking the text normally, 
unless you specify otherwise with the Punctuation command, [:punct].
 
Interpreting Punctuation Marks as Words -

For the [:punct all] command, the preprocessor interprets each 
punctuation mark as a word to be pronounced. For example, the symbol "~" 
is interpreted as the word "tilde," and the symbol "," is interpreted as 
the word "comma."

For the [:punct none] and [:punct pass] commands, the preprocessor 
interprets the following symbols normally to modify text:

* . 
* , 
* ; 
* : 
* ? 
* ! 

All other punctuation marks are ignored.
 

6.3

Interpreting Punctuation Marks as Punctuation -

For the [:punct some] command, the preprocessor applies the following 
rules:

     * Multiple instances of identical punctuation marks are reduced to 
       a single symbol. For example, --------------- becomes -, and 
       *************** becomes *.
     
     * Doubly encapsulated items become singly encapsulated. For 
       example, "(intelligent)" and ((intelligent)) become (intelligent).
     
     * Hours and minutes are not altered. For example, 2:43pm 
       becomes two forty-three P M.
     
     * Numerals and decimal numbers are not altered. For example, 
       -3.52 becomes minus three point five two.
     
     * Currency values are interpreted appropriately. For example, 
       -$43.65 becomes minus forty-three dollars and 
       sixty-five cents, and +$123.21 becomes plus one hundred and 
       twenty-three dollars and twenty-one cents.

     * Uppercase single letters followed by periods are interpreted 
       as single letters. For example, U.S.A. becomes U S A.

     * P.M. and p.m. become P M.
     
     * Doubled clause boundary symbols are reduced 
       to the first clause boundary. For example, boom!, becomes boom!
     
     * Commas and hyphens not followed by spaces are changed to be 
       followed by spaces. For example, look,look becomes look, look.
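
Two of the rules above are simple enough to imitate with regular 
expressions. This Python sketch is an illustration only and is not how 
the DECtalk preprocessor is implemented. The function names are 
hypothetical, and only the comma half of the last rule is shown, since 
a naive hyphen rule would also alter numbers such as -3.52.

```python
import re


def collapse_repeated_punctuation(text):
    """Reduce a run of one repeated punctuation mark to a single mark."""
    return re.sub(r"([^\w\s])\1+", r"\1", text)


def space_after_commas(text):
    """Insert a space after any comma not already followed by one.

    Note: this naive version would also split numbers such as 1,000.
    """
    return re.sub(r",(?=\S)", ", ", text)


if __name__ == "__main__":
    print(collapse_repeated_punctuation("---------------"))
    print(collapse_repeated_punctuation("((intelligent))"))
    print(space_after_commas("look,look"))
```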
 

6.4 

General Parsing Rules -

Rules for parsing numbers and some other items vary according to the 
language being spoken.
 
     Language: German
     Language-specific rules apply to:
	  * Hours and minutes
	  * Dates
	  * Currency
	  * Phone numbers
	  * Compound nouns

     Language: Spanish (Castilian and Latin American)
     Language-specific rules apply to:
	  * Dates
	  * Currency
	  * Phone numbers
	  * Credit cards

     Language: English (UK)
     Language-specific rules apply to:
	  * Dates
	  * Addresses

     Language: English (US, UK)
     Language-specific rules apply to:
	  * Dates
	  * Hours and minutes
	  * Street, avenue, and drive
	  * Numbered street names; for example, 29 42 Street becomes twenty-nine forty-second street
	  * Phone numbers are spoken as digits, with appropriate pauses 
	  * Dr. becomes doctor
	  * St. becomes saint
	  * Two-letter state names are pronounced in full; for example MA 01749 becomes Massachusetts zero one seven four nine
	  * Postal zip codes within a mail address are spoken one digit at a time
	  * URL addresses are spoken one character at a time
	  * File names are spoken one character at a time
	  * In compound words, prefixes may be broken apart from the 
	    second word
	  * Days of the week
	  * Directions on the compass are spoken in full; for 
	    example 30 W becomes thirty west
	  * Roman numerals following a name are spoken as ordinal numbers; for example John Doe III becomes John Doe the third
	  * Credit card numbers are spoken appropriately; for example, 
	    6011 4134 3621 4172 becomes 
	    six zero one one, four one three four, three six two one, four one seven two.
	  * In a word written with mixed uppercase and lowercase letters, 
	    each uppercase letter begins a new word; for example, 
	    TextToSpeech becomes text to speech
	  * Combinations of numbers and letters are broken into numbers 
	    and individual letters; for example two34five becomes 
	    T W O thirty-four F I V E; XF302QB becomes XF three hundred and two QB
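
The last two rules in the list describe how a token is split before it 
is spoken. This Python sketch imitates only the splitting step and is 
an illustration, not the DECtalk implementation. The function names are 
hypothetical, and converting the pieces to spoken words (for example, 
34 to thirty-four) is left out.

```python
import re


def split_mixed_case(word):
    """Start a new word at each uppercase letter, then lowercase all."""
    parts = re.findall(r"[A-Z][a-z]*|[a-z]+", word)
    return " ".join(part.lower() for part in parts)


def split_letters_and_digits(token):
    """Break a token into alternating runs of letters and digits."""
    return re.findall(r"[A-Za-z]+|\d+", token)


if __name__ == "__main__":
    print(split_mixed_case("TextToSpeech"))
    print(split_letters_and_digits("two34five"))
    print(split_letters_and_digits("XF302QB"))
```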



				  GLOSSARY

allophone
     A positional or free variant of a phoneme.

arpabet
     A special phonetic alphabet used to write phonemes and syllables.  

clause boundary
     The natural boundary between two or more clauses in a sentence that helps 
     the listener easily separate the sentence into its component 
     parts. Commas, periods, exclamation points, question marks, 
     semi-colons, and colons are symbols used to indicate clause boundaries.

clause mode
     The normal mode in which the DECtalk TTS engine speaks text 
     a phrase, clause, or sentence at a time. In clause mode, speaking 
     starts when the DECtalk TTS engine is sent a clause terminator 
     (period, comma, exclamation point, question mark, semi-colon, or colon) 
     followed by a space. 

clause terminator
     A symbol used to begin and terminate a clause boundary. Symbols 
     can be periods, commas, exclamation points, question marks, semi-colons, 
     or colons. Each of these symbols must be followed by a space.

comma pause
     The pause DECtalk Software takes in speaking that is equivalent to 
     inserting a comma in a sentence. Comma pause can be increased and 
     decreased with the Comma Pause command.

.dic file
     The loadable dictionary file created by the User Dictionary Build Tool from a 
     .tab source file.

emphatic stress
     The emphasis placed on a syllable of a word to give it more meaning.

falling intonation 
     A decrease in voice pitch.

flush
     Process by which the Text-To-Speech system discards data in the system.

heuristic
     A method or rule used to decide among several courses of action. Often called 
     a "rule of thumb." In the case of DECtalk, pronunciation 
     heuristics govern the manner in which the DECtalk TTS 
     engine pronounces words.

homograph
     A pair of words that have the same spelling but which are 
     pronounced differently, depending on which syllable is accented.  
     For example, the pronunciation of permit as a noun and the pronunciation 
     of permit as a verb.

index marker 
     A marker placed in the text stream to indicate how much text
     has been spoken.

intonation
     The manner in which a voice imparts extra meaning to speech by adjusting 
     sound duration and voice pitch. For example, the emphasis and 
     meaning of the sentence "Bill, put in the edits." can be changed 
     by putting stronger emphasis on the name Bill: "Bill! Put in the edits!"

letter mode
     The state in which the DECtalk TTS engine speaks each letter 
     as it is queued. In word and letter mode, the DECtalk TTS engine 
     does not need to wait for a clause terminator to begin speaking. 
     This command interacts with the rate selection command so that you 
     can set both rate selection and letter mode for optimal output.

morpheme
     The minimum syntactic unit of a language that has an important role 
     in determining pronunciations. For example, spell has only one morpheme, 
     while misspelling is made up of three: mis, spell, and ing. 

period pause
     The pause the DECtalk TTS engine inserts when it finds a period that 
     marks the end of the sentence. This pause imitates 
     humans taking a breath.  This pause is approximately half a second.

phoneme
     The smallest unit of speech that distinguishes one word from another. Phonemes 
     are divided into vowel and consonant phonemes. The DECtalk TTS 
     engine interprets text within brackets as phonemes only after the phoneme 
     arpabet command is used.

phoneme arpabet command
     A command that causes all text within brackets to be treated as phonemic text.

phoneme string
     Two or more phonemes together used to pronounce a special word or 
     group of words.

phonemicize 
     To encode words as strings of phonemes.

phonemic mode   
     A mode that the DECtalk TTS engine uses for speaking phoneme strings.

phonemic transcription
     A word written the way it is pronounced is said to be in phonemic 
     transcription or simply in phonemics. When DECtalk says a word or 
     phrase not as you intended, you might need to use phonemic transcription 
     to get the desired pronunciation.  For example,  [r ' ehd ] is the 
     phonemic transcription of the word read.

phrase boundary 
     A clause boundary formed by terminating punctuation (comma, period, 
     exclamation point, question mark, semi-colon, colon) followed by a space.

pitch control symbols   
     Symbols used to override built-in DECtalk pitch control. Symbols include 
     pitch rise [/], pitch fall [\], and pitch rise and fall [/\].

primary stress
     Most content words of English (nouns, verbs, adjectives, and adverbs) 
     contain one primary stressed syllable. The primary stress symbol in DECtalk 
     is the apostrophe [ ' ].

proper name
     First names, last names, street names, company names, and place names are 
     all examples of proper names.

secondary stress
     A symbol used to indicate a degree of stress that is between primary 
     and unstressed (no stress). The secondary stress symbol is the 
     grave accent [`].

silence phonemes
     Silences of specified durations inserted into text strings in the same 
     manner as you would insert a phoneme.

syntactic function words
     A set of words that are either unstressed or have secondary stress. 
     They include prepositions, conjunctions, determiners, auxiliary 
     verbs, pronouns, the question mark, and clause introducers. 
     The DECtalk TTS engine uses stress and syntactic symbols to control 
     aspects of rhythm, stress, and intonation patterns. These symbols 
     include punctuation marks such as commas, periods, question marks, 
     and exclamation points.

.tab file
     The source file used to build a user dictionary.        

user dictionary 
     The dictionary that you create for the DECtalk TTS engine to load and use 
     with an application to control the pronunciation of specific words 
     processed by the application.

user dictionary builder 
     A program to build and compile user dictionaries.

voice-control command
     A DECtalk in-line command inserted into text strings and used to control 
     basic and special Text-To-Speech attributes, such as speaking voice 
     and speaking rate.

word boundary 
     A white space character (space, tab, or carriage return) in the 
     text that indicates a boundary between words. The DECtalk TTS engine 
     uses word boundary symbols to select the word-beginning or 
     word-ending allophone of a phoneme.

word mode
     A text-processing mode in which DECtalk speaks one word at a time. A 
     blank space or equivalent after a character or string of characters 
     causes that string to be spoken in word mode.


Copyright (c) 2000-2003 by Axsol Inc.
Copyright (c) 2001-2003 by Fonix  Corporation.  
All rights reserved.

The Axsol logo and Access Solutions are trademarks of Axsol Inc.
The Fonix logo and DECtalk are trademarks of Fonix Corporation Inc.

