From Sad to Glad: Emotional Computer Voices

April 1, 1988

Janet E. Cahn


Synthesized English speech is readily distinguished from human speech on the basis of inappropriate intonation and insufficient expressiveness. This is a drawback for conversational computer systems. Intonation is the carrier of emphasis or de-emphasis, serving to clarify meaning for the spoken word much as variations in typeface and punctuation do for the written word. Expressiveness is not tied to word or phrase meaning but is global in scope. It provides the context in which the intonation occurs, and reveals the speaker's intentions and general mental state. In synthesized speech, intonation makes the message easier to understand; enhanced expressiveness contributes to dramatic effect, making the message easier to listen to.
