to reduce the ambiguity in human language, a major barrier to full machine understanding of speech, and to sense emotions and physiological states expressed in the human voice. As in facial recognition, the aim of sensing emotions and physiological states is being driven by the war on terrorism because it is important to detect stress in intercepted voice communications or “chatter.” Corporations are interested as well; they want to know when customers on the telephone are angry so that they can be mollified by appropriate responses (which might be hopeless if a customer’s anger was elicited by the frustration of talking to a machine with poor verbal skills).
But it is not easy to quantify exactly what it is we sense in prosody, or to put that knowledge into artificial speech systems. Reviewing how humans recognize emotion, Ralph Adolphs, a neurologist at the University of Iowa, says,
In general, recognizing emotions from prosody alone is more difficult than recognizing emotions from facial expressions. Certain emotions, such as disgust, can be recognized only very poorly from prosody.
Even human evaluators might disagree about how to classify the emotions expressed in a voice, especially for short utterances. Without a reliable database of classifications, it is difficult to determine exactly what a machine system should listen for to determine a person’s state of mind. But progress is being made in this relatively new area, especially in its pragmatic aspects: for instance, it seems that when correcting an error made by an artificial speech system a human tends to hyperarticulate (that is, speak slower and louder, and at a higher pitch), a clue that is useful in helping the system to respond appropriately.
Today’s artificial speech systems show the level at which recognition, synthesis, and conversational ability come together. Speech Experts, a German firm, recently announced a washing machine that obeys voice commands. This might seem an odd choice for advanced speech capabilities, but a company spokesman claims that, “Electronic appliances have become so complicated . . . that consumers are put off by them. Speech recognition would help people.” The machine is said to be able to follow complex instructions, such as “Prewash, then hot