Speech
Speech is the continuous production of sound in a structured form to convey information. By continuous I mean that speech doesn't require a [see page 2, break] between utterances; a speech recording is a continuous signal.
Speech is the most sophisticated behaviour of the most complex organism in the known universe.
Richness
Speech is [see page 12, rich] in various kinds of information.
Information | Meaning | Example |
---|---|---|
Linguistic | What's being said (distinguishing meaning) | Ahead vs. A Head |
Paralinguistic | How it's being said | Emotion |
Extralinguistic | Other speaker-generated behaviour | Breathing |
Variability
Speech is also [see page 4, variable]: there are over 7,700,000,000 people in the world and they all speak differently.
Speech also varies based on dialects (different words to reference the same thing in the same language) and accents.
Variation | Description |
---|---|
Dialect | Different words reference the same thing in the same language (e.g. "beck" = "stream" in old Yorkshire). |
Accent | Different sounds are used. |
We [see page 6, divide] variations into:
Variation | Description |
---|---|
Inter-Speaker | Caused by age, gender, physical characteristics, etc. |
Intra-Speaker | Caused by physiological factors (e.g. bad health), psychological factors (e.g. mood) or external factors (e.g. environment). |
[see page 9, Factors] that can affect your speech output:
Factor | Effect |
---|---|
Noise (static) | Hyperarticulation (the Lombard effect). You speak louder and more clearly, putting in more effort. |
Vibration | Stresses to the chest, vocal tract and jaw greatly diminish your ability to speak. |
The Task | Casual conversation vs. Reading out loud vs. Lecturing all have a different style of speaking. |
The Listener | Bah Bah Goo Gah. We speak differently depending on who we're talking to (e.g. a baby). |
Cognitive Load | When overloaded, people can find it hard to speak or listen because their focus is entirely elsewhere (e.g. driving). |
Alcohol/Drugs | Impaired motor control can make it difficult to properly articulate. |
Consequences
Variability [see page 10, means] that signals we'd like to be the same are actually quite different and things which we'd like to be different can be very similar (ambiguity).
Disfluency
Normal spontaneous speech [see page 13, contains]:
- False starts
- Repeats
- Filled pauses (e.g. "uhmms")
- Non-linguistic utterances (NLUs)
- Overlaps
These so-called disfluencies can make speech easier for a human to produce and understand.
Multimodality
Speech is not only acoustic, it's also [see page 14, visual]. Seeing a person's lips while they speak helps us tell what they're saying. Visual information can even override acoustic information (the McGurk effect).
Contamination
Speech is [see page 15, contaminated] with countless sources of noise, making it harder to decipher acoustic signals.
Consonants are the part that makes it easier to decipher censored words, but they're often the quiet part of speech, meaning they're affected by noise the most. Vowels are the loud part of speech but are rarely as helpful as consonants.
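To make the point about quiet consonants concrete, here is a rough sketch of how the signal-to-noise ratio (SNR) drops for a quieter segment under the same noise. Everything in it is an assumption for illustration, not something from the slides: the `tone` helper, the amplitudes (1.0 for a "vowel", 0.1 for a "consonant", 0.2 for the noise) and the use of pure sine tones as stand-ins for real speech and real noise.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels: 20 * log10(rms(signal) / rms(noise))."""
    return 20 * math.log10(rms(signal) / rms(noise))

def tone(amplitude, n=800, freq=100, rate=8000):
    """A pure sine tone standing in for a speech segment (illustrative only)."""
    return [amplitude * math.sin(2 * math.pi * freq * t / rate) for t in range(n)]

# Hypothetical amplitudes: a loud "vowel", a quiet "consonant", and steady noise.
vowel = tone(1.0)
consonant = tone(0.1)
noise = tone(0.2, freq=250)  # a tone standing in for background noise

vowel_snr = snr_db(vowel, noise)
consonant_snr = snr_db(consonant, noise)

print(f"vowel SNR: {vowel_snr:.1f} dB")
print(f"consonant SNR: {consonant_snr:.1f} dB")
```

With these made-up amplitudes the vowel sits well above the noise while the consonant falls below it (a negative SNR), even though the noise itself hasn't changed. Real speech and real noise are far messier, but the same ratio explains why the quiet parts get buried first.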