5. Words, words, words 03/21/2012

Perhaps the greatest difference among languages, and even among speakers of the same language, lies in the lexicon. Indeed, one definition of the lexicon is the location of all of the idiosyncratic aspects of language.


"The creative powers of English morphology are pathetic compared to what we find in other languages." (p. 127) ... "In effect, Kivunjo (Bantu) and languages like it are building an entire sentence inside a single complex word, the verb." (p. 128)


BUT recall our discussion of "STOP!"  It is both word and sentence.


It is just a lot simpler than "Naikimlyiia," which in the Bantu language Kivunjo means "he is eating it for her" (p. 127).


Let's try to unpack this Kivunjo "word".


N- indicates the word is the focus of that point in the conversation


-a-:subject agreement marker (class one of 16 genders, human singular. Other genders include nouns of several humans, thin objects, clusters of objects, instruments, animals, body parts(7), small cute things, abstract qualities, locations,...)


-i-: present tense (vs earlier today, yesterday, remote past, habitually, hypothetically...)


-ki-:object agreement marker, indicating thing eaten is class 7.


-m-:benefit marker, act is for the benefit of class 1.


-lyi-:the verb "to eat"


-i-: the verb's role players are increased by one, the benefactive (compare the indirect object in English)


-a-:indicative vs subjunctive mood.


These languages build the entire sentence into a "word."
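
The unpacking above can be sketched in a few lines of Python. This is only an illustration: the list `MORPHEMES` and the helper names are made up for this sketch, and the glosses are shortened from the notes above.

```python
# Morphemes of "Naikimlyiia" in order, with glosses shortened from the list above.
# The names MORPHEMES, assemble, and gloss_lines are hypothetical helpers.
MORPHEMES = [
    ("n", "focus marker"),
    ("a", "subject agreement (class 1, human singular)"),
    ("i", "present tense"),
    ("ki", "object agreement (class 7)"),
    ("m", "benefit marker (class 1)"),
    ("lyi", "verb 'to eat'"),
    ("i", "adds one role player (the benefactive)"),
    ("a", "indicative mood"),
]


def assemble(morphemes):
    """Concatenate the morpheme forms into a single word."""
    return "".join(form for form, _ in morphemes)


def gloss_lines(morphemes):
    """Return one 'form: gloss' line per morpheme."""
    return [f"{form}: {meaning}" for form, meaning in morphemes]


word = assemble(MORPHEMES).capitalize()
print(word)
for line in gloss_lines(MORPHEMES):
    print("  " + line)
```

Eight separate pieces, each with its own job, end up inside one "word."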

Other languages, such as Turkish and Eskimo (seen in the video Human Language 1), also pack a lot into a word.



morphological creativity

"The engineering trick behind human language--its being a discrete combinatorial system--is used in at least two different places: sentences and phrases are built out of words by the rules of syntax, and the words themselves are built out of smaller bits by another set of rules, the rules of morphology." (p. 127)


Both words and phrases can have hierarchical structure.


Adj->"un"+Adj  (prefix+Adj)

Adj->verb stem+"able"  (stem+suffix)  for example  the complex word "unfixable" where the stem is "fix."
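
A minimal sketch of how these two rules assign hierarchical structure to "unfixable." The mini-lexicon `VERB_STEMS` and the function names are assumptions made for illustration, and spelling changes at morpheme boundaries are ignored.

```python
# Hypothetical mini-lexicon of verb stems; not a real word list.
VERB_STEMS = {"fix", "read", "do"}


def parse_adj(word):
    """Return a nested tuple for `word` under the two rules, or None."""
    # Rule 1: Adj -> "un" + Adj
    if word.startswith("un"):
        inner = parse_adj(word[2:])
        if inner is not None:
            return ("Adj", "un-", inner)
    # Rule 2: Adj -> verb stem + "able"
    if word.endswith("able") and word[:-4] in VERB_STEMS:
        return ("Adj", ("Vstem", word[:-4]), "-able")
    return None


def bracket(tree):
    """Render a nested tuple as a labeled bracketing."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return "[" + label + " " + " ".join(bracket(c) for c in children) + "]"


print(bracket(parse_adj("unfixable")))
# prints: [Adj un- [Adj [Vstem fix] -able]]
```

The bracketing shows "un-" attaching to the adjective "fixable," not to the verb "fix."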

types of morphemes

While all words in a language share much in common, almost by definition we can assume they are different in form and meaning and that classes of words are distinctive in one or more ways from other classes.

bound and free morphemes

Morphemes like -ed and -s that do not occur on their own are known as bound morphemes. In contrast, nouns, verbs, adjectives, etc. are known as free morphemes, since they can appear on their own without modification.

function (closed) and content (open) morpheme classes

“Function words” are known by all speakers of a particular language. English has maybe a hundred or so, and they form the “closed” class, since the set rarely changes. While the odd person may not know even a common “content word” like "dog," everyone knows "the." Content words are also known as the “open” class, since new words enter every year and old ones drop out of use.


Inflections are bound morphemes that modify the form of a word. Some languages, for example those with extensive case systems like Russian or Latin, rely more on inflections than does English, which has lost most of its Indo-European inflectional heritage.


The last English cases survive in pronouns: who, which, (to, for) whom; she/he, hers/his, her/him.

The IE inflections were lost as Germanic became English, beginning about 800 AD. This loss gives us ambiguities such as "The cooking of the missionaries was terrible."


(Note that if "missionaries" were in an object case, it could not be the subject NP of "cook.")

derivational rules (English examples pp. 128-9)

Derivational morphemes             Inflectional morphemes
adjective and verb prefix un       adjective suffix er, r
adverbializer ly                   adjective suffix est, st
nominalizer er                     noun suffix ie
noun prefix ex                     noun suffix s, es
verb prefix dis                    noun suffix 's, '
verb prefix mis                    verb suffix s, es
verb prefix out                    verb suffix ed, d
verb prefix over                   verb suffix ing
verb prefix pre                    verb suffix en
verb prefix pro
verb prefix re

Clitics (fragments attached to stems)

noun phrase post-clitic 'd         v:aux|would, v|have&PAST
noun phrase post-clitic 'll
noun phrase post-clitic 'm         v|be&1S, v:aux|be&1S
noun phrase post-clitic 're        v|be&PRES, v:aux|be&PRES
noun phrase post-clitic 's         v|be&3S, v:aux|be&3S
verbal post-clitic n't



morphological structure "rules"

N -> Nstem + Ninflection

"A noun can consist of a noun stem followed by a noun inflection." (p. 131) The root of a word is the smallest bit that cannot be cut into smaller parts. Roots combine with root affixes to form stems, stems can recursively form other stems (to an extent), and stems combine with inflections to form words.


(Kiparsky noted that cases like "mice-infested" but not "rats-infested" are probably due to the fact that "mice" has its own lexical entry, whereas "rats" is derived from "rat" + /pl/.)


Other apparent m-rules exist, but the meaning is not generally predictable from the structured parts of N roots and their inflections (pp. 136-7).

application of phonological rules

Note the common pattern in the formation of plurals, third-person singular verbs, and possessives.

There are several ways to state this rule. Clearly speakers have not simply memorized plurals, since the same phonological processes occur in several inflectional situations, including possessives and verb inflections. As Morris Halle points out, how else can we explain how English speakers correctly say "I like Johann Sebastian Bach's music," given that English has no final [-ch]? Speakers do recognize it as unvoiced, however.


Look at the final phonological segment (phoneme) of the morpheme to be inflected. If that morpheme ends in a sibilant (e.g. fish, church, gauge), add [ez], i.e. create a new syllable. Otherwise add a short fricative, [s] or [z], whose voicing is the same as that of the final phoneme, onto the final syllable of the inflected morpheme.

(cat, dog, fish --> cats, dogz, fishez)
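
The rule can be sketched as a small function. Everything here is an assumption made for illustration: the ASCII phoneme labels, the two sets, and the function names. The class that triggers [ez] is taken to be the sibilants (s, z, sh, zh, ch, j), which covers fish, church, and gauge, and "x" stands for the final sound of "Bach."

```python
# Hypothetical ASCII labels for final phonemes; not a standard transcription.
SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}
VOICELESS = {"p", "t", "k", "f", "th", "x"}  # "x" = final sound of "Bach"


def s_suffix(final_phoneme):
    """Pick the form of the -s inflection from the stem's final phoneme."""
    if final_phoneme in SIBILANTS:
        return "ez"  # new syllable after a sibilant
    if final_phoneme in VOICELESS:
        return "s"   # voiceless suffix after a voiceless segment
    return "z"       # voiced suffix after any other (voiced) segment


def inflect(stem, final_phoneme):
    """Attach the suffix to a stem written in ordinary spelling."""
    return stem + s_suffix(final_phoneme)


for stem, final in [("cat", "t"), ("dog", "g"), ("fish", "sh"), ("Bach", "x")]:
    print(stem, "->", inflect(stem, final))
```

The same function serves for plurals, possessives, and third-person singular verbs, and it handles "Bach" without any stored entry because only the voicing of the final segment matters.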

mental dictionary

words as syntactic atoms

words as string of linguistic stuff: "listeme"

This coined word is used to indicate that lots of information about words and some phrases just must be memorized in the lexicon (p. 148).

level information: pronunciation, syntax, semantics

morpheme meaning as contribution ...

Each morpheme contributes something, often unique, to the meaning of its phrases and consequently its sentences. 

be careful about sense/reference of expressions

Morphemes themselves do not refer, and their only sense lies in their contribution to their phrases (expressions).


possible words: phonology and the lexicon

The lexicon can be thought of as a repository for the possible words of a language as defined by the phonology. Unlike combinations of numbers, not all combinations of phonological segments can be meaningful (Miller, 1990). The actual number of words is much, much smaller than the possible number of combinations because certain combinations are ruled out by that phonology: the so-called morpheme structure rules or phonotactic rules of a language. In English, for example, "ftik" or "tsar" are not possible words despite our ability to articulate them; other languages may find them perfectly good forms.
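
A toy sketch of how phonotactic rules shrink the space of possible words. It uses spelling as a stand-in for phonological segments, and the set `ALLOWED_ONSETS` is an illustrative subset invented for this example, not a full statement of English phonotactics.

```python
import itertools

VOWELS = set("aeiou")
# Illustrative subset of word-initial consonant clusters (an assumption).
ALLOWED_ONSETS = {
    "", "b", "d", "f", "k", "l", "p", "r", "s", "t",
    "bl", "br", "dr", "fl", "fr", "kl", "kr", "pl", "pr",
    "sk", "sl", "sp", "st", "tr",
}


def onset(word):
    """Return the consonants that precede the first vowel."""
    i = 0
    while i < len(word) and word[i] not in VOWELS:
        i += 1
    return word[:i]


def is_possible(word):
    """A form passes if it has a vowel and its onset is on the allowed list."""
    return any(ch in VOWELS for ch in word) and onset(word) in ALLOWED_ONSETS


segments = "ftiks"
all_forms = ["".join(p) for p in itertools.product(segments, repeat=4)]
possible = [w for w in all_forms if is_possible(w)]
print(len(all_forms), "combinations,", len(possible), "pass the onset check")
print(is_possible("stik"), is_possible("ftik"), is_possible("tsar"))
```

Even this single constraint on word beginnings removes many of the combinations; a fuller rule set would remove far more.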

(note that similarities in phonotactic rules may be a major factor in APPARENT similarities across languages since phonetic values "mutate" within those structures.)

linguistic levels in the lexicon

Pinker (p. 130) points out that words are structured like sentences and cannot be generated by a chaining (Markov) device.

N->Nstem + Ninflection  (N-> dog + -s)

Nstem->Nstem +Nstem (N->Yugoslavia report)

Nstem->Nroot +Nrootaffix (N->Darwin + -ian)
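
The three rules can be sketched as tree-building functions. The function names are hypothetical, and the bare case in which a stem is just a root (no affix) is an added assumption needed to get the examples started.

```python
def nstem(root, affix=None):
    """Nstem -> Nroot + Nrootaffix (a bare root is an added assumption)."""
    if affix is None:
        return ("Nstem", ("Nroot", root))
    return ("Nstem", ("Nroot", root), ("Nrootaffix", affix))


def compound(left, right):
    """Nstem -> Nstem + Nstem"""
    return ("Nstem", left, right)


def noun(stem, inflection):
    """N -> Nstem + Ninflection"""
    return ("N", stem, ("Ninflection", inflection))


def spell(tree):
    """Flatten a tree to a string; compound members are joined by a space."""
    label, *children = tree
    if label == "Nroot":
        return children[0]
    if label in ("Nrootaffix", "Ninflection"):
        return children[0].lstrip("-")
    parts = [spell(c) for c in children]
    is_compound = label == "Nstem" and all(c[0] == "Nstem" for c in children)
    return (" " if is_compound else "").join(parts)


print(spell(noun(nstem("dog"), "-s")))
print(spell(noun(nstem("Darwin", "-ian"), "-s")))
print(spell(noun(compound(nstem("Yugoslavia"), nstem("report")), "-s")))
```

Because `compound` takes stems and returns a stem, it can apply to its own output, which is the recursion a simple chaining device lacks.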

Pinker (p. 133) notes how this structure can lead to ambiguity, e.g. "blackboard" vs "black board," and how one can "test" whether something like "Yugoslavia report" is a compound word or a phrase.

See Burke et al (1990) for an example of "frisbee"

referent of the phrase (pragmatics)

sense or meaning contribution of words to their phrase

this is the contribution the entire word makes; it is determined combinatorially by its internal structure; a "sense" derived from the lexical entries for its "head" and other component morphemes.

in some cases root-affix combinations have unpredictable meanings, and the meaning of the resulting stem must be entered in the lexicon.


morphological "Syntactic" structure

syllable structure

phonological segment structure (phonemic structure)

phonetic structure

distinctive feature structure


acoustic structure



Put another way, the possible words in every language are redundant.  Choices in sequences of elements are not completely independent of each other as experienced Scrabble players know.  The redundancy of language manifests itself in various places including the structure of words.

actual number of words-individuals and languages?

empirical estimates

One estimate (Nagy and Anderson in Miller, 1990, p.138) puts the vocabulary of high school students at between 45,000 and 60,000 depending on exactly what words are involved.  There is of course considerable variability.

A self-rating estimate based on a sample of words carried out in my 712 class in 1993 averaged 45,000 words "used" and 91,000 words "known."
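
The arithmetic behind a sample-based estimate can be sketched as follows. This shows the general idea only and is not necessarily the procedure used in the class; the function name and all of the numbers are hypothetical.

```python
def estimate_vocabulary(ratings, dictionary_size):
    """Scale the share of sampled words rated as known up to the whole dictionary.

    ratings: list of booleans, one per sampled word (True = rated as known).
    dictionary_size: number of entries in the dictionary the sample came from.
    """
    known = sum(ratings)
    return round(known * dictionary_size / len(ratings))


# Hypothetical example: 50 sampled words, 20 rated as known,
# drawn from a dictionary of 120,000 entries.
sample = [True] * 20 + [False] * 30
print(estimate_vocabulary(sample, 120_000))
# prints: 48000
```

The result depends heavily on which dictionary is sampled and on what counts as "knowing" a word, which is one reason estimates vary so much.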


Figure x  Exponential growth of English dictionaries[1]


types of morphemes

open class (content words)

closed class (function words)

proper names

See Semenza and Zettin (1989) reading.

acquisition of vocabulary

rate is accelerating

(Compare to training apes to use symbols or signs.)



the "gavagai" problem

See Pinker and 2nd Human language video--W. Quine's problem.

other issues

the lexicon: cognition & perception


Whorf stuff- does word availability really affect cognition? (priming, categorization, functional fixedness)

lexical fields -collections of morphemes covering a "field" of a broad topic.

e.g. color names, size or dimensional adjectives (tall, wide..), verbs of motion..etc.

frequency and familiarity

age of acquisition

priming effects



associations among words

Word associations were perhaps first used by Francis Galton (1822-1911) as a tap into the mind ("They lay bare the foundations of a man's thought..." in Miller, p.154). He measured the time it took for one word to give rise to another idea. Carl Jung (1875-1961) tried to apply the idea to diagnosing psychiatric problems in a famous paper, relating it to some of Freud's ideas. Much research has been done on the topic, from collecting norms to building mental models that generate these associations.

Jung, C. G. (1910). “The association method.” American Journal of Psychology 21: 219-269.

         Jung's 3 Clark lectures, Sept.,1909 translated from the German by Dr. A.A.Brill. CJ references German papers, 1906.


are derivations entered in our mental dictionary?

My guess is that the situation is comparable to multiplication – some of us know 12 x 13 = ?, others can figure it out using the standard multiplication rules.
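
One way to picture the analogy is a lookup with a rule as the fallback. The stored entries and the function name below are hypothetical, and the rule is deliberately crude (it just adds "s").

```python
# Hypothetical memorized entries; a real mental dictionary is unknown territory.
STORED_PLURALS = {"mouse": "mice", "child": "children"}


def plural(noun):
    """Return (form, route): retrieve a stored form, else derive one by rule."""
    if noun in STORED_PLURALS:
        return STORED_PLURALS[noun], "retrieved"
    return noun + "s", "computed"


print(plural("mouse"))
print(plural("rat"))
```

Two speakers could differ only in what sits in the stored table and still produce the same forms, just as with 12 x 13.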

words are special

word superiority effect

phoneme restoration effect

idioms (listemes)

Information must be in the lexicon that cannot be derived from the component elements. Since phonetic elements carry negligible "phonetic symbolism," this means the minimal meaningful elements are morphemes. Idioms, however, are word or phrase structures with their own unique meanings.


idiom translation-non lexical?

 ENG: That's nothing to write home about

 Swedish: "You wouldn't hang that on your Christmas tree"  (LINGUIST List: Vol-6-1006.)


[1]Data plotted from Miller, 1990, p.135.