4. How language works (03/03/11)

This is about human language structure--my "linguistic levels" of description. (See "Stop!" notes in Language video notes.)


"The way language works, then, is that each person’s brain contains a lexicon of words and the concepts they stand for (a mental dictionary) and a set of rules that combine the words to convey relationships among concepts (a mental grammar)."  Pinker, 85.

Two "tricks" of language (not to be confused with Chomsky's "two fundamental facts": creativity and rapid, uniform, untutored acquisition)

1. arbitrary signs

The Swiss linguist Ferdinand de Saussure (1857-1913), considered by some to be a founder of modern linguistics, pointed out the obvious: that the meanings of words are arbitrary, and that patterns of words create additional meanings.  This led to the notion of 'structuralism.'

2. grammar—'infinite use of finite means'

(Note the quote from W. von Humboldt!!  Recall discussion of Chomsky above.)

Wilhelm von Humboldt (1767-1835)

'Language, I am fully convinced, must be regarded as directly inherent in human beings.'

At the same time, von Humboldt was an advocate of a mild form of linguistic relativity, perhaps based on the language research of his brother, Alexander, in Central America.  (He might be right! JL)

Wilhelm von Humboldt, On Language: The Diversity of Human Language-Structure and its Influence on the Mental Development of Mankind. Translated by Peter Heath with an introduction by Hans Aarsleff. (Cambridge: Cambridge University Press, 1988).


discrete combinatorial systems

"If a speaker is interrupted at a random point in a sentence, there are on average about ten different words that could be inserted to continue the sentence in a grammatical and meaningful way." 85

...at least 10^20 (a hundred million trillion sentences)


proof there's no longest sentence

"The number N of different strings of length L formed from a repertoire (an alphabet or vocabulary) of A alternative units is N = A to the L" 

Note that this is independent of any recursion; it is simply the number of possible signals of a given length L that can be built from A elements. And since you can always add another word to the longest sentence so far, there is no longest sentence.
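The arithmetic behind these figures can be checked with a minimal sketch. The function names are made up for illustration, and the sentence length of 20 words is an assumption inferred from the 10^20 figure (about ten choices at each of twenty positions).

```python
def count_strings(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of exactly `length` units drawn from
    `alphabet_size` alternatives: N = A ** L."""
    return alphabet_size ** length


def count_up_to(alphabet_size: int, max_length: int) -> int:
    """Number of distinct strings of length 1 through `max_length`."""
    return sum(count_strings(alphabet_size, n) for n in range(1, max_length + 1))


# About 10 choices per position, sentences assumed to be 20 words long.
twenty_word_sentences = count_strings(10, 20)
print(twenty_word_sentences == 10 ** 20)   # True

# Allowing one more word multiplies the count by A again, so no bound on
# length gives a "longest" sentence.
print(count_strings(10, 21) // twenty_word_sentences)   # 10
```

The count grows by a factor of A with every added word, which is why memorizing sentences one by one is not a workable learning strategy.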

(George Miller discusses much of this in the Human Language-1 video.)


All of the above makes it impossible to learn individual sentences the way we learn individual words/morphemes.


independence from cognition

There are both strings that can be interpreted which are not English and strings that seem English but have no plausible interpretation (pp. 87-8). (This is a complex issue with a range of possibilities.)


(As I (JL) have suggested in various places, this independence from cognition can be seen in aphasics such as the Iowa lady.  It also may be part of the creativity of human minds in that grammatical relations may force unthinkable combinations of concepts.)

possible models for human language

finite state grammars ("word chains", p.91) are wrong!


Even though they can generate an infinite number of sentences, they don't have the structure of human languages. You cannot, for example, define the subject of a sentence in terms of specific words and word order, as in "The subject of any sentence is the N-th word." You need a "tree diagram" or parentheses to even begin...

1. probabilities not relevant

Learning sentences is not just learning word orders, i.e. what word follows what word.  These 'finite state' grammars are not good models for human languages.
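To make the "word chain" idea concrete, here is a minimal sketch of a finite state device. The states and the tiny vocabulary are invented for illustration; nothing here comes from Pinker's own diagrams.

```python
import random

# A toy "word chain": each state lists the words it can emit and the state
# each word leads to. States and vocabulary are hypothetical.
CHAIN = {
    "START": [("the", "NOUN"), ("a", "NOUN")],
    "NOUN": [("happy", "NOUN"), ("girl", "VERB"), ("dog", "VERB")],
    "VERB": [("sleeps", "END"), ("eats", "END")],
}


def generate(rng: random.Random) -> list[str]:
    """Walk the chain from START to END, choosing each next word at random."""
    state, words = "START", []
    while state != "END":
        word, state = rng.choice(CHAIN[state])
        words.append(word)
    return words


def accepts(words: list[str]) -> bool:
    """Return True if the chain can emit exactly this word sequence."""
    state = "START"
    for word in words:
        transitions = dict(CHAIN.get(state, []))
        if word not in transitions:
            return False
        state = transitions[word]
    return state == "END"


print(" ".join(generate(random.Random(0))))
print(accepts("the happy happy girl sleeps".split()))   # True
```

Because "happy" loops back to the same state, this chain already generates infinitely many strings. Yet all it knows is which word may follow which; it has no way to say which words form the subject.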


Determining meaning requires determining grammatical relationships, e.g. what's the subject NP?  Who did what to whom?  This requires "structure dependent knowledge" (SDK).


(Testing aphasics' language skills often requires pitting their cognitive expectations against grammatically determined relationships, e.g. "The bird that the cat watched was hungry." Who was hungry?)


(One interesting function of language is to enable me to get you to imagine impossible scenes.  Grammatical relationships enable this imagination. See above.)


Subjects and objects of verbs cannot be defined in terms of word identity or word order.

2. an overriding plan for sentences: "trees"

Chomsky's embedded clauses (p.94), phrase structure, and recursion (phrases within phrases). (Examples: 97-99.)


Human languages do not show their tree structure directly though it is coded into word order, inflection, syntactic rules and prosody.


(Compare computer languages and mathematics, e.g. (2x(3+4))=14 vs. (2x3+4)=10.)

NP-> (det) A* N

'a noun phrase consists of an optional determiner (the, a, many..), followed by any number of adjectives, followed by a noun.' 98
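The rule NP -> (det) A* N can be read as a small recognizer. The sketch below is only an illustration: the word lists in the lexicon are made up, and a real grammar would of course need far more than this one rule.

```python
# Tiny part-of-speech lexicon; the word lists are illustrative only.
LEXICON = {
    "the": "det", "a": "det", "many": "det",
    "happy": "A", "old": "A", "big": "A",
    "boy": "N", "dogs": "N", "senator": "N",
}


def is_noun_phrase(words: list[str]) -> bool:
    """Check a word sequence against NP -> (det) A* N."""
    tags = [LEXICON.get(w) for w in words]
    if None in tags:
        return False                          # unknown word
    i = 0
    if i < len(tags) and tags[i] == "det":    # optional determiner
        i += 1
    while i < len(tags) and tags[i] == "A":   # any number of adjectives
        i += 1
    # exactly one word must remain, and it must be a noun
    return i == len(tags) - 1 and tags[i] == "N"


print(is_noun_phrase("the happy boy".split()))   # True
print(is_noun_phrase("happy the boy".split()))   # False
```

Note that the rule mentions categories (det, A, N), never particular words, so one rule covers every noun phrase of this shape.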

linguistic levels of description

Language consists of several levels, with organized elements at each level. Exactly how these levels and elements are described and interrelated is an ongoing process of research, usually with a number of different views. Pinker uses Chomsky's ideas from the 1990s in The Language Instinct. Any future theory will have to deal with the same aspects of language, such as grammatical relations, ambiguity of structures, hierarchical structures, and recursion.

ambiguous strings of words

An utterance is ambiguous when there is more than one possible structural description (SD) that the grammar can "give" to that utterance.


If there is an element at any level that can be described in more than one way, the utterance may be ambiguous.

(Recall descriptive levels include acoustics, phonetics, phonology, morphology, words, phrases, clauses, semantics, pragmatics and reference.)

Examples (test yourself): which element is ambiguous?

1. He slept on the bank.

2. She likes an ice cream (a nice scream, nice cream).

3. He likes old men and women.

4a. She saw her nephew sweating.

4b. She saw her nephew's wedding.

5. In the summer, biting flies can be disgusting.

6. The professor ordered the Dean to stop () drinking.
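For the phrase-level case (example 3, "old men and women"), the idea of "more than one structural description" can be sketched by counting parses. The three-rule toy grammar NP -> N | A NP | NP CONJ NP and the tag table are assumptions made for this illustration only.

```python
from functools import lru_cache

# Hypothetical tag table covering just the words of example 3.
TAGS = {"old": "A", "men": "N", "women": "N", "and": "CONJ"}


def count_np_parses(words: tuple[str, ...]) -> int:
    """Count structural descriptions of `words` as an NP under the toy
    grammar NP -> N | A NP | NP CONJ NP."""
    tags = tuple(TAGS[w] for w in words)

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # number of ways tags[i:j] can be analysed as an NP
        total = 0
        if j - i == 1 and tags[i] == "N":
            total += 1                        # NP -> N
        if j - i >= 2 and tags[i] == "A":
            total += count(i + 1, j)          # NP -> A NP
        for k in range(i + 1, j - 1):         # NP -> NP CONJ NP
            if tags[k] == "CONJ":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tags))


print(count_np_parses(("old", "men", "and", "women")))   # 2
print(count_np_parses(("men", "and", "women")))          # 1
```

The two parses correspond to (old (men and women)) and ((old men) and women). The word string is identical; only the grouping differs.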


Why can’t you starve in the desert?

Chomsky's X-bar theory of Universal G (UG)

The idea is to formulate a theory of G that works equally well for any human L; hence a kind of definitive statement of what a possible human L is. (See Goals above.)  In practice there are two main considerations: first, abstract general principles from specific language descriptions; at the same time, keep these UG principles "simple."

The X is to indicate a variable, where X can be V or N (and maybe prepositions and adjectives), in any language.


Needless to say, I hope, these problems have not been solved, and theories are evolving as people work on them.

parts of speech: nouns and verbs

relation between these & concepts

"There is a connection...but it is subtle and abstract..." 106. More on this in the Language acquisition chapter regarding "semantic bootstrapping": is there any truth to the idea that nouns refer to things and verbs to actions?

phrase anatomy: the head and its "roles" 108-9

(There are four aspects of this anatomy -- the head, the role players, the modifiers, the specifier).

the phrase is about the "head"  (x-bar)

combinatorial semantics (sometimes "compositional semantics")

A working assumption is that elemental components of lexical meaning are syntactically organized and fused into an overall phrase meaning.

"What the entire phrase is "about" is what its head word is about." 107

phrases refer to the interaction of "role players"

verb and NP arguments ("role-players")

"For example 'Sergey gave the documents to the spy' is not just about any old act of giving.  It choreographs three entities: Sergey (the giver), documents (the gift/given), and a spy (the recipient)."

V (NP1, NP2, NP3...NPn) e.g. [gave (Sergey, documents, spy)] where each NP plays a specific role. In inflected languages, these NPs often are marked by a specific case identifying their role as agent, object, etc...

adjuncts (modifiers)

These elements are more distant from the head. See Pinker's example: "The senator from New York from Massachusetts.."

"(((The governor) of California) (from Illinois))"

NP->Nbar + PP

Nbar-> N + PP

Subjects ("Specifiers")

"The subject is a special role player, usually the causal agent if there is one."

The general blueprint for languages

"This streamlined version of phrase structure is called the "X-bar theory." 111

"It means that the super-rules suffice not only for all phrases in English but for all phrases in all languages with one modification…"

Languages differ in ordering of components of phrases

"Trees become mobiles." 111



"The piece of information that makes one language different from another is called a parameter." 111

Learning a language is in part learning the parameters; the principles governing the structures are presumably innate (112).
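The "mobile" image can be sketched as one unordered phrase structure spelled out under two settings of a head-order parameter. The class and function names are hypothetical, the example phrase is invented, and the head-last output uses English words only as glosses for a head-final order.

```python
from dataclasses import dataclass, field


@dataclass
class Phrase:
    """An unordered phrase: a head word plus its complement phrases."""
    head: str
    complements: list["Phrase"] = field(default_factory=list)


def linearize(phrase: Phrase, head_first: bool) -> list[str]:
    """Spell out the phrase as a word string under one parameter setting."""
    comp_words = [w for c in phrase.complements for w in linearize(c, head_first)]
    return [phrase.head] + comp_words if head_first else comp_words + [phrase.head]


# VP headed by "went", containing a PP headed by "to", containing an NP.
vp = Phrase("went", [Phrase("to", [Phrase("Tokyo")])])

print(" ".join(linearize(vp, head_first=True)))    # went to Tokyo
print(" ".join(linearize(vp, head_first=False)))   # Tokyo to went
```

The same tree yields both orders; only the single boolean changes. On this view the learner's task is to set that switch, not to acquire a separate rule for each phrase type.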

The operation of words: "despots", cases, & auxiliaries

The super phrase structure rules are only part of the language machinery.  The lexicon (mental dictionary) holds information about specific morphemes that work with the specific rules for a language.


Verb despots (Pinker's expression)

Verbs tell the rules what slots need to be filled and how to fill them. Here are some example types of information in the mental dictionary (lexicon).


(Meaning is expressed in English here but presumably in Mentalese in the brain.)


Means "to eat in refined setting"
Eater = subject


Means "to declare w/out proof"
Declarer = subject
Declaration = complement clause
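One way to picture such entries is as records that the phrase structure rules consult. In the sketch below the glosses and role assignments are taken from the two entries above, but the headwords "dine" and "allege" are my assumption (the notes do not show them), and the helper functions are hypothetical.

```python
# Each entry records a gloss and which grammatical slot each role fills.
# ASSUMPTION: the headwords "dine" and "allege" are guesses; the notes give
# only the glosses and the role assignments.
LEXICON = {
    "dine": {
        "means": "to eat in refined setting",
        "roles": {"eater": "subject"},
    },
    "allege": {
        "means": "to declare w/out proof",
        "roles": {"declarer": "subject", "declaration": "complement clause"},
    },
}


def missing_slots(verb: str, filled: set[str]) -> set[str]:
    """Slots the verb demands that the sentence has not filled."""
    return set(LEXICON[verb]["roles"].values()) - filled


def extra_slots(verb: str, filled: set[str]) -> set[str]:
    """Slots the sentence fills that the verb does not license."""
    return filled - set(LEXICON[verb]["roles"].values())


print(missing_slots("allege", {"subject"}))          # {'complement clause'}
print(extra_slots("dine", {"subject", "object"}))    # {'object'}
```

A sentence is ruled out if either set is non-empty: the verb, not the general phrase rules, decides which slots must and may be filled.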


Cases indicate roles NPs play in the sentences.  Indo-European languages (IE) typically overtly mark their NPs as to case (role), e.g. subject, object.... English has lost these except in a few pronouns (he, his, him). (See IE geography.)


Auxiliaries indicate truth of propositions implicit in the sentence

They appear at the head of special phrases called sentences and apply to the entire "tree".

(Red Sox)((will(NOT))(win Series))

will: specifies the time

NOT: negates the truth value


Necessity & possibility (might, must, could...)

These modals moderate the truth value of the sentence, e.g. ((Might) (Sam has a full house)).


Complex sentences

Sentences with more than one clause are complex.  There are three major types: simple conjunctions, relative clauses, and complement clauses.  The last two are functionally different versions of conjunctions.


"The fact that Otto knew was surprising." - why is this ambiguous?


pronouns, anaphors, and coreference ("Binding")

The term "binding" refers to the conditions interrelating NPs in a sentence (and sometimes beyond, though typically this is not seen as a grammatical matter). That is, how are phrases connected to the world and to each other? Sometimes the syntax does the work; sometimes it is contextual and cognitive.


This is part of the fundamental referential apparatus of human language.


Two NPs co-refer (are coreferents) if they both refer to the same entity.

Anaphors are NPs --often pronouns-- that somehow pick up reference with something else in the sentence.


Bill cut him.   (Bill ≠ him)

Bill cut himself.  (Bill = himself)

Bill believes that Otto cut him. (Bill =him? or other? but ≠Otto)
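The pattern in these three sentences can be sketched as a rough rule of thumb: a reflexive finds its antecedent inside its own clause, while a plain pronoun must look outside it. This is a deliberate simplification with a hypothetical function name; it only tracks clause subjects and ignores the finer structural conditions a real binding theory needs.

```python
def possible_antecedents(form: str, clause_subject: str,
                         higher_subjects: list[str]) -> list[str]:
    """Names inside the sentence that `form` may corefer with.

    form            -- "reflexive" (himself) or "pronoun" (him)
    clause_subject  -- subject of the clause containing the form
    higher_subjects -- subjects of any clauses containing that clause
    """
    if form == "reflexive":
        return [clause_subject]         # bound within its own clause
    if form == "pronoun":
        return list(higher_subjects)    # free within its own clause
    raise ValueError(f"unknown form: {form}")


print(possible_antecedents("pronoun", "Bill", []))          # []
print(possible_antecedents("reflexive", "Bill", []))        # ['Bill']
print(possible_antecedents("pronoun", "Otto", ["Bill"]))    # ['Bill']
```

An empty list means the pronoun must refer to someone not named in the sentence. In the third case "him" may be Bill or some outside party, but not Otto.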

Empty categories and traces

Transformations move phrases leaving traces and gaps ()

Comprehension requires listeners to "fill" the gaps if any.

NP and wh-traces

2. The professor ordered the Dean to stop () drinking.

6a. The girl (who) Bill wanted () to leave () wore a pink dress.

6b. The girl (who) Bill wanted Joe to leave () wore a pink dress.


I want [pro] to visit him.

Why grammar? 125


"..mental software… refutation of  the empiricist doctrine that there is nothing in the mind that was not first in the senses…


Grammar is a protocol that has to interconnect the ear, the mouth, and the mind, three very different kinds of machine. It cannot be tailored to any of them but must have an abstract logic of its own."


Minimalist notes (Chomsky, 1992) Mostly in his own words.


"some basic properties of language are unusual among biological systems, notably the property of discrete infinity." p.2

"The language is embedded in performance systems that enable its expressions to be used for articulating, interpreting, referring, inquiring, reflecting and other actions.  We can think of the SD (structural description) as a complex of instructions for these performance systems, providing information relevant to their functions.. The performance systems fall into two general types: articulatory-perceptual and conceptual-intentional....Two of the linguistic levels, then, are the interface levels A-P and C-I, providing the instructions for the articulatory-perceptual and conceptual-intentional systems, respectively.

"Another standard assumption is that a language consists of two components: a lexicon and a computational system.  The lexicon specifies the items that enter into the computational system, with their idiosyncratic properties.  The computational system uses these elements to generate derivations and SDs

"UG is concerned with the invariant principles of S0 (the initial state) and the range of possible variation.  Variation must be determined by what is "visible" to the child acquiring language (the primary linguistic data)....Constructions such as verb phrase, relative clause, passive, etc., remain only as taxonomic artifacts, collections of phenomena explained through the interaction of the principles of UG, with the values of parameters fixed." p.5


Final comment from Limber

(JL) The above implies that phrase structure itself is derived and that English rules like VP -> aux + V + (NP) are not necessary parts of linguistic knowledge but instead follow from UG, the English lexicon, and specific English "parameters."  For example, if X is the head of X' (e.g. N is the head of NP), then in English X generally precedes its complements.