This long note discusses many controversial issues! (JL)

From sarich@qal.Berkeley.EDU Fri Apr  1 06:24:38 1994
Date: Thu, 31 Mar 1994 16:03:35 -0800
From: Prof Vince Sarich 
Subject: Race and Language in Prehistory

Translation of funny letters R=open quote;S=close quote;Q=dash;U=apostrophe
quotes also enclosed by <   >.  I don't know how the table will come over,
but the idea is that the 5 #s following each fossil are the distances to
the 5 reference populations.  Comments of course greatly appreciated




I argue here that all the available data on Homo sapiens (molecular, 
morphological, linguistic, cultural) are most readily interpreted within the 
framework of a phylogenetic tree that links extant human populations over a 
time span of no more than the last 15,000 to 20,000 years.  This is not to 
suggest that some ur-population speaking an ur-language lived in a 
geographically restricted Garden of Eden 15,000 years ago, expanding out of 
there to lead to what we have today.  Instead, the scenario envisioned here 
goes to quite the other extreme in envisioning our "Garden of Eden" as the 
entire inhabited world of that period.   I suggest that as recently as perhaps 
15,000 years ago the human population was something very close to "panmictic" 
at all levels, and that most of the interpopulational differences we observe 
today, and in the recent past, have accumulated since then.  The proposed 
"panmixis" is seen as driven by the last of the glacial pulsations which would 
have necessitated recurrent large-scale movements of populations, not only in 
areas "directly" affected by the glaciers themselves, but also in those that 
suffered the secondary effects of shifting climatic zones and major sea level 
changes.  It thus must have been essentially world-wide, and only after 
populations began to settle down in more-or-less their current areas could 
regional differentiation have begun again.  Thus we would have had 
episodic, glacial cycle driven, regional (racial) differentiation subsequent to 
the expansion of Homo out of Africa, and concomitant episodic obliteration 
("panmixis") of most or all of the regionality.   We then simply appear to be 
living in one of those episodes of regional differentiation, with ours 
beginning with the last glacial retreat.  These episodes of developing 
regionality would have been characterized by differential retention of portions 
of the existing variation (which would have been, just as today, substantial - 
but basically intrapopulational) plus in situ developments.  The degrees of 
past regionality achieved would, then, presumably, have been strongly 
correlated with the lengths of the glacial/interglacial cycles involved, and 
thus potentially much greater than that present today.
That is the model; what follows is its genesis, development, and testing.


Cavalli-Sforza, Piazza, Menozzi, and Mountain recently proposed (1988) that 
there is a general congruence between gene-based and language-based trees 
linking extant human populations. This proposal has engendered a great deal of 
controversy, in particular from a group of workers based at the Smithsonian 
(O'Grady et al, 1989; Bateman et al, 1990).  Cavalli-Sforza et al have 
responded (1989, 1990).  It seems to me that, as is so often the case in 
situations of this sort, the two groups have been talking past one another; and 
that their dialogue has had the effect, as again is so often the case, of 
focusing our attention, and theirs, on secondary details, and diverting it from 
the basic issues involved.  What follows is an attempt to get at those basic 
issues.  The most important point to recognize is that this particular dialogue 
is not new.  The questions raised and addressed in it have long intrigued 
students of human evolution, and a substantial consensus as to many of the 
answers has been present for some time now.  First, it has been self-evident 
since long before we had any actual gene frequency data that populations can be 
closer (that is, share more recent common ancestry) linguistically than 
genetically (for example, Native American and European-derived Spanish speakers 
in the New World); and also that they can be closer genetically than 
linguistically (for example, speakers of Austronesian and Papuan languages in 
New Guinea).  But it has also been self-evident to most students of human 
evolution that such cases must be the exceptions rather than the rule, though 
the Smithsonian group apparently disagrees. This is evident in their more 
recent and comprehensive effort (Bateman et al, 1990; especially pp 8-11), and 
at the end of their letter to Science (O'Grady et al, 1989), where they state 

The response to this statement has to be, "What evidence?"  The Smithsonian 
group presents no documentary support for their position, and, indeed, it is 
difficult to imagine how there could be any. That is, given that any 
differentiation among populations (genetic, linguistic, cultural) implies 
actual physical separation among them, there is going to have to be an 
appreciable congruence among the pattern of relationships implied by each 
variable.  Even today two populations more similar anatomically to one another 
than either is to a third are also much more likely to be more similar 
genetically, culturally, and linguistically, and this would have even more 
often been the case in the past.  Thus the position of Cavalli-Sforza et al - 
that there is a general isomorphy of gene and language trees - must be seen as 
something close to a null hypothesis; that is, as being much closer to being a 
reasonable working assumption than a data-based conclusion.  This has long been 
evident.  One of the greatest of all students of our species, A L Kroeber, 
pointed out many years ago that: 

But the general isomorphy between language and genetic trees noted by 
Cavalli-Sforza et al has long been apparent, implying that "the two move 
together" far more often than they move separately - a point on which Kroeber 
was quite clear: 

This aspect of the Smithsonian group's argument would then have to be regarded 
as untenable as well as undocumented, and any attempt to validate it as very 
likely an exercise in futility.  But such a judgment tends to deflect attention 
from what is very likely the real source of these objections.  What has passed 
apparently unnoticed about the Cavalli-Sforza et al scenario is that it 
requires an enormous, and completely undocumented, linguistic extrapolation - 
from the roughly 7,000 years or so over which linguists tend to agree that 
language relationships can be traced, to the 100,000 or so years that date 
the root of their tree. While the extrapolation is enormous and undocumented 
(and, very likely, undocumentable even if correct), it does at first glance 
seem to be required by the apparent congruence of the gene and language trees 
in the context of the dates provided by the molecular, paleontological, and 
archeological evidence.

It is at this point that the fairly narrow dispute just introduced can be put 
into a much broader and more relevant context.  We, professionals and lay 
public alike, have a fascination with our past.  Among other things, we want to 
know when, where, and under what conditions, people like ourselves first 
appeared, and we want to know about the origins, history, extent, and 
functional significance of racial (or, if a less loaded term is desired, 
regional) variation within our species.  Recent years have seen a marked 
resurgence of professional interest in these matters (e.g., Smith and Spencer, 
1984; Mellars and Stringer, 1989; and many others).  That interest has spawned 
a number of interesting and influential hypotheses and scenarios, two of which 
will be addressed here.

First, there is a developing consensus as to the coexistence of two distinct 
Homo sapiens lineages ("Neandertals" and "anatomically modern Homo sapiens") 
from sometime before 100,000, to about 30,000 years ago (see Mellars and 
Stringer, 1989 for a recent survey of the evidence and arguments).  The main 
problem with this scenario is that the evidence supporting it is entirely 
morphological; that is, there is general, though hardly unanimous, agreement 
that the human fossil remains of the period sort reasonably cleanly into two 
groups.  I do not wish to take issue with that judgment at this point.   My 
problem is with the fact that the two proposed lineages have left no record 
whatever of any cultural distinctiveness.  We cannot identify stone-tool 
assemblages as deriving from "Neandertal", as compared to "anatomically modern 
Homo sapiens", occupations.  Thus we are being asked to envision two 
populational (in effect, racial) lineages of one human species sharing the same 
stone tool cultures in the same areas - and yet maintaining their genetic 
identities for more than 70,000 years.  I suggest that - absent the cultural 
element - we could not take such a scenario seriously for any other species, 
and adding in the cultural element for our own species makes the scenario even 
less plausible.  This argument is hardly novel, and yet it has clearly not been 
pressed nearly enough, remaining, in the main, unaddressed and unrefuted.  The 
(1) presence of anatomically modern humans in sites currently dated to some 
100,000 years ago in both South Africa (Klasies River) and the Near East, (2) 
prior to the appearance of "classic" Neandertals in Europe tens of millennia 
later, (3) along with the disappearance of the latter by 35,000 years ago, and 
(4) the presence only of the former since, has made it impossible for most 
students not to see the presence of two distinct genetic lineages - one 
connecting pre- and post-Neandertal anatomically modern humans, and the other 
composed of the various Neandertals.  The fact of a total lack of anatomically 
modern humans contemporaneous with the latter has basically been ignored - a 
manifestation, perhaps, of the "absence of evidence/evidence of absence" 
problem.

The minority who would tend to accept the doubts raised in the preceding 
paragraph, and also to see greater anatomical continuity between Neandertals 
and anatomically modern humans, envision a quite different scenario.  In it 
(generally referred to as the "regional continuity" model) Neandertals simply 
represent a phase connecting Homo erectus and modern Homo sapiens, so that 
there are appreciable, and somewhat separate, genetic continuities between 
non-sapiens and sapiens in different areas of the world.  In this scenario, 
then, modern Asians, for example, are more similar genetically to Asian erectus 
than to African erectus.  I have long been an adherent of this view (Sarich, 
1971).  I no longer am.

The reasons for my rejection of the "regional continuity" model are two-fold.  
First, it seems to me that the mitochondrial DNA data make it untenable.  If 
regional continuity were a fact, then we should expect to see ancient, and 
region-specific, mitochondrial lineages (clades) in several areas of the world. 
 But, in fact, this situation characterizes only sub-Saharan Africa (and even 
that is currently in doubt), and not Europe, nor the Far East, nor 
Australia/New Guinea (Cann, Stoneking, and Wilson, 1987).     

Second, the regional continuity model would predict that the morphological 
distances separating the most distinct modern Homo sapiens populations from one 
another should be greater than those separating the latter from their 
"Neandertal" precursors.  In other words, the morphological distance between, 
for example, Europeans and native Australians, should be greater than between 
Europeans and European Neandertals.  But this is by no means the case.  Using 
the size-corrected morphological distance metric discussed below, we find that 
in fact the latter distances are about twice the former.  Howells (1973, 1989), 
using Mahalanobis distances, has obtained similar results. 

This rejection of the regional continuity model should not lead to acceptance, 
by default, of the only apparent alternative.

I have come to see that both of the above scenarios are fundamentally flawed, 
and that reality lies outside the range of possibilities they represent.   

I suggest here that we need to look again at the evidence, and argue, as noted 
in the introductory paragraph, that this second look tells us that all the 
available data (molecular, morphological, linguistic, and cultural) are far 
more readily interpretable within the framework of a tree that links extant 
human populations over a time span no greater than 15,000 to 20,000 years, and 
certainly not the 100,000 or so which characterizes so many recent discussions, 
including that of Cavalli-Sforza et al.


The only direct evidence as to the antiquity of populational lineages resides 
in the fossil record, and thus I here ask it to tell us when individuals found 
in a given area begin to have a reasonable probability of being more similar to 
modern individuals found there than to those found in other areas.  We might, 
for example, attempt to place early Upper Paleolithic European-area fossils on 
a tree of extant human populations derived from the data of Howells (1973, 
1989).  One would expect that if the 100,000 year time scale of Cavalli-Sforza 
et al were correct, then these 12,000 - 30,000+ year-old Europeans should fall 
on the European clade of the Homo sapiens tree, and the literature tends to 
imply that this is actually the case; in other words, that these are 
anatomically, as well as geographically, European - that is, more closely 
related to modern Europeans than to modern Asians or modern Africans.  That, as 
will be demonstrated, is simply not the case, though the demonstration is 
neither easy nor straightforward.

It would appear that at least three criteria must be satisfied before one can 
address questions of this sort with any degree of justified confidence.  First, 
the algorithm to be used must be able to place known individuals into their 
appropriate populations or areas with some reasonable degree of reliability.  
This is obvious, for if it doesn't work reliably for knowns, there can be no 
rationale for using it on unknowns.  Second, it should be able to take a random 
sample of individuals from known groups and reconstitute those groups without 
previous knowledge of their characteristics. In other words, our algorithm 
should have a reasonable robustness with respect to assessing the affinities of 
individuals when compared to other individuals, and not simply to known 
populations.  This is necessary because human fossils are almost always found 
as individuals, and obviously do not belong to extant populations.  This latter 
point leads to the third, and more subtle, requirement.  If a fossil cannot be 
viewed as a member of an extant population, then it can only be tested for 
placement on, or proximity to, a lineage leading to one or more of the latter.  
But this means that a simple similarity criterion will not do, as showing that 
fossil X is "most similar" to extant population Y (or to individuals from Y) is 
without phylogenetic significance until the question of amounts of change along 
the various extant lineages is taken into account.  That is, fossil X may be 
more similar to population Y than to population Z simply because less change 
has taken place along the Y lineage, and not because X was part of the Y 
lineage more recently than it was part of the Z lineage.  This last problem is 
one that does not appear to have ever been recognized, never mind addressed, in 
the relevant literature; nor can it be said that the other two requirements 
have generally been adequately addressed or met in that literature.
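
The third requirement can be illustrated with a toy calculation.  What follows 
is only a sketch: the tree, the branch lengths, and all names in it are 
invented for illustration and are not taken from the data discussed here.  It 
assumes a simple additive tree in which distance is the amount of change along 
the path joining two units.

```python
# Toy additive tree (all numbers are invented for illustration).
# A common ancestor A gives rise to two lineages:
#   A --(1 unit of change)--> modern population Y   (slowly changing lineage)
#   A --(4 units of change)--> modern population Z  (rapidly changing lineage)
# Fossil X sits ON the Z lineage, 1 unit of change after the split.

CHANGE_A_TO_Y = 1.0
CHANGE_A_TO_Z = 4.0
CHANGE_A_TO_X = 1.0  # X lies along the A -> Z branch


def fossil_distances(a_to_y, a_to_z, a_to_x):
    """Return (X-to-Y, X-to-Z) path-length distances on the toy tree."""
    x_to_y = a_to_x + a_to_y  # back up to A, then down the Y branch
    x_to_z = a_to_z - a_to_x  # the remainder of the Z branch
    return x_to_y, x_to_z


x_to_y, x_to_z = fossil_distances(CHANGE_A_TO_Y, CHANGE_A_TO_Z, CHANGE_A_TO_X)
print(f"X to Y: {x_to_y}, X to Z: {x_to_z}")
```

In this toy case X is closer to Y (2 units) than to Z (3 units) even though X 
lies on the Z lineage, simply because Y has changed less since the split.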

Any such effort today necessarily begins with the unique and invaluable body 
of data gathered by Howells (1973, 1989), which he has generously made 
available on disk.  Thus interested workers can now work directly from a large 
number of individual measurements made by one person on more than 2,000 skulls 
from, in the main, known populations.  I have developed the following approach 
for satisfying the three criteria just noted.  I make no claim here that this 
is the best possible "algorithm", nor am I especially satisfied with its 
elegance.  It does, however, have the virtue of working; that is, of satisfying 
those 3 criteria.

There would appear to be at least 4 basic considerations here.  First, which, 
and how many, characters are to form the basic data set?  Second, what 
size-correction procedure is to be used?  Third, how are the size-corrected 
data to be compared?  Fourth, how are affinities to be judged?  Now there is 
obviously a voluminous, and often highly contentious, literature addressing 
each of these matters, and I will here only provide a brief justification for 
the choices I have made.

The  measurements used here [#s 2, 4, 5, 6, 8, 9, 11, 13, 14, 17, 18, 22, 23, 
24, 25, 32, 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 51, 54  from 
Howells, 1973] were chosen as (1) being the most discriminating among modern 
individuals and populations, (2) giving substantial attention to the face as 
well as the cranium, (3) less likely to be missing in the fossil specimens, and 
(4) more objectively measurable.   Size-correction was achieved by dividing the 
breadth measures (#s 5, 6, 8, 9, 11, 17, 18, 22, 24) by the mean of cranial 
length and height; the length measures (#s 2, 23, 25, 32, 33, 51, 54) by the 
mean of cranial height and breadth; the height measures (#s 4, 13, 14, 35) by 
the mean of cranial length and breadth; and the cranial vault measures (#s 
39-47) by cranial length. Then the variation in the contributions to be made by 
the various measures to the overall distance was reduced by converting each 
size-corrected measurement to a z-score (using a panel of 105 recent specimens 
drawn from 21 populations to provide the reference means and standard 
deviations). The distances (inter-individual, individual to group, or 
inter-group) were then calculated as the average z-score per size-corrected 
measurement. For this study, all 30 measurements, treated in exactly the same 
manner, with each contributing equally to the final result, were used for all 
comparisons (except, of course, when some were missing for certain of the 
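
The size-correction, z-score, and distance steps described above might be 
sketched as follows.  This is a minimal illustration under stated assumptions, 
not the program actually used: the dictionary keys "length", "breadth", and 
"height" are hypothetical names for the overall cranial dimensions, the 
distance is assumed to be the mean absolute difference in z-scores over the 
measurements both units possess, and the sample standard deviation is assumed 
for the reference panel.

```python
from statistics import mean, stdev

# Measurement numbers (Howells, 1973) grouped as in the text.  Measurement 37
# is in the list of 30 but is not assigned to a group in the text, so this
# sketch leaves it out of the comparison.
BREADTH = [5, 6, 8, 9, 11, 17, 18, 22, 24]
LENGTH = [2, 23, 25, 32, 33, 51, 54]
HEIGHT = [4, 13, 14, 35]
VAULT = list(range(39, 48))  # 39 through 47


def size_correct(skull):
    """Divide each measurement by the size divisor given in the text.

    `skull` maps measurement numbers to values and must also carry the
    hypothetical keys "length", "breadth" and "height".  Measurements that
    are missing from `skull` are simply skipped.
    """
    length, breadth, height = skull["length"], skull["breadth"], skull["height"]
    divisors = {}
    divisors.update({m: (length + height) / 2 for m in BREADTH})
    divisors.update({m: (height + breadth) / 2 for m in LENGTH})
    divisors.update({m: (length + breadth) / 2 for m in HEIGHT})
    divisors.update({m: length for m in VAULT})
    return {m: skull[m] / d for m, d in divisors.items() if m in skull}


def reference_stats(panel):
    """Per-measurement mean and sample SD over a panel of corrected skulls.

    Each measurement needs at least two panel values for the SD to exist.
    """
    measures = set().union(*panel)
    means = {m: mean(s[m] for s in panel if m in s) for m in measures}
    sds = {m: stdev(s[m] for s in panel if m in s) for m in measures}
    return means, sds


def z_scores(corrected, means, sds):
    """Convert size-corrected values to z-scores against the reference panel."""
    return {m: (v - means[m]) / sds[m] for m, v in corrected.items() if m in means}


def distance(z_a, z_b):
    """Mean absolute z-score difference over the measurements both units have."""
    shared = z_a.keys() & z_b.keys()
    return mean(abs(z_a[m] - z_b[m]) for m in shared)
```

Because `distance` averages only over shared measurements, a fossil lacking 
some of the 30 values can still be compared, at the cost of a less stable 
estimate.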

The next step here is to gain some measure of affinity from these 
size-corrected and range-adjusted data. The major factor militating against 
simply using these distances directly to reliably link single individuals with 
their appropriate known populations (criterion 1), or for the more difficult 
task of linking individuals from the same population when one doesn't know the 
population (criterion 2), is the fact that a morphologically "primitive" 
population; that is, one closer to the base of the Homo sapiens tree, would 
show greater affinities on the basis of shared primitive conditions, and such a 
result would have no phylogenetic relevance (criterion 3).  The importance of 
this matter can be assessed by calculating the distances among Howells' 1989 
sample of 29 human populations (plus the Tierra del Fuegans I measured in 
London and Vienna) and then seeing if any of them shows a significantly smaller 
average distance to the others, thus indicating much less change along its 
lineage.  We find that the Norse, Zalavar, Hokkaido, and Ainu samples are, by 
this criterion, least derived, showing about 25-35% less change than the 
average.  The Buriats (because of their combination of extreme cranial and 
facial breadth, and facial length) show far and away the greatest amount of 
change (some 75% above the average), with the Eskimo and South Australians next 
(about 30% above the average).  This, given the lack of a tree-drawing program 
that does not assume anything about rates of change and can still deal with a 
large number of individuals, tells us quite clearly that rate-correction prior 
to calculation of affinities is necessary.
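
The amount-of-change comparison just described can be sketched in a few lines. 
This is an illustration only: the three populations and their distances are 
invented, and dividing a distance by the population's relative-change index is 
merely one assumed way of doing the rate adjustment, since the exact form of 
the correction is not spelled out here.

```python
from statistics import mean


def relative_change(dist):
    """Mean distance from each population to the others, relative to the
    average of those means (1.0 = an average amount of change)."""
    avg = {p: mean(d for q, d in row.items() if q != p) for p, row in dist.items()}
    overall = mean(avg.values())
    return {p: a / overall for p, a in avg.items()}


def rate_adjust(distances, rel):
    """Assumed correction: divide the distance to each population by that
    population's relative-change index."""
    return {p: d / rel[p] for p, d in distances.items()}


# Invented symmetric distances among three hypothetical populations.
dist = {
    "A": {"B": 1.0, "C": 2.0},
    "B": {"A": 1.0, "C": 3.0},
    "C": {"A": 2.0, "B": 3.0},
}
rel = relative_change(dist)

# A hypothetical fossil that is equally distant from all three populations.
fossil = {"A": 1.5, "B": 1.5, "C": 1.5}
adjusted = rate_adjust(fossil, rel)
print(rel)
print(adjusted)
```

In this toy case population A is the least derived (index 0.75) and C the most 
(index 1.25), so a fossil that is equidistant from all three before correction 
ends up closest to C after it.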

The simplest such procedure, given a matrix of z-scores among the units in 
the sample, is to calculate the correlation coefficients among the columns of 
z-scores.  Slightly more resolving power is sometimes obtained - in the 
situation where one is comparing against known populations - by converting the 
z-scores to distances, and rate-adjusting each column of distances (that is, 
for example, reducing the Buriat distances and increasing the Norse ones). This 
addresses criterion 3.

These simple approaches seem to provide remarkably robust results.  I set up 
a test sample of 50 individuals [10 from each of 5 widely-separated populations 
(Norse; Zulu; Tolai from New Britain; Anyang of Bronze Age China; Santa Cruz 
Island, California)] in Howells' sample, the 30 populations just noted, and 33 
fossils.  Criterion 1 was addressed using each of the approaches outlined above 
to compare the 50 individuals with their population means.  Simply using the 
correlation coefficients among the columns of z-scores placed 44 of the 50 
individuals into their correct population.  The 6 misplacements were: one Norse 
marginally with the Zulu and another with the Tolai, one Zulu with the Tolai 
and another with the Santa Cruz, and 2 Santa Cruz with the Anyang.  Using 
distances corrected for amount of change differences among the populations 
involved gave 2 marginal misplacements: the same 2 Santa Cruz were closer to 
the Anyang than the Santa Cruz by ~0.08 SD; and 3 no-decisions: one Norse was 
equidistant between Norse and Zulu, one Zulu between Zulu and Tolai, and 
another between Zulu and Santa Cruz.  These results would appear to satisfy 
criterion 1 at least as well as might be desired.
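
The correlation-based test of criterion 1 can be sketched as follows.  This is 
a minimal illustration, not the procedure as actually run: the z-score 
profiles and the population names "P1" and "P2" are invented, and the 
individual is simply assigned to the population mean with which its profile 
correlates most highly.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def assign(individual, population_means):
    """Return the best-correlated population and all correlation scores."""
    scores = {name: pearson(individual, profile)
              for name, profile in population_means.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Invented z-score profiles over four measurements (illustration only).
population_means = {
    "P1": [1.0, -1.0, 0.5, -0.5],
    "P2": [0.5, 0.5, -1.0, 0.0],
}
individual = [0.8, -1.2, 0.6, -0.2]

best, scores = assign(individual, population_means)
print(best, scores)
```

A "no-decision" of the kind reported above would correspond to two populations 
having nearly equal scores; a threshold for calling a tie would have to be 
chosen separately.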

Criterion 2, involving the inherently more difficult task of forming the 
populations without prior knowledge of their characteristics, was addressed as 
follows.  First, the same 50 individuals were used, but the z-scores were 
calculated internally.  Correlation coefficients were then calculated among the 
50 columns of z-scores.  Simple inspection gave 5 obvious units containing 37 
of the 50 individuals.  Means for each of those 5 units were then calculated 
and the remaining 13 individuals compared against them.  A further 8 then fell 
into their correct units, 2 fell into a wrong unit (one Anyang with the Santa 
Cruz, and one Tolai with the Zulu), and 3 straddled 2 units (one Norse between 
Norse and Santa Cruz, and 2 Zulu between Zulu and Tolai).  Next, the same 
procedure was followed using the first 25 individuals in each of two of 
Howells' Amerind populations, the Arikara and Santa Cruz.  This time 3 Arikara 
fell with the Santa Cruz, and 3 Santa Cruz with the Arikara.   Again, these 
results would appear to be more than satisfactory.

The fossil individuals (listed in Table 1) were then tested in similar 
When Cromagnon through Taforalt were compared against the panel of 50 recent 
individuals, 4-5 were closest to the Norse, 6-7 to the Santa Cruz, 2-3 to the 
Zulu, 2 to the Tolai, and 1 to the Anyang.  Qafzeh 6 was most similar to the 
Santa Cruz, Qafzeh 9 to the Zulu, and Cohuna and Upper Cave 101 to the Tolai.   

When compared against the 5 modern populations using 
amount-of-change-corrected distances, only 2 (CroMagnon and Candide 1) were 
closest to the Norse, and both associations were marginal.  3 others straddled 
the Norse and one of the other 4 (Mladec with the Zulu, Candide 5 with the 
Tolai, and Afalou 10 with the Santa Cruz).  4 fell with the Tolai (Kostenki 14, 
the Oberkassel male, and the 2 Taforalt).  4 fell with the Santa Cruz (Pataud, 
Oberkassel female, one of the Candide, Afalou 32), and Chancelade and Afalou 9 
went strongly with the Anyang.  Predmost straddled the Tolai and Santa Cruz, 
and Afalou 29 was effectively equidistant from all 5.

 The conclusion, then, is simple: Upper Paleolithic European-area fossils do 
not show any marked tendency to "look European".

This finding is reinforced by asking about the affinities shown by the fossils 
to one another.  The answer is that these are very similar in degree to those 
shown in the panel of 50 modern individuals when comparing individuals in one 
population to those in the other 4; that is, for example, any of the Zulu to 
the 40 non-Zulu.  The only real, and not unexpected, exception to this pattern 
is that the 3 Candide individuals tended to associate relative to most of the 
It must be noted now that much of this is simply putting a quantitative gloss 
on judgments made long ago on a "look-see" basis.  Various of the European-area 
fossils have long been seen as "Eskimoid", or "Negroid", or "Australoid" - 
particularly when race was a far more important variable than it is today, and 
when the notion of "pure races" was still more or less viable.  But that 
scenario would have proto-Africans, proto-Asians, proto-Europeans, and 
proto-Melanesians, all co-existing as distinct populations some 15,000 to 
35,000 years ago in one tiny corner of the world during the height of the last 
glacial.  No modern scholar could seriously entertain such a view, but its 
rejection explains neither the degree of cranial and facial variation present 
nor the apparent affinities with diverse modern populations.

What does, I argue, is the scenario outlined in the opening paragraph of this 
article.  Obviously the major feature of that scenario is that the regional 
("racial") differentiation which has resulted in the varieties of humans 
populating the Earth today must be very recent; that it is, in the main, the 
result of changes which have occurred over the last 10,000-15,000 years. This would 
suggest that the human face and cranium are remarkably plastic with respect, 
one supposes, to local conditions - thus we can get the remarkable, presumably 
convergent, similarities between the Moriori (from the Chatham Islands just 
east of New Zealand) and the Arikara, as well as those between Peruvians and 
Europeans; as well as very rapid differentiation.  For example, the most 
distinct (in a "primitive" direction) of all recent Homo sapiens were, almost 
certainly, the Ona (Selk'nam) of Tierra del Fuego (Gusinde, 1939).  I recently 
had the opportunity to obtain measurements on some 30 putatively Ona and 
Alakaluf specimens in London and Vienna, and the resulting distances between 
them and the Arikara and Santa Cruz can be larger than those between, for 
example, some European and African populations.  This then gives one some sense 
of how rapidly the evolutionary process can produce "interracial" distances 
among human populations, given that the Tierra del Fuegan to Arikara/Santa Cruz 
distances have arisen over less than (possibly much less than) 10,000 years; 
thus there is no reason to require much more time to produce the total range of 
cranial and facial variation we see today within our species. 

Finally, in the fossil realm, it might be appropriate to consider what light 
these exercises shed on the "Neandertal question".  In the current consensus, 
as already noted, the Neandertals are seen as part of a lineage separate from 
that to which all anatomically modern Homo sapiens, from Qafzeh onward, belong. 
 The basic reason for this is that they are judged as "too different", a 
judgment only rarely supported by actual data.  Now it is true that, using the 
metric described above, one does find the morphological distances between the 
Neandertals and ourselves to be substantially greater (actually, they are, on 
the average, about twice as large) than those separating most pairs of extant 
human populations.  But while true, it may not be especially germane in light 
of the fact that it tends to ignore both the amounts of time available to 
produce the observed differentiation and the range of morphological distances 
separating extant human populations.  First, it has to be noted that the 
various Neandertal specimens compared here are by no means equally distant 
from the various extant populations in our comparison sample.  The Neandertals 
(except for Shanidar, which is pretty much equidistant among Norse, Zulu, and 
Santa Cruz), Skhul 5 and Irhoud 1, as well as earlier specimens like Kabwe, 
Petralona, and Steinheim, are all much more similar to the Santa Cruz and 
Tierra del Fuegan populations, in particular, than to other moderns (the mean 
difference is about 0.5 SD; e.g., Ferrassie to Norse = 1.64 SD; to Santa Cruz 
= 1.23 SD).  In addition, it is quite possible to exceed Neandertal-modern 
distances within the modern sample - and this is without appealing to 
exceptional individuals.  That this point has not really been made previously 
is likely due to the fact that the largest modern differences do not involve 
"more primitive" populations; that is, ones who are more like the Neandertals, 
but in fact are between the Buriats and Teita, where the mean pair-wise 
non-rate-corrected distance between 54 Buriats and 32 Teita is 1.47 SD.  This 
compares to 1.26, 1.32, and 1.50 between Ferrassie and, respectively, 15 
Moriori, 24 Tierra del Fuegans, and 23 Norse.  Given figures such as these, it 
is difficult to see why the Neandertals can't be seen as just another regional 
variant ("race") of Homo sapiens, and why the term "anatomically modern Homo 
sapiens" should be retained at all.  These numbers would suggest that there is 
no better reason for excluding the Neandertals from "anatomically modern Homo 
sapiens" than there would be for doing the same to the Buriats. 

A similar scenario would also appear to apply in the linguistic realm, but to 
see it we first need to challenge the extremely conservative current consensus 
among most linguists that relationships among languages that diverged more than 
perhaps 7,000 - 8,000 years ago are, at present, unknowable.  A simple exercise 
suffices here to show that this consensus is unreasonably pessimistic.  One 
simply sits down with, for example, Buck's A Dictionary of Selected Synonyms in 
the Principal Indo-European Languages, a basic word list, and some independent 
knowledge of two or more languages representing distinct Indo-European groups.  
I used English and Croatian, representing, respectively, its Germanic and 
Slavic branches.  If one then asks what proportion of the words in modern 
Croatian appear, simply by inspection (but allowing for some phonetic and 
semantic drift), to be cognate with the reconstructed Proto-Indo-European (PIE) 
form (or, where that is unavailable, with the English word), one gets a minimum 
figure of about 60%.  For example, snow, snjeg, *sneigwh;  many, mnogo, 
*monogho; blood, krv, *kru; tree/wood, drvo, *dru;  earth, zemlja, *ghem.  
Similar results were obtained using native speakers of Spanish and Bengali, and 
for Armenian and Albanian using Decsy's The Indo-European Protolanguage: a 
Computational Reconstruction. Thus 60% survival seems to be a reasonably 
representative figure for the survival of PIE roots with meanings in extant 
Indo-European languages.

Now obviously some number of these matches will be coincidental (though that 
number will likely be small, as illustrated by the fact that Chinese, by the 
same test, will show less than 10% apparent "cognacy" with PIE, English, or 
Croatian - I am indebted to Dr W S-Y Wang for this comparison), but, by the 
same token, some will be missed when the degree of phonetic or semantic change 
makes cognacy less than obvious.  For example - foot, noga, *ped - where one 
might miss the English correspondence because of the phonetic changes, and 
would (and, perhaps, should) certainly miss the Croatian unless one remembered 
that "pod" in Croatian means under, and that an association between "under" and 
"foot" is perfectly reasonable. This would imply a cognacy loss of less than 
10% per millennium along a lineage, implying that even at a time depth of 12,000 
- 14,000 years; that is, twice the probable time which separates modern 
Croatian from its Proto-Indo-European ancestor, one might retain 30% or so 
phonetic/semantic cognacy.  Thus one could recognize relationships among 
languages whose common ancestor lay that far in the past provided one looked at 
a sufficient number of them, and avoided simple binary comparisons.  That is, 
if each of two descendant languages retains 30% cognacy with the ancestral 
language, they will, on the average, share only 9% [(0.3)^2] with one another, 
and this gets into the chance area of similarity.  On the other hand, if you 
look at 10 such languages, three, on the average, will retain a particular 
cognate Q greatly increasing your chances of recognizing relationships among 
them, and of reconstructing the ancestral form.  This is the procedure and 
argument of Greenberg [(1987) Q see also discussion in Ruhlen (1987)], and, 
whatever the questions that might be raised about certain details, there can be 
no doubt that the current general consensus among most linguists, that 
relationships among languages older than about 7,000 years are, at present, 
unknowable, is unrealistically and unreasonably pessimistic and conservative.
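
The arithmetic behind these figures can be checked with a minimal sketch. It assumes a constant proportional loss per millennium (an exponential-decay simplification that the text does not state), takes 6.5 millennia as an illustrative PIE time depth (half of the 12,000-14,000 year range), and treats retention in different languages as independent. The function and variable names are hypothetical.

```python
# Sketch only: assumes constant proportional cognate loss per millennium
# (exponential decay) and an illustrative PIE time depth of 6.5 millennia.

def implied_loss_rate(retained_fraction, millennia):
    """Per-millennium loss rate that leaves `retained_fraction` after `millennia`."""
    return 1.0 - retained_fraction ** (1.0 / millennia)

def retention(loss_per_millennium, millennia):
    """Fraction of cognates still recognizable after `millennia`."""
    return (1.0 - loss_per_millennium) ** millennia

PIE_DEPTH = 6.5             # millennia; assumed, half of 12,000-14,000 years
OBSERVED_RETENTION = 0.60   # the 60% figure from the inspection test

rate = implied_loss_rate(OBSERVED_RETENTION, PIE_DEPTH)
at_double_depth = retention(rate, 2 * PIE_DEPTH)

# The text rounds retention at the doubled depth to "30% or so".
p = 0.30
shared_by_pair = p * p      # both of two languages keep the same item (independence assumed)
expected_in_ten = 10 * p    # average number of ten languages keeping the item

print(f"loss per millennium: {rate:.3f}")
print(f"retention at {2 * PIE_DEPTH:.0f} millennia: {at_double_depth:.2f}")
print(f"shared by a given pair: {shared_by_pair:.2f}")
print(f"expected retainers among ten languages: {expected_in_ten:.1f}")
```

Under these assumptions the implied loss rate comes out below 10% per millennium, and the pairwise and ten-language figures match the 9% and "three, on the average" given in the paragraph.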

This frustrating unreasonableness (more fairly, obstinacy) is currently best 
illustrated by the controversy surrounding Greenberg's 1987 proposal that all 
New World languages other than those belonging to the Na-Dene and Eskimo/Aleut 
families have a single origin in proto-Amerind.  The degree of venom this 
proposal has generated is best exemplified in the title of Matisoff's 1990 
Language "discussion note": On megalocomparison.  The reference is then 
obviously to the supposed megalomania implicit in Greenberg's proposal; and any 
doubt as to Matisoff's meaning is removed in footnote 5 on pg 108, where he 


The "note" then further degenerates over the next dozen pages.  But Matisoff's 
comments are notable only for their level of petty sophomoric bitchiness, and 
are really beneath contempt; the message itself is representative. Yet the 
logical content of the various published objections is essentially nil.  The 
main one seems to be that Greenberg has gone about it backwards; that is, 
instead of proceeding from lower to higher level groupings, he has simply 
carried out multilateral comparisons, asking whether a particular gloss has 
sufficiently similar reflexes in languages taken from two or more major 
groupings.  But the response here is that Greenberg is doing precisely what was 
done to achieve the recognition of every other language taxon, including 
Indo-European itself.  This is an indisputable fact, and therefore any 
suggestion that Greenberg, or anyone else, should go about it in the reverse 
direction is totally irrational.

Next, and far more important, is the fact that the marked similarities among 
Amerindian languages which led to Greenberg's proposal of the existence of a 
proto-Amerind have been recognized for a long time, and it is interesting and 
instructive to see what some of Greenberg's critics have made of these.  Two of 
the latter, Lyle Campbell and Terence Kaufman, have called them (for example, 
first person singular /n/, second person singular /m/) "pan-Americanisms" 
(1980), but what are "pan-Americanisms" but cognates by any other name?  In 
other words, common ancestry is always the simplest (in Occam's sense, 
requiring fewer events) explanation, and is to be rejected only when the 
evidence requires it.  But clearly many of Greenberg's critics do not recognize 
this basic tenet of science.  Bright (1984: 25) is representative:

"I would not be opposed to a hypothesis that the majority of recognized 
genetic families of American Indian languages must have had relationships of 
multilingualism and intense linguistic diffusion during a remote period of 
time, perhaps in the age when they were crossing the Bering Strait from Siberia 
to Alaska.  We can imagine that the so-called pan-Americanisms in American 
Indian languages, which have attracted so much attention from 'super-groupers' 
like Greenberg, may have originated in that period."     

A similar point was made by Levine in his 1979 doctoral dissertation on the 
position of Haida (pg 11):


Levine's removal of Haida from Na-Dene is then quoted approvingly by 
Greenberg's critics as a specific example of the failure of his approach.

But note that both Bright and Levine seemingly go out of their way to 
highlight the inherent flaw in their arguments.  Bright writes of 
"multilingualism and intense linguistic diffusion", and Levine of "extremely 
prolonged contact".  In other words, it isn't a small number of similarities 
that link Haida with the other Na-Dene languages, and the many Amerindian 
languages with one another Q otherwise why the use of "intense" and "extremely 
prolonged" Q those similarities must be many and obvious, as Greenberg and 
Ruhlen keep emphasizing, seemingly to no avail.  And if they are there, then 
specific evidence of their resulting from diffusion has to be presented; 
otherwise retention from common ancestry is the only acceptable explanation.

The extent to which they are in fact there is perhaps best illustrated by 
reference to an exchange in the pages of the American Anthropologist between 
Witkowski and Brown, on the one hand, and two of Greenberg's severest critics, 
Campbell and Kaufman, on the other, concerning the relationship, or lack 
thereof, between two language groups of southern Mexico and Central America, 
Mayan and Mixe-Zoquean.  Witkowski and Brown, in their original article (1978), 
presented 62 putative cognate sets linking the two groups.  Campbell and 
Kaufman, in their first rejoinder, rejected all of these for various reasons, 
including 14 as being "so-called pan-americanisms".  They state (1980: 853):

 Greenberg lists many more (entries 120 and 156 in his dictionary).  Thus 
there is no argument about the nature of the evidence. The fact of widespread 
forms is not in dispute. And they can be truly widespread: Chumash is from 
southern California, Tehuelche is southern Patagonian. Nor are Campbell and 
Kaufman wrong in their argument concerning the applicability of such widespread 
forms to the question of the reality of a Mayan/Mixe-Zoquean grouping.  If they 
are "primitive" for the Americas, then they cannot be "derived" for a 
Mayan/Mixe-Zoquean clade.  But Campbell and Kaufman cannot have it both ways.  
If "pan-americanisms" cannot be used to define subgroupings within the 
Americas, they must, by the same token, be part of some larger unit; that is, 
"pan-American" (Greenberg's Amerind).  And what is often overlooked in this 
example is the fact that Brown and Witkowski limited their analysis to words 
using velars; that is, the initial sounds in "can" and "go".  These are then 
some small set of the total Mayan and Mixe-Zoquean vocabularies, meaning that 
the actual number of "pan-americanisms" to be discovered by surveying more 
broadly geographically, phonetically, and semantically, as Greenberg has done, 
must be very large indeed.

The same conclusion is reached through applying the Indo-European calibration 
discussed above to the American situation.  If we assume that there was no 
human occupation of the Americas prior to ~11,000-12,000 years ago, that 
proto-Amerind is real (that is, was spoken by the populations that initially 
colonized the Americas), and that phonetic and semantic cognacy is lost at 
7-10% per 1000 years, then we would expect that, on the average, a given 
proto-Amerind term would have a 30-40% chance of retaining sufficient phonetic 
and semantic form to be recognizable in an extant Amerind language.  Given the 
enormous diversity of known Amerind languages, then, it would appear very 
likely that a large proportion of the basic proto-Amerind vocabulary should be 
readily recoverable from surveys of the former Q especially if we avoid the 
naysaying temptation to require precise reconstruction, and are satisfied, at 
least at the beginning, with "something like".
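
A rough numerical check of this calibration, and of why a large number of descendant languages helps, can be sketched as follows. The sketch assumes constant proportional loss per millennium and treats each language as an independent lineage from the proto-language, which is a simplification because related languages share part of their history. The helper names and the choice of 5, 20 and 50 languages are illustrative only.

```python
from math import comb

# Sketch only: exponential-decay retention and independent lineages are
# simplifying assumptions, not claims made in the text.

def retention(loss_per_millennium, millennia):
    """Chance that a term is still recognizable after `millennia`."""
    return (1.0 - loss_per_millennium) ** millennia

def prob_at_least(k, n, p):
    """Binomial probability that at least k of n independent languages retain a term."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))

# Retention over the stated ranges: 7-10% loss per 1000 years, 11-12 millennia.
grid = {(loss, t): retention(loss, t) for loss in (0.07, 0.10) for t in (11, 12)}
low, high = min(grid.values()), max(grid.values())

for (loss, t), r in sorted(grid.items()):
    print(f"loss {loss:.0%} per millennium, {t} millennia: retention {r:.2f}")

# Chance that a term shows up in at least two languages, which is the minimum
# needed for a multilateral comparison to notice it, using p = 0.30.
for n in (5, 20, 50):
    print(f"{n} languages: P(at least two retain) = {prob_at_least(2, n, 0.30):.3f}")
```

The retention values bracket the 30-40% quoted above, and the second loop illustrates how quickly the chance of a term being attested in at least two languages rises as more languages are surveyed.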

I reemphasize here the fact that a scientist must choose a common ancestry 
explanation unless there is specific evidence against it.  Thus when Campbell 
and Kaufman, in their second rejoinder to Witkowski and Brown, state (1983: 

their words make it clear that whatever it is that they are doing, it isn't 
science Q though, it has to be noted here, it is, for much of recent American 
Indian linguistics, a perfectly representative statement.  Yet consider how 
much illogic it exhibits in so few words.  Campbell and Kaufman write, 
apparently oblivious to the import of their words, of "the legitimate practice 
in the investigation of remote relationships in the Americas of avoiding 
widespread forms."  How remarkably convenient Q if you don't like the 
conclusion, then just rid yourself of the only data which could possibly lead 
to it.  What, one wonders, would they say about a zoologist who wrote of "the 
legitimate practice in the investigation of remote relationships among 
organisms of avoiding certain widespread forms such as the presence of 
feathers, hair, tetrapod limbs, or amniotic eggs"?  How else are "remote 
relationships" to be investigated other than by documenting "widespread forms"? 
 Then they tell us that these "widely shared similarities may be due to 
onomatopoeia, sound symbolism, perhaps diffusion, accident, or other 
undetermined factors."  Well, yes, so they might Q indeed, we can be quite 
certain that all of these, including the "undetermined" ones, will have been 
involved, to some extent, in producing linguistic "similarities".  But the 
critical point here is that, in the absence of written records, there is no 
possible way of isolating and identifying those similarities resulting from 
"onomatopoeia, etc" until one has developed the phylogenetic tree linking the 
languages under study.  What that tree cannot explain, and there will always be 
a good deal that it cannot, is then to be looked at for evidence of 
"onomatopoeia, etc."  But it is obviously and inherently true that any 
similarity could be "explained" by appealing to these other factors.  It is 
just as obviously true that this is not the case for phylogenetic explanations. 
 The latter are falsifiable; the former are not Q or, more fairly, they are not 
until we have the tree of relationships. That most linguists writing on the 
subject refuse to take cognizance of these elementary tenets of the scientific 
enterprise is perhaps the most frustrating aspect of contemporary "discussions" 
concerning language relationships.   

 It also needs to be recognized that there is no good reason to believe that 
the distinction between intra-family and more distant relationships among 
languages represents any more of a real discontinuity with respect to 
linguistic differentiation than would that between, say, intra- and 
inter-generic relationships in the world of organisms.  In other words, the 
apparent discontinuities are artifacts of our system of classification, 
deriving from the vagaries of human thought and language which see categories 
as necessarily discontinuous.  That they are artifacts becomes obvious if one 
simply thinks about color or age, where categories and continuity are dual 
realities. This tendency is most readily exemplified in horizontal 
classifications (red, yellow; old, young), but clearly is also present for 
vertical ones (the taxonomy of organisms).  Thus if intra-family relationships 
among languages have an average time depth of perhaps 7,000 years, language 
families might begin to group into stocks only a little further back in time, 
and it should take no more than 2 further classificatory "steps" to tie 
together all languages (Greenberg, for example, recognizes only a further 14 
groupings about equal in overall diversity to Amerind).   There is therefore no 
reason at present to think that this would require a time depth of more than 
15,000 years, and this logic, along with that in the previous paragraph, 
provides a scenario that makes recent attempts to reconstruct etymologies 
linking widely separated language groups Q for example, Na-Dene and Caucasian, 
or Northern Nostratic (Indo-European, Uralic/Altaic, Korean/Japanese, 
Eskimo/Aleut, Amerind) Q much more reasonable and respectable.

I do, however, append one caveat here.  While it remains very likely that the 
search for a proto-Indo-European "homeland" is a reasonable one; that is, there 
probably was a homeland in the sense of PIE being spoken by a real population 
narrowly circumscribed in space and time, any such notion for Nostratic, 
Eurasian, Na-Dene/Sino-Tibetan/Caucasian, or other such high-level groupings, 
is much less realistic.  In other words, the scenario proposed in this article 
would require that one such as that Trubetskoy (1939) proposed for 

would in fact apply to some or all of the various recently proposed 
higher-level groupings Q though likely not, as just noted, for PIE itself.


In the last decade, it has become increasingly evident that a vast and 
critically important body of evidence concerning phylogenetic relationships 
among organisms occurs in the form of their mitochondrial DNA sequences.  The 
late Allan Wilson and his colleagues at the University of California at 
Berkeley have been major innovators in, and contributors to, this effort, in 
particular with respect to ape and human evolution.  Their basic conclusion 
that all extant human mitochondrial DNA sequences ultimately derive from a 
single female who lived in Africa some 200,000 years ago has been especially 
influential on both lay and scientific thinking about the evolution of Homo 
sapiens and of various populations within our species.  The mtDNA data have, as 
noted above, been generally, and I believe correctly, accepted as sounding a 
death knell to what had in recent years come to be called the multiregional (or 
regional continuity) model of evolution within Homo. The reason is that there 
is little in the way of geographical integrity in the data; that is, there is 
little or no tendency of populations living in areas with a long record of 
human occupation, say Europeans, to fall into clades containing only Europeans. 
 In this latter respect, of course, the mtDNA data conform to the general 
pattern of genetic diversity in our species Q where there is only a small 
increase in genetic distances between individuals when going from intra- to 
interpopulational comparisons.  They are also, in this sense, consistent with 
the scenario developed thus far in this paper.

But, obviously, the notion that all extant human mitochondrial sequences 
ultimately derive from a single female who lived in Africa some 200,000 years 
ago, or anything like that date, is not.  Nor are its implications for dating 
within-Homo sapiens relationships.  These lacks of fit, given that I couldn't 
explain them, have long served to delay the formal appearance of this paper.  
Something was wrong somewhere, but neither I, nor anyone else, was at all clear 
as to just what it was for some 10 years after the appearance of Cann's 1982 
doctoral dissertation.  That dissertation is the ultimate source of the 
out-of-Africa 200,000 years ago scenario.  It is now becoming clear that the 
scenario is fundamentally flawed, and it is fair to say that we (including, of 
course, this writer) should have seen how a long time ago.

The reason seems, in hindsight, straightforward.  Even in Cann's pioneering 
effort it was already evident that mtDNA evolution was, in one very important 
regard, producing results seriously at variance with those characterizing 
virtually all other molecular data sets.  This was the very high level of 
homoplasy (parallelisms and convergences) present at minimal levels of 
differentiation.  Thus Cann has to postulate some 378 actual mutations in her 
best tree (where the root is at ~0.6% sequence difference) to account for the 
162 differences she was actually able to observe.  In other words, each 
difference appeared, on the average, and with a very high variance, some 2.3 
times in the phylogenetic tree linking the 110 individual sequences with which 
she was working.  This was true even when she performed analyses using small 
subsets of her data.  For example, a typical result has 24 differences 
requiring a tree containing 63 mutations to account for them  (Cann, 1982, Fig. 

Thus a new mutation was more likely to occur at one of the 20 or so sites 
where one had already occurred than at the roughly 1500 where one had not! 
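
The step from these counts to that conclusion can be made explicit with a short sketch. It compares the observed numbers with what would be expected if every site were equally mutable. The total of 1520 sites is an assumed round figure (the "20 or so" plus the "roughly 1500" of the text), and the function name is hypothetical.

```python
# Sketch only: SITES_TOTAL is an assumed round figure taken from the
# "20 or so" and "roughly 1500" sites mentioned in the text.

def expected_distinct_sites(n_sites, n_mutations):
    """Expected number of different sites hit if all sites mutate at the same rate."""
    return n_sites * (1.0 - (1.0 - 1.0 / n_sites) ** n_mutations)

# Full data set: 378 inferred mutations account for 162 observed differences.
full_tree_ratio = 378 / 162

# Typical subset: 63 inferred mutations account for 24 observed differences.
subset_mutations, subset_sites = 63, 24
subset_ratio = subset_mutations / subset_sites
first_hits = subset_sites                       # one first mutation per variable site
repeat_hits = subset_mutations - subset_sites   # mutations at an already-variable site

SITES_TOTAL = 1520
uniform_expectation = expected_distinct_sites(SITES_TOTAL, subset_mutations)

print(f"mutations per observed difference, full tree: {full_tree_ratio:.2f}")
print(f"mutations per observed difference, subset: {subset_ratio:.2f}")
print(f"repeat hits vs first hits: {repeat_hits} vs {first_hits}")
print(f"distinct sites expected under uniform rates: {uniform_expectation:.1f}")
```

In the subset, repeat hits outnumber first hits even though already-variable sites are a tiny fraction of the total, whereas uniform rates would have spread the 63 mutations over nearly 63 different sites.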

In other words, some small subset of sites is accumulating mutations at 
perhaps 100 times the rate of the rest of them.  So it now appears (as it 
should have, but didn't, in 1982) that there is a small number of hypervariable sites 
in the mtDNA sequence.  That this might be so was originally suggested by 
Kocher and Wilson (1991), and reinforced by Ward and Hasegawa at a Berkeley 
Workshop on Human Mitochondrial DNA (4 April 1992), and by Wakely and Meacham 
of the University of California at Berkeley in personal discussions.  

At these proposed hypervariable sites the bases which could appear [within the 
limitation that transitions (purine to purine, or pyrimidine to pyrimidine) are 
far more likely than transversions (purine<-->pyrimidine)] would randomize; for 
example, an original adenine would quickly become as likely to appear as a 
guanine as an adenine in the various descendant lineages, and the position 
would contribute only noise to 
any subsequent phylogenetic analyses.  Such positions would then comprise an 
increasing proportion of the overall amount of observed difference as one went 
to more closely related individuals, and fatally compromise any attempts to 
calculate within-human divergence times based on that between the human and 
chimpanzee lineages.  All we can be reasonably certain of is that the actual 
base of the human mtDNA tree is much older than the 200,000 or so years given 
by Cann, Stoneking, and Wilson in their landmark 1987 Nature article.  The only 
published suggestion as to just how much older is by Wills (1993: 53-4):

But what has really concerned us here is the dating of the other end of the 
human time scale.  It is that aspect of the mtDNA data, resulting from a quite 
logical approach to dating the root of the human mtDNA tree developed by 
Stoneking and Wilson, that has long been at apparent variance with the scenario 
proposed in this contribution.  In that approach they used a human population with a 
reasonably well-dated and relatively recent origin, and then assessed the 
levels of in situ sequence differentiation for the more rapidly evolving 
control region.  They argued as follows.  If one could date the entry of some 
human population into a previously unoccupied area, then the level of in situ 
differentiation observed among modern humans in that area would give us the 
rate of mtDNA change over that period of time.  This calibration could then be 
used to calculate the dates of other nodes in the overall human tree.  The 
original study (Stoneking, Bhatia, and Wilson, 1986) used data from New Guinea, 
and placed an upper limit (based on archeological evidence) on within-New 
Guinea differentiation of about 50,000 years.  The corresponding depth of the 
New Guinea-specific mtDNA clades (~0.7% sequence difference)  turned out to be 
about one-quarter of that at the base of the overall human tree (~3% sequence 
difference).  This put the root of the human tree at ~200,000 years ago (that 
is, 3.0/0.7 x 50,000), a date consistent with one obtained by calibrating in the 
other direction against the human-chimpanzee distance (but, as noted above, a 
distance we now know to be seriously in error). This internal consistency, 
though interesting, cannot, however, logically validate both the starting 
assumption and conclusion.  Once that is recognized, we can look with a more 
critical eye at the more recent results from two other areas of the world as well 
as New Guinea. 
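
The calibration just described amounts to a linear extrapolation, sketched below. It assumes a strict molecular clock (divergence proportional to time), which is exactly the assumption the surrounding discussion goes on to question. The function name is hypothetical.

```python
# Sketch only: strict linear clock assumed, as in the original calibration.

def root_age(root_divergence_pct, clade_divergence_pct, clade_age_years):
    """Scale a dated clade's age by the ratio of root to clade sequence divergence."""
    return root_divergence_pct / clade_divergence_pct * clade_age_years

CLADE_DIVERGENCE = 0.7    # % sequence difference within New Guinea-specific clades
ROOT_DIVERGENCE = 3.0     # % sequence difference at the base of the human tree
CLADE_AGE = 50_000        # years; archeological upper limit for New Guinea

age = root_age(ROOT_DIVERGENCE, CLADE_DIVERGENCE, CLADE_AGE)
implied_rate = CLADE_DIVERGENCE / (CLADE_AGE / 1000)   # % divergence per 1000 years

print(f"extrapolated root age: about {age:,.0f} years")
print(f"implied rate: {implied_rate:.3f}% per 1000 years")
```

With the rounded inputs the extrapolation gives a little over 200,000 years, in line with the "one-quarter" ratio and the ~200,000 year figure quoted above.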

Stoneking et al (1992) report that most of their 41 Papua New Guinea mtDNA 
types fall into 3 PNG-specific clades, each of which shows internal 
differentiation beginning at a sequence difference of ~0.7%.  It is thus of 
appreciable interest that Di Rienzo and Wilson (1991) reported a similar pattern 
from their study of Sardinian and Middle Eastern individuals; that is, most of 
the branches in their tree originated in a "narrow interval of sequence 
divergence about two-thirds of the way from the root to the tips of the tree." 
And the "peak at the 0.5-0.75 level of percent sequence divergence suggests 
that the probability of survival of new mtDNA lineages changed dramatically 
during the evolution of modern humans."  Ward (1991, 1992) reports an almost 
identical situation for Native Americans, with internal in situ differentiation 
again beginning at ~0.7% sequence difference.  That these 3 instances of 
markedly increased likelihood of survival of mtDNA lineages, most probably due 
to rapid population increases, date to about the same level of sequence 
divergence is surely intriguing.  But why do we find such temporal synchrony 
for events about as distant from one another as possible geographically? 

For the New World, at least, there can be little doubt as to the cause Q here 
0.7% sequence divergence dates the enormous expansion of the ancestors of 
Native American populations as they colonized virgin territory.  Nor can there 
be any real doubt as to the date of this expansion.  Whatever might be the 
final fate of claims of pre-Clovis occupations in the Americas, the fact is 
that any such are at most very few in number, while there are hundreds of 
Clovis sites spread from California to New York and all the well-dated ones 
fall in a very narrow interval of time (~11.0-11.5kya) (Haynes, 1992). This is 
then the first DOCUMENTED major expansion of Amerind populations, and 
presumably dates the Amerind-specific mtDNA clades reported by Ward.    

For the Middle East and New Guinea, where we can be certain that the entries 
of Homo sapiens on the scene were much earlier than 11,000 or so years  ago, 
another cause (or causes) is required and also readily at hand.  The requisite 
and well-documented population-increasing innovation in those two areas broadly 
contemporary with the expansion of early Native Americans is obviously 
agriculture.  Thus it is the Native Americans who fulfill the requirements of 
the original Stoneking/Bhatia/Wilson approach, while in Papua New Guinea and 
the Middle East the number of mtDNA clades representing the much older 
occupations by hunter-gatherer humans would be very small in comparison to 
those deriving from the huge increase in human populations that agriculture led 
to.

This then does away with the apparent disparities between the published mtDNA 
dates for recent evolution among certain Homo sapiens populations and those 
advocated in this contribution. The relative lack of geographical integrity in 
the overall trees, however, remains consistent with the idea of large scale, 
glacial-cycle-mediated, population movements characterizing much of the history 
of our genus subsequent to its exodus out of Africa.  These would then, I 
believe, tend to render the search for the geographical origin of the 
"mitochondrial Eve" a pointless exercise, and make her a statistical artefact 
of no biological significance.


Wills, in the last paragraph of the quote provided above, raises an issue that 
should not be ignored, for it certainly will not be ignored, whether implicitly 
or explicitly, in any discussion of the topics of concern here.  He argues that:

He is correct in this insofar as history is concerned.  The furor which 
erupted around the publication of Coon's Origin of Races in 1962 took on 
precisely this aura Q but it took it on not because the multiple-origins model 
is inherently more racist, but because Coon made it so.  The basic thesis, in 
his words, is:

<..... in essence, that at the beginning of our record, over half a million 
years ago, man was a single species, Homo erectus, perhaps already divided into 
five geographic races or subspecies.  Homo erectus then gradually evolved into 
Homo sapiens at different times, as each subspecies, living in its own 
territory, passed a critical threshold from a more brutal to a more sapient 
state, by one genetic process or another.  (pg 658)>

Note here that the various erectus subspecies are seen as having passed their 
respective critical thresholds on the way to sapiens at different times.  Now 
add in the notion that the length of time in the sapiens grade will likely have 
something to do with how sapient you are, which Coon does, and the racism 
becomes apparent.  

But, as the late Glynn Isaac pointed out to me in a Berkeley seminar many 
years ago, it is the Garden of Eden model (I prefer the term Garden of Eden to 
Noah's Ark for two reasons: First, and more trivially, I coined it in my 1971 
article; second, and far more important, it is much more congruent with the 
scenario it purports to characterize.), not that of multiple origins, which 
makes racial differences more significant functionally.  It does so because the 
amount of time involved in the raciation process is much smaller, while, 
obviously, the degree of racial differentiation is a fixed quantity, and, it 
needs to be noted here, at the level of morphology, apparently much larger than 
for any other mammalian species.  The shorter the period of time required to 
produce a given amount of morphological difference, the more 
selectively/adaptively/functionally important those differences become.  The 
Garden of Eden model in its earlier formulations envisioned perhaps 40,000 
years for raciation within anatomically modern Homo sapiens; the current 
formulations would at least triple that figure, and, thus, reduce the implied 
significance of racial differences.  Obviously the model I argue for in this 
article would increase that significance well beyond anything contemplated in 
recent years.       


The scenario just developed has a strong language origins/evolution corollary 
at each end of the time span with which we might be concerned here; that is, 
the time span of Homo.  Pushing our mitochondrial "Eve" much further back in 
time markedly increases the likelihood that the human mitochondrial tree does 
in fact trace our exodus from Africa, but it would then be that of Homo erectus 
some million or so years ago, and not that of Homo sapiens at a much later 
time.  It also then pretty much puts to rest any suggestions of an association 
of language origins (or "language as we know it") with the origins of our 

But it is of course the other end of our time span that is of particular 
concern here.  While the repeated mixing of human populations proposed in this 
paper would render futile any attempts to trace language lineages back before 
the last mixing, it by the same token would argue that known human lexicons are 
far more strongly linked to one another than anyone to date has been willing to 
contemplate.   This increased likelihood of genetic relationship among the 
world's known languages would imply that any ultimate reconstructive efforts 
would have a great deal more to work with, and, therefore, be much more likely 
to succeed, than would otherwise be the case.  Just how far such efforts might 
get cannot, of course, be determined other than in practice.  I look forward to 
that practice.  


I have argued here that neither of the two scenarios/models, "regional 
continuity" and "Garden of Eden", within which the debate concerning the 
origins and diversification of "anatomically modern Homo sapiens" has been 
carried out throughout this century, can reasonably accommodate the various 
lines of evidence bearing on the questions involved.  I suggest that we need to 
factor into our thinking, in particular, the effects of glacial movements on 
human populations, and to recognize that, when this is done, the time scale 
which characterizes the development of existing interpopulational variation in 
our species becomes markedly removed from, and much younger than, that which 
characterizes the evolution of the species itself.  The resulting scenario 
would then allow us to resolve most of the serious remaining issues in these 
realms.  The "Neandertal problem" disappears.  The significance of the fossil 
record for "anatomically modern Homo sapiens" takes on a very different guise. 
Historical linguistics is put on a much firmer and more realistic foundation.  
The muddled mtDNA picture suddenly clarifies.

These are, obviously, very large claims.  But each has the virtue that it is 
readily testable.  Now all we need is the testing. 


I thank W W Howells and C Stringer for making available the measurements for 
the modern and fossil specimens, respectively; and the latter and H Kritscher 
of the Naturhistorisches Museum, Vienna, for allowing me to study the Tierra 
del Fuegian material in their collections.  W Wang wrote a distance program and 
T Schoenemann one to allow direct access to the data of Howells, both in True 
BASIC.  All statistical analyses were carried out on a Macintosh IIsi using 
StatView.  S Anton asked a critical question in a Berkeley seminar which 
identified for me the fundamental issue involved in assigning individuals to 
their correct populations.  C Stringer and T White provided particularly useful 
discussion and commentary.


R. Bateman, I. Goddard, R. O'Grady, V. A. Funk, R. Mooi, W. J. Kress, P. 
Cannell, Current Anthropology 31:1-24 (1990).

W. Bright, American Indian Linguistics and Literature, Mouton, Berlin and New 
York, 1984.

C. D. Buck, A Dictionary of Selected Synonyms in the Principal Indo-European 
Languages: a Contribution to the History of Ideas, University of Chicago 
Press, 1949.

L. Campbell and T. Kaufman, On mesoamerican linguistics, American 
Anthropologist 82:850-857 (1980).

___________   Mesoamerican historical linguistics and distant genetic 
relationship: getting it straight, ibid  85:362-372 (1983).

R. Cann, The Evolution of Human Mitochondrial DNA, unpublished Ph.D. 
dissertation, University of California at Berkeley, 1982. The search for 
Eve, Science 256:79 (1992).

R. L. Cann, M. Stoneking, A. C. Wilson, Mitochondrial DNA and human evolution. 
Nature 325:31-35 (1987).

L. L. Cavalli-Sforza, A. Piazza, P. Menozzi, J. Mountain,  Reconstruction of 
human evolution: Bringing together genetic, archeological, and 
linguistic data.  Proc. Nat. Acad. Sci. U.S.A. 85: 6002-6005 (1988).

L. L. Cavalli-Sforza, A. Piazza, P. Menozzi, J. Mountain, Science 244: 
1128-1129 (1989).

L. L. Cavalli-Sforza, A. Piazza, P. Menozzi, J. Mountain, Curr. Anthro. 31: 
18-18 (1990).

G. Decsy, The Indo-European Protolanguage: a Computational Reconstruction, 
Eurolingua, P.O. Box 101, Bloomington, Indiana, 47402-0101 (1991).

A. Di Rienzo, A. C. Wilson,  Branching pattern in the evolutionary tree for 
human mitochondrial DNA, Proc. Nat. Acad. Sci. USA 88:1597-1601 (1991). 

J. H. Greenberg, Language in the Americas, Stanford University Press, 1987.

M. Gusinde, Die Feuerlander Indianer, Verlag ANTHROPOS, Wien, 1939.

C. V. Haynes, Jr.  Contributions of radiocarbon dating to the geochronology of 
the peopling of the New World, in Radiocarbon After Four Decades, A. Long 
and R. S. Kra, eds., Springer-Verlag, New York, 1992.

W. W. Howells, Cranial Variation in Man, Papers of the Peabody Museum of 
Archaeology and Ethnology, Harvard University, Cambridge, Mass, Volume 
67 (1973); Skull Shapes and the Map, ibid, Volume 79 (1989).

T. D. Kocher and A. C. Wilson,  Sequence evolution of mitochondrial DNA in 
humans and chimpanzees: control region and a protein-coding region, in 
Evolution of Life: Fossils, Molecules, and Culture, S. Osawa and T. Honjo, 
eds., Springer-Verlag, Tokyo (1991), pp 391-413.

A. L. Kroeber, Anthropology (Harcourt, Brace and Company, New York, 1948).

R. D. Levine, The Skidegate Dialect of Haida, unpublished Ph.D. dissertation, 
Columbia University, 1977.

J. A. Matisoff, On megalocomparison, Language 66:107-120 (1990).

P. Mellars and C. B. Stringer, eds., The Human Revolution: behavioural and 
biological perspectives on the origin of modern humans, Princeton University 
Press, 1989.

R. T. O'Grady, I. Goddard, R. M. Bateman, W. A. DiMichele, V. A. Funk, W. J. 
Kress, R. Mooi, P. F. Cannell, Science 243:1651 (1989).

M. Ruhlen, A Guide to the World's Languages, Volume 1: Classification, 
Stanford University Press, 1987.

V. M. Sarich, Human variation in an evolutionary perspective, in Background 
for Man, P. Dolhinow and V. Sarich, eds, Little Brown, Boston, 1971, pp 

F. H. Smith and F. Spencer, eds., The Origins of Modern Humans: A World 
Survey of the Fossil Evidence, Liss, New York, 1984.

M. Stoneking, K. Bhatia, A. C. Wilson, Rate of sequence divergence estimated 
from restriction maps of mitochondrial DNAs from Papua New Guinea, Cold 
Spring Harbor Symposia on Quantitative Biology 51:433-439 (1986).

M. Stoneking, S. T. Sherry, A. J. Redd, L. Vigilant, New approaches to dating 
suggest a recent date for the human mtDNA ancestor, Philosophical 
Transactions of the Royal Society London B 337:167-175 (1992).

N. S. Trubetskoy, Gedanken über das Indogermanenproblem, Acta Linguistica 
1:81-89 (1939).  Translation from C. Renfrew, Archaeology and Language, 
Jonathan Cape, London, 1987, pg. 108.

R. H. Ward, On the age of our mitochondrial ancestors: Evidence for deep 
lineages in "Mongoloid populations".  American Journal of Physical 
Anthropology Supplement 12:180-181 (1991).

C. Wills, The Runaway Brain, Basic Books, 1993.

S. R. Witkowski and C. H. Brown, Mesoamerican: A Proposed Language Phylum, 
American Anthropologist 80:942-944 (1978).

___________   Mesoamerican historical linguistics and distant genetic 
relationship, ibid 83:905-911 (1981).


The mean z-scores, calculated as described in the text, between each of 33 
fossil and 5 recent human populations.  The amounts-of-change adjustments 
present are:  Norse, +0.19;  Zulu, 0;  Tolai, -0.01;  Anyang, -0.04;  Santa Cruz, 
-0.13.  The smallest distance or distances within 0.05 units of one another 
are highlighted.  Afalou 29, as noted in the text, is effectively equidistant 
from all 5 populations.



Steinheim           2.31    2.32    1.95    2.26    1.66
Petralona           2.47    2.60    2.34    2.64    1.66
Kabwe               1.94    2.01    1.85    2.05    1.20
Ferrassie           1.64    1.70    1.70    1.80    1.23
Monte Circeo 1      2.00    2.15    1.93    2.28    1.23
La Chapelle         1.96    1.98    1.88    2.23    1.36
Saccopastore 1      1.90    2.15    2.23    2.32    1.47
Shanidar 1          1.37    1.36    1.54    1.69    1.41
Gibraltar           1.78    1.99    1.65    2.09    1.17
Amud                1.99    1.97    1.97    2.24    1.39
Irhoud 1            1.60    1.63    1.39    1.73    0.88
Skhul 5             1.60    1.67    1.38    1.73    1.01
Qafzeh 6            1.35    1.40    1.32    1.22    0.83
Qafzeh 9            1.06    0.76    0.86    1.15    1.10
CroMagnon 1         1.02    1.18    1.11    1.41    1.08
Mladec 1            1.01    1.01    1.27    1.28    1.41
Predmost 3          1.20    1.22    1.05    1.42    1.04
Kostenki 14         1.27    0.96    0.82    1.21    1.23
Pataud              0.91    1.07    1.16    0.96    0.79
Chancelade          1.10    1.09    1.07    0.83    1.45
Oberkassel male     1.25    1.36    1.04    1.12    0.78
Oberkassel female   1.05    1.01    0.91    1.05    1.30
Candide             1.37    1.45    1.41    1.38    1.06
Candide 1           0.98    1.06    1.09    1.05    1.16
Candide 5           1.03    1.17    1.07    1.12    1.25
Afalou 9            1.23    1.45    1.30    0.76    0.90
Afalou 10           0.83    1.07    1.29    0.99    0.84
Afalou 29           0.96    0.99    1.03    1.00    1.02
Afalou 32           1.15    1.11    1.03    0.99    0.84
Taforalt 11         1.19    1.21    0.70    1.01    1.29
Taforalt 17         1.18    1.07    0.85    1.06    1.18
Cohuna              1.98    1.80    1.41    1.88    1.52
Upper Cave 101      1.17    1.23    0.97    1.41    0.89
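
The highlighting mentioned in the caption does not survive in plain text, so the following is a rough sketch of how it might be recovered from the numbers themselves.  It assumes that "within 0.05 units of one another" means within 0.05 of the smallest value in the row, which is only one reading of the rule.  The function name nearest_columns is hypothetical, and it reports column positions (0 to 4) rather than population names.

```python
# Sketch only: the 0.05 tolerance comes from the table caption; measuring it
# from the row minimum is an assumption about how the rule was applied.

def nearest_columns(distances, tolerance=0.05):
    """Return 0-based column positions within `tolerance` of the row minimum."""
    # Work in integer hundredths so two-decimal table values compare exactly.
    hundredths = [round(d * 100) for d in distances]
    limit = min(hundredths) + round(tolerance * 100)
    return [i for i, h in enumerate(hundredths) if h <= limit]

# Three rows copied from the table above.
rows = {
    "Steinheim": [2.31, 2.32, 1.95, 2.26, 1.66],
    "Mladec 1":  [1.01, 1.01, 1.27, 1.28, 1.41],
    "Afalou 29": [0.96, 0.99, 1.03, 1.00, 1.02],
}

for name, distances in rows.items():
    print(name, nearest_columns(distances))
```

The adjustments listed in the caption are described as already present in the tabulated values, so the sketch does not apply them again.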