LANGUAGE AND SPEECH, 1977, Vol. 20, Part 4. Pages 404 - 409





Carnegie-Mellon University


This paper describes an experiment designed to test the hypothesis that women have larger colour vocabularies than men.  The results indicate that they do.  The results also indicate that, in at least one social class, younger men have larger colour vocabularies than do older men.  No such difference exists for women.  However, a group of Catholic nuns did score lower than the rest of the women but still higher than the men.




It is a widely held belief that women have larger colour vocabularies than do men. For example, Robin Lakoff (1975) states this as a fact and suggests as an explanation the observation that in this society women spend much more of their time than men do on colour-related activities such as choosing clothes.  The purpose of our study was to see whether women really do use a wider array of colour terms than men do by presenting colours to both men and women, asking them to name each colour, and then measuring the size of the vocabularies they use.

At least two related types of observations have been reported in the literature. The first deals with differences between men and women on other colour-related tasks; the second involves other differences between the language of men and that of women, suggesting that if men and women do differ in their vocabulary of colour, it would not be the only area in which their languages differ. 

The Woodworth-Wells colour naming test (Woodworth and Wells, 1911) tests the speed of recognition of standard colours. Subjects are presented with a card showing 100 patches of colour, each 1 cm. square. Each patch is either red, yellow, green, blue, or black.  The subject is timed as he names the colours of the patches in order. Woodworth and Wells reported that among college students women do better at the task than men, i.e., they require less time.  Ligon (1932) discovered that among children in grades one through nine girls do better on the Woodworth-Wells test than do boys. He also showed that, except in the first two grades, the sex difference was greater on the colour-naming test than on a test of word reading designed to measure general verbal fluency, on which girls also did better than boys. Ligon's study shows that at least some of the differences between men and women are acquired at a very early age.

There is a large amount of evidence that the language of women is not always the same as the language of men. The anthropological literature abounds in instances of sexual differentiation of language among so-called primitive people.  Jespersen (1922) discusses the language of the Caribbeans of the small Antilles, in which about one tenth of the vocabulary is different for women than for men. The differences occur primarily in kinship terms, names for parts of the body, and also in isolated words such as friend, enemy, joy, work, war, house, garden, bed, poison, tree, sun, moon, sea, and earth. In Koasati, an American Indian language (Haas, 1944), men's and women's speech differ in some forms of verbal paradigms. 

It has long been recognized that in English, men's and women's speech differ with respect to the use of swear words and euphemisms.  There is evidence that other differences exist as well.  Barron (1971) reports a difference between the speech of men and women in the relative frequency of various cases.

This paper describes an experiment that was conducted to determine whether colour vocabulary is another area in which men's and women's speech differ.  




A set of 25 cards was constructed by colouring a two-inch square in the centre of each of 25 3x5 cards. The squares were coloured with single crayons selected from Crayola's box of 64 crayons. No crayon was used more than once. 

Each subject was shown the cards one at a time and asked to state the word or phrase he would use to describe the colour.  In order to standardize the task, each subject was told that he should imagine himself in the following situation: 

"You have bought a shirt and now want to buy a pair of pants to match the shirt. You go into a store but haven't got the shirt with you. You want to say to the sales-person, 'I have a —— shirt. Show me a pair of pants to go with it.' "

The subjects were also told that they should attempt to describe the cards as independently as possible, that they should not compare them to each other, and that it was acceptable to give the same name to more than one card. 

The responses were recorded and then scored using a scheme designed to measure the extent of the subjects' colour vocabularies. The responses were divided into four categories:

(1)   Basic: one of the following basic colour words: red, orange, yellow, green, blue, purple, violet, white, black, brown, grey, pink, tan.

(2)   Qualified: a basic word qualified by words such as light or dark or by another basic word, e.g. yellowish green. Responses in this category are more specific than basic responses but they do not actually show a larger vocabulary.

(3)   Qualified Fancy: a basic word qualified by special words, such as sky blue or hunter green.

(4)   Fancy: colour words not in the basic category, such as lavender, magenta, and chartreuse.
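The four categories above can be illustrated with a small classifier. This is only a sketch under stated assumptions, not the procedure used in the experiment, where responses were presumably scored by hand. The function names, the list of generic qualifiers beyond light and dark, the handling of "-ish" forms, and the rule that any response whose last word is not a basic word counts as fancy are all assumptions made for illustration.

```python
# Basic colour words, as listed in category (1).
BASIC = {"red", "orange", "yellow", "green", "blue", "purple", "violet",
         "white", "black", "brown", "grey", "pink", "tan"}

# Assumed generic qualifiers; the paper names only "light" and "dark".
GENERIC_QUALIFIERS = {"light", "dark", "pale", "bright", "deep"}


def is_basic_like(word):
    """True for a basic word or its -ish form, e.g. 'yellowish' or 'reddish'."""
    if word in BASIC:
        return True
    if word.endswith("ish"):
        stem = word[:-3]
        # Covers 'greenish' (green), 'bluish' (blue) and 'reddish' (red).
        return bool({stem, stem + "e", stem[:-1]} & BASIC)
    return False


def classify(response):
    """Assign a response to one of the four categories (hypothetical helper)."""
    words = response.lower().replace("-", " ").split()
    if not words:
        raise ValueError("empty response")
    head, modifiers = words[-1], words[:-1]
    if head not in BASIC:
        # Assumption: a non-basic head word makes the whole response fancy.
        return "fancy"
    if not modifiers:
        return "basic"
    if all(m in GENERIC_QUALIFIERS or is_basic_like(m) for m in modifiers):
        return "qualified"
    return "qualified fancy"


for example in ["red", "yellowish green", "sky blue", "lavender"]:
    print(example, "->", classify(example))
```

A rule-based classifier of this kind cannot settle borderline cases, such as whether a given modifier is generic or special; those decisions would still need a human judge.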

A score for each subject was computed by assigning one point for each basic response, two for each qualified, three for each qualified fancy, and four for each fancy response. Since there were 25 cards, the possible scores range from 25 to 100. 
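The scoring rule can be written out as a short sketch. The point values are those given above; the names `POINTS` and `vocabulary_score` and the sample list of categories are hypothetical and serve only to show the arithmetic.

```python
# Points per category, as described in the text.
POINTS = {"basic": 1, "qualified": 2, "qualified fancy": 3, "fancy": 4}


def vocabulary_score(categories):
    """Sum the points for one subject's responses, one category per card."""
    return sum(POINTS[c] for c in categories)


# Invented example: 10 basic, 8 qualified, 4 qualified fancy, 3 fancy.
sample = (["basic"] * 10 + ["qualified"] * 8
          + ["qualified fancy"] * 4 + ["fancy"] * 3)
print(len(sample), vocabulary_score(sample))  # 25 cards, score 50
```

With 25 cards, a subject who gives only basic responses scores 25 and one who gives only fancy responses scores 100, which are the bounds stated above.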

The subjects were divided into five groups on the basis of age, sex, and occupation as follows: 

Group I: men aged 20-35. Graduate students or people working in technical areas. 

Group II: men aged 45-60. All technically trained, highly educated professionals. 

Group III: women aged 20-35. Further divided into two groups: 

A: technical—corresponding to Group I. 

B: non-technical but well-educated. 

Group IV: women aged 45-60. Most of them married to the men in Group II. 

Group V: Catholic nuns. Most of them over 30. 

The Mann-Whitney U test (Siegel, 1956) was used to determine, on the basis of the observed scores, the probability that the scores of one group were stochastically higher than those of another group. 

The groups ranged in size from seven to 24 subjects. The size of the groups is taken into account in the Mann-Whitney test.
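For readers unfamiliar with the test, the following sketch shows how the Mann-Whitney U statistic is computed and how group sizes enter the calculation. It is an illustration under assumptions, not the computation performed in the study: the scores are invented, the function names are hypothetical, and the p-value uses the large-sample normal approximation without tie or continuity corrections. For groups as small as seven, exact tables such as those in Siegel (1956) would be more appropriate.

```python
import math


def mann_whitney_u(xs, ys):
    """U for xs: number of pairs (x, y) with x > y; a tie counts one half."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def one_sided_p_normal(xs, ys):
    """Approximate probability of a U this large if the groups do not differ."""
    n1, n2 = len(xs), len(ys)
    u = mann_whitney_u(xs, ys)
    mean = n1 * n2 / 2                               # expected U under the null
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)     # depends on group sizes
    z = (u - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))         # P(Z >= z)


# Invented scores, NOT data from the experiment.
group_a = [60, 55, 70]
group_b = [40, 45, 50]
print(mann_whitney_u(group_a, group_b), one_sided_p_normal(group_a, group_b))
```

Because the mean and standard deviation of U are functions of the two group sizes, unequal groups are handled without any further adjustment.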



  Table 1 displays the median scores for each of the five groups. It suggests that:

(1)   Women use fancier words than men.

(2)   Younger men use fancier words than older men.

(3)   All the women have similar size vocabularies except the nuns, who use fewer fancy words than the other women.

The Mann-Whitney test indicates that these differences are highly significant. Table 2 shows the significance levels obtained for the hypotheses that certain groups score higher than others. The following comparisons yielded no significant difference:

(1)   Technical v. non-technical young women.

(2)   Young women v. older women.

Because the only significant difference among the women was between the nuns and the non-nuns, groups III and IV will be combined for the rest of this discussion.

Table 3 shows the average number of times the members of each of the groups used each category of colour word.  It shows that the women used more qualified fancy and fancy words than did the men, and the older men used significantly fewer fancy words than did the younger men.  It also shows that the nuns used fewer fancy words than did the lay women.

Another measure of breadth of vocabulary is the number of times the same term was used to describe different colours. Table 4 shows the mean number of times a colour was described exactly the same way as a previous colour. The older men used the greatest number of repetitions, followed by the younger men, the nuns, and then the rest of the women.  Thus both the fanciness score and the repeat count produce the same ordering of the groups.
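The repeat count can be sketched as follows. The function name `count_repeats` and the sample responses are hypothetical, and treating responses that differ only in capitalisation or spacing as identical is an assumption; the text says only that a colour must be described exactly the same way as a previous one.

```python
def count_repeats(responses):
    """Count responses identical to one already given by the same subject."""
    seen = set()
    repeats = 0
    for response in responses:
        # Assumption: ignore case and extra spaces when comparing responses.
        key = " ".join(response.lower().split())
        if key in seen:
            repeats += 1
        else:
            seen.add(key)
    return repeats


# Invented responses: "blue" is reused twice, so there are two repeats.
print(count_repeats(["blue", "green", "blue", "Blue", "sky blue"]))
```

Under this definition a subject who uses 25 distinct descriptions has no repeats, and one who gives the same word for every card has 24.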



Table 1
Median score for each group

I (young men)
II (older men)
III (young women)
   A (technical)
   B (non-technical)
IV (older women)







Table 2
Significance levels for the hypotheses that certain groups score higher than others

III + IV > I + II (women > men)
I > II (young men > older men)
IV > II (older women > older men)
IIIa > I (young tech women > young tech men)
III + IV > V (other women > nuns)




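Table 3
Average number of times each group used each category of colour word

Columns: basic, qualified, qual. fancy, fancy

I + II (all men)
I (young men)
II (older men)
III + IV (lay women)
V (nuns)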







Table 4
Number of Repeats

I + II (all men)
I (young men)
II (older men)
III + IV (lay women)
V (nuns)




It was suspected at the start of the experiment that factors other than sex might have a significant effect on people's colour vocabularies. For that reason, the groups were further subdivided by age and occupation. It is very difficult, however, to construct samples with no differences other than sex since, in this culture, sex is so highly correlated with other things.  For example, Groups II and IV differ by sex, but also, not coincidentally, in the occupations of the people: the men work at technical jobs, whereas the women have raised children.  In fact, it has been assumed (for example, by Lakoff) that such sex-correlated differences are the reason for the differences in colour vocabulary, since women spend more time buying clothes and decorating living rooms. This study shows, however, that even when the principal occupation is the same (Group I v. Group IIIa) the women show a larger colour vocabulary than the men.

The fact that the nuns score lower than the rest of the women also suggests that such cultural factors are significant. Not only do the nuns spend less time worrying about clothes (the ones in this experiment still wear habits) than do the other women, they are people who chose to give up such things.  Both the fact that the nuns do score higher than the men and that women score higher than men, even if their current principal occupation is the same, suggest that this difference is determined quite early in life before adult occupations are chosen. 

The difference between the young men and the older men was surprising. There are at least two possible explanations for this observation.  One is that the older men at one time had larger colour vocabularies but over the many years they have been married and therefore had someone else to buy their clothes and decorate their living rooms, their vocabularies have atrophied. The other explanation is that younger men have larger colour vocabularies than the older men ever had because sex stereotyping is dwindling in this society and men are increasingly interested in such things as clothes. The data obtained in this experiment provide no way to decide between the two. 

The goal of this experiment was to measure size of active vocabulary. It is difficult to do precisely that in an experimental situation where people are explicitly asked to name colours. Such a situation was necessary, however, in order to get each subject's reaction to many different colours.  The method chosen almost certainly produces a bias toward more exotic descriptions than the subjects would use in an everyday situation.  However, this bias is constant across all groups of subjects and should therefore not significantly affect the relative scores of the various groups.


The evidence collected in this experiment confirms the hypothesis that women have more extensive colour vocabularies than men. It also indicates that, at least in one social class, younger men have larger colour vocabularies than older men.



Barron, N. (1971). Sex-typed language: the production of grammatical cases. Acta Sociologica, 14, 24-42.

DuBois, P. H. (1939). The sex difference on the color-naming test. Amer. J. Psychol., 52, 380.

Haas, M. (1944). Men's and women's speech in Koasati. Language, 20, 142-9.

Jespersen, O. (1922). Language: Its Nature, Development, and Origin (New York), chap. 13.

Lakoff, R. (1975). Language and Woman's Place (New York).

Ligon, E. M. (1932). A genetic study of color naming and word reading. Amer. J. Psychol., 44, 103-22.

Siegel, S. (1956). Nonparametric Statistics for the Behavioral Sciences (New York).

Woodworth, R. S. and Wells, F. L. (1911). Association tests. Psychological Monographs, 57, 1-80.