Androgyny Study Based on Sandra Bem’s work (Intellectual Androgyny)

I discovered while taking a course in Human Relations at SFSU in the Masters in Public Administration program that I am “androgynous.” Not that you can’t tell that I am a woman by looking at me, but rather that my brain operates successfully in areas considered both male and female. It is an advantage in life, as explained below.

The Measurement of Psychological Androgyny

Sandra L. Bem (1974)

Bem takes androgyny to mean being both highly masculine and highly feminine, rather than neither masculine nor feminine (i.e. midway between the two extremes).

It’s better to be androgynous in today’s society as men and women need to be adaptable, and willing to share all types of jobs, without saying that one job is woman’s work or another job is just for men.


The final inventory consists of three groups of 20 items: Masculine, Feminine and Neutral.

Selection of items

Two hundred personality traits that were masculine or feminine, but were neutral in social desirability, were chosen by Bem and some of her students. Likewise, a further 200 personality traits that were either socially desirable or socially undesirable were chosen. These traits had to apply equally to both males and females.

One hundred students then rated these traits by answering questions in the following format: “In American society, how desirable is it for a man to be truthful?” Half of the students (50) were asked about men, and half were asked about women. There was an equal number of men and women in both groups. (Note that the word ‘truthful’ was replaced with each of the other 399 trait words in turn.) The scale used was 1 = not at all desirable, 7 = extremely desirable. Only items that were judged to be significantly more desirable for one gender than the other were considered for inclusion in the inventory (BSRI). Both male and female judges had to agree. Likewise, neutral items were those that were judged to be neither masculine nor feminine by judges of both genders. The twenty most ‘feminine’ traits, the twenty most ‘masculine’ traits, the ten most neutral (for gender) undesirable traits and the ten most neutral desirable traits make up the BSRI (Bem Sex Role Inventory).

Note the good methodology. Bem uses male and female judges in equal proportions, and both have to agree. Four hundred potential traits are cut down to 60, so plenty of items were considered. One hundred judges should reduce the risk of a biased sample. However, note that the judges were university undergraduates, which means most were between the ages of 18 and 21. Note also how Bem ensures she has a test with face validity, because the items are selected on their face value in the first place! You might be tempted to criticise the sentence that Bem gave to the judges. The sentence does ask for a judgement of desirability. This could lead to subjects giving high masculinity ratings and high femininity ratings. In turn this would lead to a low androgyny score (which means highly androgynous). If this is so, experimenter bias has taken place, and Bem would demonstrate that androgyny is more prevalent than it really is. However, Bem did find from her normative data (see below) that although social desirability (as determined from her ‘neutral’ items) correlated with masculinity and femininity, it did not correlate with androgyny.


The actual inventory used to appear here but I have been asked to remove the inventory by the copyright owners. 

(NB–There is an androgyny test at .  I score 90% masculine and 73% feminine.  I am a heterosexual female who delights in keeping up my nail polish and other feminine froufrous, and am in appearance quite feminine.  No, I will not link you to my results or my profile, unless you are a single man living in the Bay area and interested in dating a polymath.  OKCUPID is free, fun, and full of geeks and nerds.)

Number the adjectives with a number from 1 to 7, reflecting the degree to which you think the adjective applies to you.

1 means always or almost always untrue

7 means always or almost always true.

4 means half true and half untrue.

Try to use all the numbers in the scale.

The scale is supposed to be scored by using an inferential statistical test (the t test, but you don’t have to know anything about this test!). The t value derived from the test represents the difference between the subject’s ratings of the masculine items and of the feminine items. For simplicity, the androgyny score can be defined as the average feminine rating minus the average masculine rating. Note that this is a little confusing: a low androgyny score (one close to zero) means that the subject is highly androgynous, while a score far from zero means that the subject is strongly sex-typed.
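The scoring just described can be sketched in a few lines of Python. This is only an illustration, not Bem’s own procedure: the function name `androgyny_scores` and the sample ratings are made up, and the t ratio is assumed to be an ordinary independent-samples t (pooled variance) comparing one person’s feminine ratings with their masculine ratings.

```python
from math import sqrt
from statistics import mean, variance


def androgyny_scores(masculine, feminine):
    """Score one respondent from their 1-7 ratings on the two scales.

    Returns (masculinity, femininity, difference, t_ratio), where
    difference = average feminine rating - average masculine rating.
    """
    m_avg = mean(masculine)
    f_avg = mean(feminine)
    difference = f_avg - m_avg

    # Assumption: a pooled-variance t ratio treating the two sets of
    # item ratings as independent samples.
    n_m, n_f = len(masculine), len(feminine)
    pooled_var = ((n_m - 1) * variance(masculine)
                  + (n_f - 1) * variance(feminine)) / (n_m + n_f - 2)
    std_error = sqrt(pooled_var * (1 / n_m + 1 / n_f))
    t_ratio = difference / std_error if std_error > 0 else None
    return m_avg, f_avg, difference, t_ratio


# Hypothetical ratings for the 20 masculine and 20 feminine items.
masculine_ratings = [6, 5, 7, 6, 5, 6, 7, 5, 6, 6, 5, 7, 6, 6, 5, 6, 7, 6, 5, 6]
feminine_ratings = [5, 6, 6, 5, 6, 5, 6, 6, 5, 6, 6, 5, 6, 5, 6, 6, 5, 6, 5, 6]
print(androgyny_scores(masculine_ratings, feminine_ratings))
```

A difference (or t ratio) near zero points to an androgynous profile; a large positive value points to a feminine profile and a large negative value to a masculine one.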

Bem has built into her scale a check to see whether it is likely the subject is simply trying to give a favourable impression of themselves. If the ten socially desirable ‘neutral’ items are given high ratings, and the ten socially undesirable ‘neutral’ items are given low ratings, then it is likely the subject is trying to give a good impression of themselves. Another advantage of the neutral items is that they help to pad out the inventory somewhat. This reduces demand characteristics, because subjects will be less aware of which category each item belongs to.
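One way to turn this check into a single number is sketched below. It is an assumption rather than something stated above: the ten undesirable items are reverse-scored (a rating of 1 becomes 7, and so on) and then averaged together with the ten desirable items. The function name `social_desirability_score` is hypothetical.

```python
def social_desirability_score(desirable, undesirable, scale_max=7):
    """Average the neutral items after reverse-scoring the undesirable ones.

    On a 1-7 scale a rating r on an undesirable item becomes 8 - r, so a
    high result means the respondent claimed the desirable traits and
    disowned the undesirable ones.
    """
    reversed_undesirable = [scale_max + 1 - r for r in undesirable]
    ratings = list(desirable) + reversed_undesirable
    return sum(ratings) / len(ratings)


# Hypothetical respondent who may be "faking good".
print(social_desirability_score([7, 6, 7, 7, 6, 7, 7, 6, 7, 7],
                                [1, 2, 1, 1, 1, 2, 1, 1, 1, 2]))
```

A score close to 7 suggests the subject may be presenting themselves favourably; a score around 4 suggests no particular tendency either way.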

Normative Data

It is important to collect normative data from a large number of subjects, so that an average score and standard deviation can be obtained for each dimension of the scale (that is, masculinity, femininity, androgyny and social desirability). Only by comparison to others can an individual’s score have any meaning. For example, an IQ score of 100 is meaningless unless you know that the average IQ score is also 100. Students, mostly between the ages of 16 and 21, were used: 444 male and 279 female students from Stanford University, and 117 male and 77 female students from Foothill Junior College. This may mean that the normative data only apply to this age group. Perhaps more pertinently, the students of this time (1974) would probably have held different values from the students of today. Would students of today rate ‘conventional’ as socially undesirable? If your teacher is over 40, ask them to describe what it was like in the seventies! Additionally, these were Americans. How would British students rate the BSRI? Would they see ‘aggression’ as a desirable masculine trait, for example?
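To see why the average and standard deviation matter, here is a small sketch that places one person’s score relative to a norm group. The helper `z_score` is hypothetical, and the IQ standard deviation of 15 is the usual convention rather than a figure from the study.

```python
def z_score(raw_score, norm_mean, norm_sd):
    """How many standard deviations a score lies above or below the norm mean."""
    if norm_sd <= 0:
        raise ValueError("the norm standard deviation must be positive")
    return (raw_score - norm_mean) / norm_sd


# An IQ of 100 sits exactly at the average; 115 is one standard
# deviation above it (assuming mean 100 and SD 15).
print(z_score(100, 100, 15))
print(z_score(115, 100, 15))
```

The same calculation would apply to a masculinity, femininity or androgyny score once the mean and standard deviation for the relevant norm group are known.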


Social Desirability

Bem found that although there was a high correlation between masculinity or femininity and social desirability, there was no correlation between androgyny and social desirability. This shows that the inventory is measuring something other than social desirability.


Test-Retest Reliability

Twenty-eight males and 28 females were retested after an interval of four weeks. High correlations were found for Femininity, Masculinity, Androgyny and Social Desirability.
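The correlation used for a retest check like this is normally Pearson’s r between each person’s first and second score. A minimal sketch follows; the function name `pearson_r` and the scores are made up for illustration and are not Bem’s data.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    mean_1 = sum(first) / len(first)
    mean_2 = sum(second) / len(second)
    cross = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    ss_1 = sum((a - mean_1) ** 2 for a in first)
    ss_2 = sum((b - mean_2) ** 2 for b in second)
    if ss_1 == 0 or ss_2 == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cross / sqrt(ss_1 * ss_2)


# Hypothetical masculinity scores for five people, four weeks apart.
first_session = [4.9, 5.6, 3.8, 6.1, 4.4]
second_session = [5.0, 5.4, 4.0, 6.0, 4.6]
print(pearson_r(first_session, second_session))
```

A value close to 1 means people kept roughly the same rank order on the second occasion, which is what a reliable test should show.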

Split-Half or Internal Consistency

For both samples (Stanford and Foothill), half of the Femininity items correlated with the remaining half. The same goes for Masculinity, Androgyny and Social Desirability.


Face Validity

Because the trait items were thought to be desirable for either men or women by 100 judges, one could say that face validity was built into the BSRI. It is a pity that the judges were not asked to judge the ‘neutral’ traits for social desirability. Those traits that fell half-way between masculine and feminine were used for the social desirability scale. It was the initial selectors (Bem and her students) who decided whether the items were desirable or not. Can we be sure about the social desirability of traits such as ‘unpredictable’ or ‘conventional’?

Content Validity

The content validity of the items on the test was not studied.

Criterion-Related Validity


Bem did not try to find out whether each person was in real life as the BSRI predicted. Are highly androgynous men happy to change a baby’s nappies?


Bem would have had to wait to see whether all of her androgynous people were successful in life. Bem felt that it was best to be androgynous, as one could be more adaptable to the demands of modern life. Perhaps now is the time to test for predictive validity! How would you go about that?

Correlations with Other Measures of Masculinity-Femininity

The BSRI results were compared against results obtained from the California Psychological Inventory and the Guilford-Zimmerman Temperament Survey. There was no correlation between the BSRI and the Guilford-Zimmerman scale, and only a moderate correlation between the BSRI and the California Psychological Inventory. This suggests that Bem is measuring something different from the other personality tests.

Read the study at the link above, and see other source material.
