TY - JOUR
T1 - Gender Stereotypes in Natural Language
T2 - Word Embeddings Show Robust Consistency Across Child and Adult Language Corpora of More Than 65 Million Words
AU - Charlesworth, Tessa E.S.
AU - Yang, Victor
AU - Mann, Thomas C.
AU - Kurdi, Benedek
AU - Banaji, Mahzarin R.
N1 - Publisher Copyright: © The Author(s) 2021.
PY - 2021/2
Y1 - 2021/2
N2 - Stereotypes are associations between social groups and semantic attributes that are widely shared within societies. The spoken and written language of a society affords a unique way to measure the magnitude and prevalence of these widely shared collective representations. Here, we used word embeddings to systematically quantify gender stereotypes in language corpora that are unprecedented in size (65+ million words) and scope (child and adult conversations, books, movies, TV). Across corpora, gender stereotypes emerged consistently and robustly for both theoretically selected stereotypes (e.g., work–home) and comprehensive lists of more than 600 personality traits and more than 300 occupations. Despite underlying differences across language corpora (e.g., time periods, formats, age groups), results revealed the pervasiveness of gender stereotypes in every corpus. Using gender stereotypes as the focal issue, we unite 19th-century theories of collective representations and 21st-century evidence on implicit social cognition to understand the subtle yet persistent presence of collective representations in language.
AB - Stereotypes are associations between social groups and semantic attributes that are widely shared within societies. The spoken and written language of a society affords a unique way to measure the magnitude and prevalence of these widely shared collective representations. Here, we used word embeddings to systematically quantify gender stereotypes in language corpora that are unprecedented in size (65+ million words) and scope (child and adult conversations, books, movies, TV). Across corpora, gender stereotypes emerged consistently and robustly for both theoretically selected stereotypes (e.g., work–home) and comprehensive lists of more than 600 personality traits and more than 300 occupations. Despite underlying differences across language corpora (e.g., time periods, formats, age groups), results revealed the pervasiveness of gender stereotypes in every corpus. Using gender stereotypes as the focal issue, we unite 19th-century theories of collective representations and 21st-century evidence on implicit social cognition to understand the subtle yet persistent presence of collective representations in language.
KW - collective representations
KW - gender stereotypes
KW - machine learning
KW - natural-language processing
KW - open data
KW - open materials
KW - word embeddings
UR - http://www.scopus.com/inward/record.url?scp=85098932774&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098932774&partnerID=8YFLogxK
U2 - 10.1177/0956797620963619
DO - 10.1177/0956797620963619
M3 - Article
C2 - 33400629
AN - SCOPUS:85098932774
SN - 0956-7976
VL - 32
SP - 218
EP - 240
JO - Psychological Science
JF - Psychological Science
IS - 2
ER -