Quantifying Gender Stereotypes in Japan between 1900 and 1999 with Word Embeddings

Shintaro Sakai, Haewoon Kwak, Jisun An, Akira Matsui [2]

[1] Indiana University Bloomington

[2] Kobe University

Abstract

We quantify the evolution of gender stereotypes in Japan from 1900 to 1999 using a series of 100 word embeddings, each trained on a corpus from a specific year. We define the gender stereotype value to measure the strength of a word’s gender association by computing the difference in cosine similarity of the word to female- versus male-related attribute words. We examine trajectories of gender stereotypes across three traditionally gendered domains (Home, Work, and Politics) as well as occupations. The results indicate that language-based gender stereotypes partially evolved to reflect women’s increasing participation in the workplace and politics: the Work and Politics domains became more strongly female-stereotyped over the years. Yet Home also became more female-stereotyped, suggesting that women were increasingly viewed as fulfilling multiple roles such as homemakers, workers, and politicians, rather than having one role replace another. Furthermore, the strength of the female stereotype for occupations positively correlates with the proportion of women in each occupation, indicating that word-embedding-based measures of gender stereotypes mirrored demographic shifts to a considerable extent.

keywords:
Word embeddings, Gender stereotypes, Historical corpora, Cultural transformation

1 Introduction

The growing availability of large digitized corpora has led to the increasing use of word embeddings in social science research [1]. Prior studies have demonstrated that word embeddings contain various problematic biases (e.g., the vector for doctor is closer to the vector for man than to the vector for woman) [2, 3]. Instead of removing bias from these embeddings, researchers in the social sciences leverage the associations between words encoded within them to quantify social stereotypes or representations of various social groups [3, 4, 5, 6, 7]. Previous studies have shown that stereotypes quantified through word embeddings strongly correlate with those measured with opinion surveys and the Implicit Association Test (IAT) [3, 8, 4, 9, 10]. Word embeddings have also been shown to capture shifts in stereotypes over time, aligning with findings from traditional social science research [4, 6, 5].

Word embeddings are a powerful tool because they can capture stereotypes from historical corpora when survey or psychological test data are not available. Traditional methods, such as opinion surveys [11, 12] and the IAT [13], rely on human participants, making it impossible to assess stereotypes from a century ago. In contrast, word embeddings can quantify historical stereotypes by leveraging textual data produced during specific historical periods. Although longitudinal studies that repeat opinion surveys or IATs at different points in time can also reveal changes in stereotypes, such data are rarely available across a long period of time or across diverse countries and cultures. Moreover, the reliability of these methods can be compromised by shifts in survey design, sampling biases, and respondents’ reluctance to endorse socially undesirable views [14, 15].

While numerous studies have sought to quantify social stereotypes using word embeddings [3, 8, 4, 9, 10, 5, 6], the majority focus on Western contexts and largely neglect Asian or other non-Western settings. In particular, Japan offers unique analytical value for studying stereotype changes because it underwent profound and rapid societal transformations between 1900 and 1999. Following its defeat in World War II, Japan experienced a dramatic increase in Western influence, which significantly affected its social norms. Gender roles in the workplace and politics were a key area of this influence: the US-led Allied forces imposed structural changes to dismantle Japan’s patriarchal system and raise women’s status in the workplace and politics. Despite this unique trajectory of gender norm changes, there has been no quantitative study on gender stereotypes in Japan during this period, making it impossible to directly observe how gender stereotypes evolved [16, 17, 18, 19].

To address this gap, the present study quantifies changes in gender stereotypes between 1900 and 1999 using word embeddings trained on Japanese historical books and magazines. We measure gender stereotypes associated with terms representing the domains of Home, Work, and Politics, as well as 18 occupational terms. These three domains were chosen because gender norms in the workplace and politics in Japan underwent particularly substantial changes during this period. To validate whether the computed gender stereotypes reflect broader societal trends, we compare gender stereotype in the 18 occupational terms with the actual representation of women in these occupations. These 18 occupational terms were selected for their consistent occurrence across all 100 embedding models and for having at least one entry in the Japanese national census.

Our contributions are as follows: (1) To the best of our knowledge, this is the first study to quantify gender stereotypes of the Home, Work, and Politics domains and of occupations in Japan between 1900 and 1999. We provide empirical evidence of how gender stereotypes evolved in Japan over the 20th century, offering insights into the extent to which top-down legal reforms implemented by the US-led Allied forces influenced women’s representation in Japanese historical texts. (2) We make our trained word embedding models publicly accessible. These models can be used for future research on historical change in Japan, such as quantifying semantic shifts or other types of social stereotypes.

2 Background

This section reviews seminal studies that establish word embeddings as a robust computational method for quantifying shifts in social stereotypes. We then provide brief historical context on the rising status of women in Japan.

2.1 Word embeddings as a tool to measure changes in social stereotypes

Word embeddings have emerged as a powerful computational tool for capturing historical shifts in social stereotypes. A foundational work by Garg et al. [4] quantified changes in gender and ethnic stereotypes in the United States from 1910 to 1990. Their analysis revealed that the changes in occupational female bias in word embeddings closely mirrored actual changes in the percentage of women in those occupations, as recorded by the U.S. Census. This was further validated by historical survey results of gender stereotypes. Similarly, they demonstrated that changes in occupational Asian bias in word embeddings correlated with the changing percentage of Asians in those occupations and aligned with ethnic stereotype survey data. These findings reflect broader cultural shifts, such as the women’s movement (1960s-1970s) and the growth of the Asian-American population (1960s-1980s) in the U.S.

Further research has expanded on these insights, exploring different facets of stereotype change. Kozlowski et al. [6], for instance, examined how gender stereotypes associated with specific occupations like “engineer”, “journalist”, and “nurse” changed from the 1900s to the 1990s. Their findings indicated that “nurse” became less female-associated and “engineer” less male-associated over time, yet both occupations retained a persistent gendered stereotype. Interestingly, “journalist” transitioned from being male-stereotyped until the early 1950s to becoming female-stereotyped thereafter. These findings align with the broader decline in occupational gender segregation in the 20th century.

Beyond specific occupations, studies have also investigated changes across broader domains. Jones et al. [5] quantified shifts in gender stereotypes across “career”, “family”, “science”, and “arts”, observing a consistent decline in overall gender bias. Despite this, some biases persisted: “career” and “science” terms remained more strongly associated with men, while “family” and “arts” continued their stronger association with women. These findings were broadly consistent with observed societal trends.

Bhatia and Bhatia [15] approached the topic using established psychological scales to measure gender stereotypes through word embeddings. They concluded that gender biases have generally diminished throughout the 20th century, particularly for stereotypically feminine and personality-related traits. This aligns well with existing psychological theories on gender stereotypes.

More comprehensively, Charlesworth et al. [7] explored historical representations of various social groups, including racial, gender, age, body type, and socioeconomic groups, across 200 years of Google Books data (1800-1999). Their extensive analysis revealed that while the specific words and traits most associated with these groups changed significantly over time, the average valence (positivity or negativity) of these stereotypes often remained remarkably stable. This suggests a dynamic evolution of stereotype content even as underlying sentiment persists.

In summary, the growing body of research utilizing word embeddings has shown their effectiveness for quantifying the evolution of social stereotypes across historical periods. These studies demonstrate how word embeddings can capture nuanced changes in gender, ethnic, and other social group stereotypes, often aligning with real-world demographic shifts and validated psychological theories.

2.2 The changes in women’s status in Japan

Although women’s status had been persistently low in Japan, the country experienced several reforms under the US-led Allied forces after World War II that challenged traditional gender norms [16, 17, 18, 19].

Before 1900, public advocacy for women’s rights in Japan was heavily suppressed. Direct feminist activism was rare due to government censorship, and early advocates faced arrest or silencing [17]. After around 1910, Japanese women began actively challenging traditional gender norms and exploring new forms of self-representation. Feminist publications and media sparked debates on women’s independence, education, sexuality, and professional opportunities. Discussions of issues such as abortion, chastity, and women’s roles in modern occupations reflected a growing engagement with questions of personal autonomy and social participation [17]. In 1919, the New Women’s Association was founded, demanding suffrage, legal reforms, and labor rights [17]. Transnational women’s organizations like the Young Women’s Christian Association (YWCA) and the Woman’s Christian Temperance Union (WCTU) played active roles in promoting women’s welfare. However, despite their efforts, women were still denied suffrage, and their working conditions saw little improvement through the prewar period [20].

Following World War II, women’s status in Japanese politics and the workplace saw significant advancements due to reforms implemented under the US-led Allied forces [17]. In 1946, women were granted the right to vote for the first time, with nearly two-thirds of eligible women participating in the election. That same year, female representatives were elected to the Diet. Some also contributed to drafting the postwar Constitution, which enshrined gender equality as a constitutional principle. Women’s labor conditions also improved: more women entered the workforce, and in 1947, the Ministry of Labor established the Women’s and Minors’ Bureau to protect female workers. Article 14 of the 1947 Constitution guarantees equality regardless of gender (footnote 1: https://www.japaneselawtranslation.go.jp/en/laws/view/174/tb). The Labor Standards Law of 1947 prohibits gender-based disparities in wages, working hours, and conditions (footnote 2: https://www.japaneselawtranslation.go.jp/en/laws/view/3567/en). Later, the Equal Employment Opportunity Law of 1985 further advanced gender equality by banning discrimination in hiring and promotion, as well as penalizing disadvantageous treatment due to marriage, pregnancy, or childbirth.

However, despite top-down legal reforms, women’s status in the Japanese workplace remains low. Human resource management systems in many companies still disproportionately favor men for promotions, resulting in an extremely low proportion of women in managerial positions (around 9.1% in Japan compared to 42.5% in the United States in 2005) [21]. This inequality also contributes to a substantial gender wage gap [22]; for example, women in Japan earned approximately 34% less than men in 1999, and the country ranked second worst in the 1999 data (footnote 3: https://www.oecd.org/en/data/indicators/gender-wage-gap.html).

Women’s underrepresentation extends beyond managerial positions to certain professional occupations. Women are concentrated in lower-paying service professions, while remaining largely absent from higher-status professions such as engineering and law. For instance, only 1.9% of women worked in higher-status professions in 1995, compared to 9.4% of men [23]. Furthermore, the high turnover rate of women during child-rearing years exacerbates the problem: in 2010, 60% of women quit their full-time jobs within six months of giving birth [21].

Women in Japan also remain significantly underrepresented in politics. In 2019, women made up only 10% of the national Lower House, 21% of the Upper House, and 14% of local assemblies. This low percentage is a persistent problem, as the percentage of women in the Lower House has increased by less than two percentage points since women gained the right to vote in 1946 [24].

In summary, although Japan experienced a unique trajectory in advancing women’s status in the workplace and politics through legal reforms after World War II, these top-down changes seem to have had only a limited impact on women’s actual positions in the workplace and politics.

3 Research questions

Building upon the background reviewed in Section 2, we set our research questions as follows. Although Japan experienced a unique trajectory in promoting women’s status in the workplace and politics through legal reforms after World War II, these top-down changes appear to have had only a limited effect on women’s actual positions. Prevailing social attitudes and gender norms seem to have persisted, constraining improvements in women’s real-world positions. This raises the question of how these reforms were reflected in gender stereotypes, as discerned through historical texts:

RQ1. How do female stereotypes of the Home, Work, and Politics domains change between 1900 and 1999?

RQ2. Does the female stereotype of the Home domain remain higher than that of the Work and Politics domains across the years?

While analyzing stereotypes in the broad domains of Home, Work, and Politics provides an important overview, this categorization may be too coarse to capture nuanced social changes, and it remains unclear whether observed changes in female stereotypes accurately reflect real-world gender representation in society. To address these limitations, we also analyze occupational terms, which allow us to directly correlate stereotype shifts with national census data and assess whether stereotype change reflects actual societal change. Thus, we pose RQ3 as follows:

RQ3. How do female stereotypes of occupations change between 1900 and 1999, and to what extent do these changes reflect actual shifts observed in national census data?

4 Data and method

4.1 Corpus

The Japanese National Diet Library released the NDL Ngram Data (footnote 4: https://github.com/ndl-lab/ndlngramdata) in 2022. The dataset contains approximately 890 million unique n-grams extracted from the text of over 970,000 digitized books in Japan, comprising nearly all books published in Japan since the Meiji era (around 1868). It includes uni-grams to five-grams along with their yearly frequencies. N-grams that appear fewer than four times across the entire period were excluded during corpus construction. Although the dataset aims to include all books and magazines published in Japan, it only contains works whose copyright protection has expired.

Figure 1 presents the total number of n-grams per year. We restricted our analysis to the period between 1900 and 1999 for three reasons: (1) the relatively large volume of available data, (2) consistency with previous studies that quantify stereotypes over a 100-year span [4, 6, 15], and (3) sufficient temporal coverage both before and after the U.S. occupation of Japan. Notably, the total number of n-grams in 1945 drops sharply, likely reflecting the disruptions caused by World War II and postwar censorship under U.S. occupation. The sharp decline in n-grams beginning around 1970 is the result of copyright limitations on which works could be digitized.

Figure 1: The total number of n-grams in each year in the NDL Ngram Data

4.2 Preprocessing and model training

Before training the word embeddings, we first performed word segmentation on the bigram to five-gram data. Word segmentation is a necessary preprocessing step for Japanese natural language processing (NLP), as Japanese is written without spaces between words. We used MeCab [25] in combination with a dictionary designed for historical Japanese texts [26] (footnote 5: https://clrd.ninjal.ac.jp/unidic/back_number.html) for this task. As part of preprocessing, we also removed n-grams composed solely of alphabetic characters, numerals, or postpositional particles, as they carry limited semantic content in Japanese.
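The n-gram filter described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: it assumes each n-gram has already been segmented into (surface, part-of-speech) pairs, that particles carry the UniDic top-level label 助詞, and that "alphabets" and "numbers" mean half-width or full-width Latin letters and Arabic digits. The function name `is_uninformative` is hypothetical.

```python
import re

# Half-width and full-width Latin letters and Arabic digits (an assumption
# about what counts as "alphabets" and "numbers").
LATIN_OR_DIGIT = re.compile(r"^[A-Za-zＡ-Ｚａ-ｚ0-9０-９]+$")
PARTICLE_POS = "助詞"  # UniDic top-level part-of-speech label for particles


def is_uninformative(ngram):
    """Return True if every token is a Latin string, a numeral, or a particle.

    `ngram` is a list of (surface, pos) pairs; this input format is assumed.
    """
    return all(
        pos == PARTICLE_POS or LATIN_OR_DIGIT.match(surface) is not None
        for surface, pos in ngram
    )


# Toy segmented n-grams, invented for illustration.
ngrams = [
    [("女性", "名詞"), ("の", "助詞"), ("仕事", "名詞")],  # has content words
    [("ABC", "名詞"), ("123", "名詞")],                    # letters and digits only
    [("は", "助詞"), ("が", "助詞")],                      # particles only
]
kept = [ng for ng in ngrams if not is_uninformative(ng)]
print(len(kept))
```

An n-gram is dropped only when all of its tokens fall into the excluded classes, so an n-gram that mixes a particle with a content word is retained.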

We trained a separate word embedding model for each year from 1900 to 1999 using the skip-gram with negative sampling (SGNS) algorithm [27]. SGNS aims to maximize the probability of observed word–context pairs in the corpus while minimizing the probability of randomly sampled, unobserved pairs (i.e., negative samples). We used SGNS because of its widespread adoption in prior research [4, 6, 5, 15, 10]. Each model was trained with a window size of 4, 10 negative samples, and a vector size of 300. The window size was set to 4 because the corpus includes only up to five-gram sequences. The number of negative samples was set to 10 to accelerate the training process. A vector size of 300 was selected in accordance with commonly used pre-trained historical word embeddings [28]. For training, we used the implementation by Zhao et al. [29], which provides a Python wrapper for the original C++ implementation and allowed us to train a standard SGNS word embedding model (footnote 6: https://github.com/zhezhaoa/ngram2vec).
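To see why a window of 4 exhausts the context available in a five-gram, the sketch below enumerates word–context pairs from a single segmented n-gram. It is an illustration under assumptions: the function `pairs_from_ngram` is hypothetical, the example five-gram and its frequency are invented, and whether the training pipeline extracts pairs exactly this way is not stated in the text.

```python
from collections import Counter


def pairs_from_ngram(tokens, count, window=4):
    """Yield ((target, context), count) for all token pairs within `window`."""
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield (target, tokens[j]), count


pair_counts = Counter()
# One hypothetical segmented five-gram with its yearly frequency.
five_gram = ["女性", "の", "社会", "進出", "問題"]
for pair, c in pairs_from_ngram(five_gram, count=3):
    pair_counts[pair] += c

print(len(pair_counts))
```

In a five-gram the first and last tokens are four positions apart, so a window of 4 pairs every token with every other token, while any smaller window would drop the most distant pairs.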

4.3 Quantification of gender stereotype

To calculate gender stereotypes, we relied on a widely used bias measurement method, the Word Embedding Association Test (WEAT) proposed by Caliskan et al. [3]. For each target word in a domain, the WEAT takes the difference between the word’s mean cosine similarity to one list of attribute words and its mean cosine similarity to a second list; the bias of the domain is the mean of these differences over its target words.

\begin{equation}
S(X) = \underset{x \in X}{\text{mean}}\; s(x, A, B) \tag{1}
\end{equation}

where

\begin{align}
s(x, A, B) &= \underset{a \in A}{\text{mean}}\, \cos(v_x, v_a) \tag{2} \\
&\quad - \underset{b \in B}{\text{mean}}\, \cos(v_x, v_b) \tag{3}
\end{align}

$v_i$ is the vector of word $i$ and $\cos(v_i, v_j)$ is the cosine similarity between $v_i$ and $v_j$. $x$ is a target word in list $X$. $A$ and $B$ are lists of attribute words.

In our study, target words are occupational words or words in the Home, Work, and Politics domains, and the two lists of attribute words $A$ and $B$ are Japanese gender words for women and men, respectively. More precisely, the list $A$ consists of female-related words (e.g., 女 (woman), 女性 (female), …), while the list $B$ consists of male-related words (e.g., 男 (man), 男性 (male), …). A positive WEAT score indicates that the target word or domain is female-stereotyped, whereas a negative score indicates that the target word or domain is male-stereotyped.
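The score in equations (1)–(3) can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' code: the function names and the two-dimensional toy vectors are invented, and a real model would supply 300-dimensional vectors for the Japanese word lists.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def word_score(x, A, B, vec):
    """s(x, A, B): mean similarity to A minus mean similarity to B."""
    return mean(cosine(vec[x], vec[a]) for a in A) - mean(
        cosine(vec[x], vec[b]) for b in B
    )


def domain_score(X, A, B, vec):
    """S(X): mean of s(x, A, B) over the target words x in X."""
    return mean(word_score(x, A, B, vec) for x in X)


# Toy 2-d vectors, invented purely for illustration.
vec = {
    "woman": [1.0, 0.0],
    "man": [0.0, 1.0],
    "kitchen": [0.9, 0.1],
    "election": [0.2, 0.8],
}
A, B = ["woman"], ["man"]
print(word_score("kitchen", A, B, vec))   # positive: closer to the female word
print(word_score("election", A, B, vec))  # negative: closer to the male word
```

With these toy vectors, a target word that points mostly along the female attribute direction receives a positive score and one that points along the male direction receives a negative score, matching the sign convention described above.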

When calculating the stereotype value for each year within the Home, Work, and Politics domains, we employ a bootstrap procedure that resamples words according to the original size of the domain. This process is crucial because it quantifies the uncertainty of stereotype strength estimates within each domain, especially when domain sizes are small and individual words have a large influence. In each iteration, words are randomly drawn with replacement to match the domain size, and the stereotype value is recalculated using equation (1). This entire process is repeated 1,000 times for each year, producing a distribution of scores that reflects both the uncertainty and stability of the domain-specific results. From this distribution, we derive 95% confidence intervals around our estimates.
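The bootstrap described above can be sketched as follows. Because equation (1) is a mean of per-word scores, resampling words and recomputing the domain score is the same as resampling the per-word scores and averaging them. The sketch assumes a percentile interval, which the text does not specify; the function name `bootstrap_ci` and the example scores are hypothetical.

```python
import random


def bootstrap_ci(word_scores, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean per-word score."""
    rng = random.Random(seed)
    n = len(word_scores)
    means = []
    for _ in range(n_iter):
        # Draw n words with replacement, matching the original domain size.
        sample = rng.choices(word_scores, k=n)
        means.append(sum(sample) / n)
    means.sort()
    lower = means[round((alpha / 2) * n_iter)]
    upper = means[round((1 - alpha / 2) * n_iter) - 1]
    return lower, upper


# Hypothetical per-word scores for one domain in one year.
scores = [0.1, 0.2, 0.3, 0.4, 0.5]
lower, upper = bootstrap_ci(scores)
print(lower, upper)
```

For a small word list the interval is wide because a single resampled word shifts the mean noticeably; for lists with hundreds of words it narrows accordingly.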

4.4 Construction of word lists for quantifying gender stereotypes

We constructed several lists of words for this analysis. First, we collected words representing the Home, Work, and Politics domains. We also collected occupational and gender-related words.

To collect words representing the Home and Work domains, we drew on the J-LIWC2015 dictionary [30], the latest Japanese version of the LIWC dictionary, which is based on the LIWC2015 (Linguistic Inquiry and Word Count 2015) dictionary [31]. The LIWC dictionary is one of the most widely used and validated dictionaries and has undergone multiple reliability and validity assessments [32]. It groups words into categories such as linguistic dimensions and psychological constructs.

The J-LIWC2015 dictionary follows the structure of the original LIWC2015 and was carefully developed through qualitative evaluations by psychologists, internal consistency checks of the categories, and human validation. We collected words for the Home and Work domains from the “home” and “work” categories of the J-LIWC2015 dictionary, which contain 148 and 837 words, respectively. Since the J-LIWC2015 dictionary has no Political category, we used Empath, an open-source and validated dictionary [33]: we extracted all the English words in its Political category and manually translated them into Japanese.

We removed words from each domain that were not present in all 100 word embedding models. Importantly, we manually excluded gendered words within the “home” category, such as 主婦 (housewife) and 主人 (husband), as they themselves reflect gender bias. We did not find any gendered words in the Work and Politics domains. The final lists include 85 words for Home, 551 words for Work, and 78 words for Politics. Table 1 shows example words for each domain.
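The filtering step above amounts to intersecting the vocabularies of all yearly models and then removing manually flagged words. The following is a minimal sketch with toy vocabularies; the function names and the three-year example are hypothetical, and a real run would use the vocabularies of all 100 models.

```python
def shared_vocabulary(vocab_by_year):
    """Return the set of words present in every yearly vocabulary."""
    vocabularies = iter(vocab_by_year.values())
    shared = set(next(vocabularies))
    for vocab in vocabularies:
        shared &= set(vocab)
    return shared


def filter_word_list(words, vocab_by_year, excluded=()):
    """Keep words found in all models, minus manually excluded ones."""
    shared = shared_vocabulary(vocab_by_year)
    return [w for w in words if w in shared and w not in excluded]


# Toy vocabularies for three hypothetical yearly models.
vocab_by_year = {
    1900: {"家", "台所", "主婦"},
    1950: {"家", "台所", "主婦", "掃除"},
    1999: {"家", "主婦", "掃除"},
}
home = filter_word_list(
    ["家", "台所", "掃除", "主婦"], vocab_by_year, excluded={"主婦"}
)
print(home)
```

Requiring presence in every model keeps the word list fixed across years, so changes in a domain's score cannot come from words entering or leaving the list.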

We also used the J-LIWC2015 dictionary to collect gender word pairs by referring to “Female references” and “Male references” categories. As with the selection of words for the Home and Work domains, words that do not exist in all of the 100 word embedding models were excluded from the lists. The final lists contain 20 gender word pairs. Table 1 shows the lists of example gender words.

Home words: 家 (home/house), 台所 (kitchen), 園芸 (gardening), 掃除 (cleaning), 近所 (neighborhood)
Work words: 仕事 (work), マーケット (market), 顧客 (client), 上司 (boss), 交渉 (negotiation)
Politics words: 国家 (nation), 知事 (governor), 政治家 (politician), 選挙 (election), 活動家 (activist)
Male words: 彼 (he), 男 (man), 男性 (male), 男児 (boy), 男の子 (boy)
Female words: 彼女 (she), 女 (woman), 女性 (female), 女児 (girl), 女の子 (girl)
Table 1: The lists of example male, female, home, work, and politics words.

For occupations, we first collected occupations from the Japan Standard Occupational Classification (1960) (footnote 7: https://www.soumu.go.jp/toukei_toukatsu/index/seido/shokgyou/02toukatsu01_03000024.html). To ensure the robustness of our analysis, we applied two selection criteria to filter occupations from the original list. First, occupations had to appear consistently across all 100 embedding models. Second, they needed to have at least one data point in the Japanese national census (footnote 8: https://www.e-stat.go.jp/stat-search/files?page=1&toukei=00200521). Since the census is conducted every five years and gender-disaggregated statistics are only available after 1940, with the statistics for 1985, 1990, and 1995 missing, the maximum number of data points available for any occupation was nine. Table 2 lists the 18 occupations that satisfied both criteria.

Occupational words
医師(doctor), 歯科医師 (dentist), 画家 (painter), 音楽家 (musician), 俳優 (actor), 弁護士 (lawyer), 記者 (journalist), 駅長 (stationmaster), 船長 (ship captain), 車掌 (train conductor), 船頭 (boatman), 木工 (woodworker), 石工 (stonemason), 大工 (carpenter), 左官 (plasterer), 土工 (construction worker), 芸者 (Geisha), 料理人 (cook)
Table 2: The lists of occupational words.

5 Results

This section presents the main findings from our analysis.

5.1 Gender stereotype change in Home, Work, Politics

5.1.1 RQ1. How do female stereotypes of the Home, Work, and Politics domains change between 1900 and 1999?

Figure 2 shows the changes in gender stereotypes in the Home, Work, and Politics domains. Words across all three domains show a rise in female stereotypes beginning around 1945. This pattern partially aligns with our expectations: as women’s status in the workplace and politics improved after 1945, female stereotypes in the Work and Politics domains also increased. Both domains follow very similar trajectories: they show an increase in female stereotype after around 1945 and become female-stereotyped after 1970, whereas they were mostly male-stereotyped until then. However, the results partially diverge from our expectations. While we anticipated a decrease in female stereotypes in the Home domain as women gained greater participation in the workplace, the Home domain instead exhibits the opposite trend and remained female-stereotyped throughout the period. Following the work of Jones et al. [5] and Charlesworth et al. [7], we also calculated the trend coefficient for each domain, an estimated per-year change in female stereotypes obtained by linear regression, to quantify the overall trend (Table 3). The trend coefficients for all three domains are positive with $p < 0.01$, indicating that female stereotype values in these domains increased over time. The Home domain shows the highest per-year change, while the Work domain shows the lowest.

Figure 2: The changes in female stereotypes of Home, Work, and Politics over the century. The red vertical line corresponds to 1945, the year World War II ended. We also include shaded regions representing the 95% confidence intervals around the stereotype estimates for each domain. However, the intervals are not visible in the plots because their width is negligible compared to the magnitude of the overall stereotype change.
Domain      Trend Coefficient    Standard Error
Home        0.000128*            $1.42 \times 10^{-5}$
Work        0.000113*            $2.02 \times 10^{-5}$
Politics    0.000119*            $2.38 \times 10^{-5}$
Table 3: Estimated per-year change in female stereotypes for Home, Work, and Politics domains. Trend coefficients with $p < 0.01$ are marked with *.
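A trend coefficient of this kind is the slope of an ordinary least squares regression of the yearly stereotype value on the year, reported with its standard error. The sketch below shows that computation under the assumption of a simple one-predictor regression; the function name `linear_trend` and the synthetic series are invented and do not reproduce the values in Table 3.

```python
import math


def linear_trend(years, values):
    """Return (slope, standard error of the slope) from simple OLS."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum(
        (y - (intercept + slope * x)) ** 2 for x, y in zip(years, values)
    )
    # Residual variance uses n - 2 degrees of freedom (slope and intercept).
    std_err = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, std_err


# Synthetic series: a rise of 1e-4 per year plus alternating noise.
years = list(range(1900, 2000))
values = [
    1e-4 * (y - 1900) + (0.001 if y % 2 else -0.001) for y in years
]
slope, std_err = linear_trend(years, values)
print(slope, std_err)
```

Dividing the slope by its standard error gives the t statistic on which a significance marker such as the asterisk in the table would be based.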

5.1.2 RQ2. Does the female stereotype of the Home domain remain higher than that of the Work and Politics domains across the years?

Consistent with our expectations, the Home domain exhibits the strongest female stereotypes among the three domains throughout the entire time period, whereas the Work and Politics domains display weaker and largely similar levels of female stereotypes. Although stereotype values fluctuate over time for all domains, the persistently higher values in the Home domain likely reflect the long-standing cultural association of women with domestic roles and men with the workforce or political spheres.

5.2 Gender stereotype change in Occupations

5.2.1 RQ3. How do female stereotypes of occupations change between 1900 and 1999, and to what extent do these changes reflect actual shifts observed in national census data?

Figure 3 shows the changes in average female stereotypes across all 18 occupations. To enhance clarity, we display only the overall average in the figure; for the stereotype changes of individual occupations, please refer to Section B.

Figure 3: The change in female stereotypes averaged across all 18 occupations over the century. The red vertical line in the figure corresponds to 1945.

Compared to the three domains, the female stereotype values for occupational words fluctuate more strongly over the period. One possible factor is the size of the word list. Because the word lists of the three domains are sufficiently large, their estimated stereotypes are robust to individual outlier words thanks to averaging, whereas the occupation list contains only 18 words. Another possible explanation is the low frequency of occupational terms. In Appendix C, we computed the average frequency of each word across all years (1900–1999) in each domain, and then averaged these values across all words, which yields a yearly per-word average frequency for each domain. We find that the frequency across the 18 occupations is lower than that of Home, Work, and Politics. This may have contributed to the instability of the stereotype values.

Overall, the average female stereotype score for occupational words was negative for most years, indicating a prevailing male bias over the century. Still, the average female stereotype of occupational words shows a statistically significant increase (Table 4), which is steeper than that of the Home, Work, and Politics domains. This suggests that occupations on average became more female-stereotyped over the period.

To examine the relationship between occupational gender representation and gender stereotypes derived from word embeddings, we conducted a correlation analysis between the proportion of women and the corresponding female stereotype values. First, we calculated the overall correlation by pooling all occupations and years, correlating the proportion of women in each occupation-year with its female stereotype value, and computing the Pearson correlation coefficient ($r$) along with its statistical significance ($p$) (Figure 4). This provided a general measure of the association between women’s representation and stereotype strength across the entire dataset. Across all occupations and years, we observed a moderate positive correlation between the proportion of women and the female stereotype value, which was statistically significant ($r = 0.413$, $p = 0.00025$). These results suggest that, overall, higher female representation within an occupation is associated with stronger alignment with female stereotypes.

Figure 4: The overall correlation between the proportion of women in each occupation-year and the corresponding female stereotype value.
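The pooled correlation pairs each occupation-year's share of women with its female stereotype value and computes the Pearson coefficient over all pairs. The sketch below shows only the coefficient itself; the records are invented for illustration and the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical pooled occupation-year records:
# (share of women in the occupation, female stereotype value).
records = [
    (0.05, -0.020),
    (0.10, -0.015),
    (0.30, 0.001),
    (0.02, -0.010),
    (0.60, 0.012),
]
shares = [share for share, _ in records]
scores = [score for _, score in records]
r = pearson_r(shares, scores)
print(r)
```

In practice, a library routine such as `scipy.stats.pearsonr` returns both the coefficient and its p-value; the hand-written version is used here only to keep the example free of dependencies.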

We then performed the same analysis separately for each occupation. Since some occupations have relatively few data points in the national census, the results for these occupations are less reliable and often did not reach statistical significance. Consequently, only a subset of occupations with larger sample sizes showed statistically significant correlations, while many others exhibited strong but non-significant trends due to limited data. More specifically, three occupations showed statistically significant correlations: 医師 (doctor) ($r = 0.861$, $p = 0.00287$), 歯科医師 (dentist) ($r = 0.817$, $p = 0.00717$), and 左官 (plasterer) ($r = 0.667$, $p = 0.0496$). In all three cases the correlations were positive, with particularly strong associations for doctors and dentists, which suggests that the trained word embedding models reflect real-world occupational gender representation to a certain degree.

Domain | Trend Coefficient | Standard Error
Average (occupations) | 0.000153* | 0.0000393
Table 4: Estimated per-year change in female stereotypes for selected occupations and averages. Trend coefficients with $p<0.01$ are marked with *.

5.3 Gender stereotype change against the corpus-wide trend

In Section 5.1, we find that even the Home domain shows an increase in female stereotypes after around 1945. Since this could potentially be influenced by a global trend in the corpus, we first calculated the average gender stereotype across all words that appeared consistently in all 100 of our word embedding models (74,004 words) to examine whether any corpus-level trend exists.
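The corpus-level average can be sketched as follows. This is a minimal illustration under stated assumptions: each yearly model is represented as a plain dictionary from word to vector, the stereotype value is taken to be the mean cosine similarity to the female words minus the mean cosine similarity to the male words (following the definition summarized in the abstract), and the toy vectors and the function names `shared_vocabulary` and `corpus_average_by_year` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def stereotype_value(word, model, female_words, male_words):
    """Assumed form: mean similarity to female words minus mean similarity to male words."""
    vec = model[word]
    female = sum(cosine(vec, model[w]) for w in female_words) / len(female_words)
    male = sum(cosine(vec, model[w]) for w in male_words) / len(male_words)
    return female - male

def shared_vocabulary(models):
    """Words that appear in every yearly model."""
    return set.intersection(*(set(model) for model in models.values()))

def corpus_average_by_year(models, female_words, male_words):
    """Average stereotype value per year over the shared vocabulary."""
    shared = shared_vocabulary(models)
    return {
        year: sum(stereotype_value(w, model, female_words, male_words) for w in shared) / len(shared)
        for year, model in models.items()
    }

# Toy two-dimensional "embeddings" for two years (illustration only).
models = {
    1900: {"女": [1.0, 0.0], "彼": [0.0, 1.0], "家": [0.9, 0.1], "汽車": [0.5, 0.5]},
    1901: {"女": [1.0, 0.0], "彼": [0.0, 1.0], "家": [0.8, 0.2]},
}

averages = corpus_average_by_year(models, female_words=["女"], male_words=["彼"])
print(sorted(shared_vocabulary(models)))
print(averages)
```

Restricting the average to the shared vocabulary keeps the set of words fixed across years, so a change in the yearly average cannot come from words entering or leaving the vocabulary.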

We find that the average female stereotype across all words also exhibits an increasing trend starting around 1966 (Figure 5), and the increase is statistically significant (Table 5). This trend is similar to that of the three domains and closely resembles the patterns observed in the Work and Politics domains in particular. However, the trend coefficient of the average of all the words is steeper than that of the three domains.

Figure 5: The changes of the averaged female stereotypes of all the words that appeared consistently across all 100 of our word embedding models. The red vertical line in the figure corresponds to 1945.
Domain | Trend Coefficient | Standard Error
Average (all words) | 0.000137* | 0.0000224
Table 5: Estimated per-year change in female stereotypes for averages. Trend coefficients with $p<0.01$ are marked with *.
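To make the quantities in the table concrete, the sketch below computes a per-year trend coefficient and its standard error, assuming they are the slope and slope standard error from an ordinary least-squares regression of the yearly stereotype value on the year. That is an assumption for illustration; the fitting procedure used for the reported coefficients may differ. The input series is synthetic and the function name `linear_trend` is hypothetical.

```python
import math

def linear_trend(years, values):
    """Return (slope, standard error of the slope) from simple OLS of values on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    # Residual variance uses n - 2 degrees of freedom (slope and intercept are estimated).
    std_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, std_error

# Synthetic yearly averages: a small upward drift plus alternating noise.
years = list(range(1900, 2000))
values = [0.0001 * (y - 1900) + (0.001 if y % 2 == 0 else -0.001) for y in years]

slope, std_error = linear_trend(years, values)
print(f"trend coefficient = {slope:.3g}, standard error = {std_error:.3g}")
```

Under this assumption, a significance marker would follow from comparing the ratio of the slope to its standard error against a t distribution with n - 2 degrees of freedom.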

Since we identified this corpus-level trend, it is important to distinguish whether any observed changes in gender stereotypes are unique to words or domains of interest, or if they simply reflect a broader shift across the entire language.

To eliminate the effect of the corpus-level trend, we calculate the adjusted score $S_{\text{adjusted}}(X)$ for each year, defined as

$$S_{\text{adjusted}}(X) = S(X) - S_{\text{Average}} \tag{4}$$

$S_{\text{adjusted}}(X)$ is the adjusted score for category $X$, and $S_{\text{Average}}$ represents the yearly average gender stereotype across all the words. This adjustment enables us to assess (1) whether a word or domain is more or less female-stereotyped compared to the overall average in a given year, and (2) whether its trajectory of change deviates from the broader societal trend.
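As a minimal sketch of Equation (4), the adjustment is a year-by-year subtraction of the corpus-wide average from the raw score of a category. The numbers below are hypothetical and chosen only to show how the sign of the adjusted score is read; the function name `adjusted_scores` is likewise an illustrative choice.

```python
def adjusted_scores(raw_by_year, average_by_year):
    """S_adjusted(X) = S(X) - S_Average, computed separately for each year."""
    return {year: raw_by_year[year] - average_by_year[year] for year in raw_by_year}

# Hypothetical raw scores for one domain and for the corpus-wide average.
home_raw = {1900: 0.010, 1950: 0.020, 1999: 0.025}
corpus_average = {1900: -0.005, 1950: 0.000, 1999: 0.024}

home_adjusted = adjusted_scores(home_raw, corpus_average)
for year, value in sorted(home_adjusted.items()):
    label = "more female-stereotyped than average" if value > 0 else "at or below average"
    print(year, round(value, 4), label)
```

In this toy series the raw score rises throughout, yet the adjusted score shrinks toward zero by the final year because the corpus-wide average rises faster, which is the kind of pattern the adjustment is meant to expose.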

Figure 6 shows the changes in adjusted female stereotypes for the Home, Work, and Politics domains from 1900 to 1999. We find that the adjusted female stereotype is consistently positive for the Home domain and negative for the Work and Politics domains across most of the time frame. This indicates that Home is more female-stereotyped, whereas Work and Politics are more male-stereotyped for most of the years, compared to the overall average.

The adjusted female stereotype for the Home domain shows a steady increase until around 1960, followed by a continuous decline, reaching nearly zero by 1983. This suggests that words in the Home domain increasingly deviated from the overall trend up to 1960 but have converged toward the average since then. In contrast, the Work and Politics domains show similar fluctuations, but their adjusted values mostly stay in the range of $0.0$ to $-0.01$. This suggests that words in the Work and Politics domains change at mostly the same rate as the corpus-wide average.

In Section 5.1, the raw stereotype scores of the Home, Work, and Politics domains show positive trend coefficients, meaning that the three domains themselves became more female-stereotyped over time. However, their adjusted stereotype scores show negative trend coefficients (Table 6), which indicates that the three domains’ increase is slower than the overall corpus-wide trend. In other words, while all three domains are shifting toward female stereotypes, they are doing so less strongly than the corpus-wide average, meaning that they are becoming relatively less female-stereotyped compared to the average across all words. However, only the Work domain shows a statistically significant negative trend coefficient.

Figure 6: The changes of adjusted female stereotypes of Home, Work, and Politics over the century. The red vertical line in the figure corresponds to 1945. We also include shaded regions representing the 95% confidence intervals around the stereotype estimates for the Home, Work, and Politics domains. However, the intervals are not visible in the plots because their width is negligible compared to the magnitude of the overall stereotype change.
Domain | Trend Coefficient | Standard Error
Home (adjusted) | -0.00000838 | 0.0000176
Work (adjusted) | -0.0000243† | 0.0000107
Politics (adjusted) | -0.0000181 | 0.0000146
Table 6: Estimated per-year change in female stereotypes for adjusted domains. Trend coefficients with $p<0.01$ are marked with *, and trend coefficients with $0.01\leq p<0.05$ are marked with †.

For occupations, we find that the adjusted female stereotype is consistently negative, meaning that the 18 occupations are, on average, more male-stereotyped than the overall average. The adjusted female stereotype for occupations exhibits a positive trend coefficient overall, indicating that its female stereotype increases faster than the average of all the words. However, the adjusted female stereotype for occupations shows large fluctuations, and the trend coefficient was not statistically significant.

Figure 7: The change of adjusted female stereotypes averaged across all 18 occupations over the century. The red vertical line in the figure corresponds to 1945.
Domain | Trend Coefficient | Standard Error
Average for occupations (adjusted) | 0.00000164 | 0.0000276
Table 7: Estimated per-year change in female stereotypes for adjusted average for occupations. Trend coefficients with $p<0.01$ are marked with *, and trend coefficients with $0.01\leq p<0.05$ are marked with †.

5.3.1 Corpus-wide trend of increasing female stereotype?

What does this corpus-wide trend indicate? One possible explanation is that broader societal shifts in gender roles are reflected not only in specific domains that were traditionally gender-stereotyped, but also in general language usage across a wide range of words. In the early 1900s, female terms were associated with only a few specific domains of words, confining their semantic influence to a relatively narrow region of the embedding space. However, as the status of women improved in Japanese society, female terms began to appear in more diverse contexts, allowing them to exert influence across a wider portion of the lexicon. Because word embeddings capture patterns of co-occurrence rather than raw frequency, these distributed shifts in associations can result in the average bias across the entire vocabulary becoming female-leaning. However, it is also important to note that this phenomenon could be influenced by other external factors, such as historical changes in Japanese writing, which we discuss in detail in Appendix A.

6 Discussion and conclusion

The analysis of the Home, Work, and Politics domains revealed that all three domains show an increase in female stereotypes, with Work and Politics shifting from male- to female-stereotyped by the 1970s, while the Home domain remained consistently female-stereotyped. We also found that the female stereotype for the Home domain is consistently stronger than that of the other two domains throughout the period. This may reflect how women’s growing participation in public life was acknowledged in language, but without weakening their association with domestic roles. This suggests that women came to be seen as both homemakers and workers or politicians, rather than one role replacing the other. This aligns with the reforms introduced under the US-led Allied Occupation, such as suffrage, constitutional guarantees of gender equality, and labor protections, which created new opportunities for women in politics and the workforce. Yet, just as the persistence of strong Home associations in language reflects persistent cultural norms, Japanese society continued to reinforce traditional gender expectations in the household, meaning women’s expanded roles in public life did not erase their enduring ties to domestic responsibilities.

The corpus-wide trend further reveals that while stereotypes in the Home, Work, and Politics domains became more female-associated over time, they actually lagged behind the faster, corpus-wide linguistic shift toward stronger female associations. This is consistent with Japan’s experience of postwar reforms, where top-down legal changes significantly altered the framework for women’s rights but had limited impact on firmly established workplace and political inequalities. Women remained in lower-paying occupations, underrepresented in leadership and politics, and subject to high attrition during child-rearing years, despite formal guarantees of equality. Just as domain-specific stereotypes resisted change relative to the corpus-wide trend, structural and cultural barriers in Japan slowed the translation of legal reforms into lived gender equality.

Our analysis of occupations also provides several key insights. First, when averaging across all 18 occupations, we found that the female stereotype values were below zero for most of the century, indicating that occupations were generally male-stereotyped. This pattern is further supported by the adjusted female stereotype measure, which was consistently negative. Second, while the average female stereotype of the 18 occupations shows some fluctuations, most likely due to the lower frequency of occupational terms and the smaller size of the occupation list, it shows a statistically significant upward trend, suggesting that occupational terms in language gradually shifted toward stronger associations with women over time. This pattern likely reflects broader societal changes in women’s participation in the labor market. Finally, the correlation analysis demonstrates that, when pooling across all occupations and years, occupations with higher female representation were more likely to be associated with stronger female stereotypes in language. This supports the idea that word embedding–based stereotype measures capture, at least to some extent, real-world demographic shifts. The fact that three occupations (doctor, dentist, and plasterer) showed significant positive correlations further reinforces this interpretation. Particularly for doctors and dentists, the strong correlations suggest that the increase of women in these professions was mirrored in linguistic representations.

Taken together, our findings show that language-based gender stereotypes gradually shifted to reflect women’s growing participation in public and professional life. However, the Home domain remained consistently female-stereotyped, and most occupations were still male-stereotyped. Furthermore, occupations with higher female representation such as doctors and dentists showed positive correlations with female stereotypes, indicating that linguistic representations partially mirrored real-world demographic changes despite persistent cultural and structural barriers.

7 Limitations

We acknowledge several limitations in our study. First, our training corpus may reflect only limited perspectives on gender norms. The Japanese book publishing industry, particularly in earlier decades, was likely dominated by male and highly educated authors. This imbalance may have influenced how men and women are portrayed in books and magazines, shaping the gendered language patterns captured in our analysis. Second, our results may be heavily influenced by the specific word lists used to represent the concepts of home, work, male, and female [34, 35]. To mitigate this, we constructed the word lists based on a carefully validated dictionary and further refined them by removing gendered terms from the home and work domains. Third, since the corpus includes only up to five-grams, the maximum window size we could use for training was limited to four, which may have affected the results. Finally, many occupations used in the analysis had relatively few data points in the national census, making it difficult to detect statistically reliable correlations. The smaller word list for occupations, combined with the lower frequency of occupational terms, also introduces volatility in stereotype values.

Appendix A The explanations for the corpus-wide trend of female stereotype values

While the corpus-wide trend of female stereotype values reported in Section 5.3 is likely driven by societal changes in gender norms, it may also be partly confounded by historical shifts in Japanese writing.

To investigate this, we calculated the correlation between the relative frequency difference of gender words for each year (total relative frequency of female words minus male words) and the average stereotype values of all words for each year, as reported in Section 5.3 (Figure 8). The results show a strong correlation ($r=0.60$, $p<0.01$): as the frequency difference decreases, the average female stereotype value tends to increase. While this analysis is correlational, the findings suggest that word frequency differences in gender terms may have contributed to the observed corpus-wide trend in female stereotype values.

Figure 8: The correlation between the relative frequency difference of gender words (total relative frequency of female words - male words for each year) and the average stereotype values for each year.
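The yearly frequency difference used in this analysis can be sketched as below. The counts, yearly totals, and the function name `relative_frequency_gap` are hypothetical, and the word lists are reduced to one word per gender purely for illustration; relative frequency is assumed here to mean a word's count divided by the total token count of that year.

```python
def relative_frequency_gap(counts_by_year, totals_by_year, female_words, male_words):
    """Per year: total relative frequency of female words minus that of male words."""
    gap = {}
    for year, counts in counts_by_year.items():
        total = totals_by_year[year]
        female = sum(counts.get(w, 0) for w in female_words) / total
        male = sum(counts.get(w, 0) for w in male_words) / total
        gap[year] = female - male
    return gap

# Hypothetical raw counts and yearly token totals (illustration only).
counts_by_year = {
    1900: {"女": 50, "彼": 100},
    1950: {"女": 80, "彼": 400},
    1999: {"女": 90, "彼": 150},
}
totals_by_year = {1900: 10_000, 1950: 20_000, 1999: 30_000}

gap = relative_frequency_gap(counts_by_year, totals_by_year, ["女"], ["彼"])
for year in sorted(gap):
    print(year, round(gap[year], 5))
```

The resulting yearly series is what would then be correlated with the yearly average stereotype value.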

We next examined the relative frequency changes of individual gender words (Figure 9). Interestingly, a single word for each gender exhibits substantially higher frequency than the others: 彼 (he) among male words and 女 (woman) among female words. This suggests that overall frequency patterns for gender terms are disproportionately shaped by these high-frequency words. While both 彼 (he) and 女 (woman) exhibit relatively similar trends, the sharper rise in 彼 (he) until around 1950 and its subsequent decline are particularly pronounced. This pattern indicates that the fluctuations of 彼 (he) likely played a central role in widening and narrowing the gender frequency gap over the decades.

Figure 9: The relative frequency change of each male and female word over the century, shown in two panels (Male Words and Female Words).

The next question is: why does 彼 (he) show this trend? One plausible explanation lies in the historical context of Japanese writing. In the early 1900s, Western-influenced writing styles became popular, as many believed that modernizing the Japanese language was essential for importing Western science and technology. Since third-person pronouns like 彼 (he) did not originally exist in Japanese, their absence was often perceived as a deficiency or sign of backwardness in the language. As a result, many writers and translators adopted Western-style conventions in their work [36, 37], which likely contributed to the sharp rise in the use of 彼 (he). However, these Western-influenced styles did not fully align with traditional Japanese norms [37]. For example, Japanese is a “pro-drop” language, where speakers commonly omit grammatical arguments such as subjects, objects, or pronouns, including third-person pronouns, when they are clear from context [38]. Such writing often appeared unnatural to native readers and writers, which likely contributed to the subsequent decline in the explicit use of third-person pronouns like 彼.

In summary, while the rise in female stereotype values likely reflects societal changes, it also appears partly shaped by historical shifts in Japanese writing. In particular, the sharp rise and later decline of 彼 (he), driven by the early 20th-century adoption and later rejection of Western-influenced writing styles, may have influenced the female stereotype values calculated through the embedding models.

Appendix B The female stereotype change of individual occupations

Figure 10 shows the change in female stereotypes for each occupation.

Figure 10: The changes of female stereotypes for each occupation over the century.

Appendix C Yearly per-word average frequencies across Home, Work, Politics, Occupation domains.

We first calculated the mean frequency of each word across all years (1900–1999) within each domain. These word-level means were then averaged across all words to obtain a single overall average frequency for each domain. This value can be interpreted as the typical yearly per-word average frequency within each domain. Table 8 shows these yearly per-word average frequencies, indicating that the value for occupations is lower than those observed in the Home, Work, and Politics domains.

 | Home | Work | Politics | Occupation
Yearly per-word average frequency | 35,790 | 76,516 | 98,447 | 17,878
Table 8: Yearly per-word average frequencies across Home, Work, Politics, Occupation domains.
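The two-step averaging behind these figures can be written as a short sketch. The word labels and counts below are hypothetical, and `domain_average_frequency` is an illustrative name.

```python
def domain_average_frequency(yearly_counts_by_word):
    """Mean over words of each word's mean yearly frequency."""
    # Step 1: average each word's frequency over the years.
    word_means = [sum(counts) / len(counts) for counts in yearly_counts_by_word.values()]
    # Step 2: average those word-level means over the words in the domain.
    return sum(word_means) / len(word_means)

# Hypothetical yearly counts for two words in one domain (illustration only).
toy_domain = {
    "word_a": [10, 20, 30],
    "word_b": [100, 200, 300],
}

print(domain_average_frequency(toy_domain))
```

Averaging per word first gives every word equal weight in the domain figure, regardless of how many years of counts it contributes.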

Appendix D The quality of word embedding models

We also evaluate the quality of the word embedding models trained for each year. For this, we use JWSAN, a large-scale Japanese dataset designed to evaluate distributional semantic models in terms of word similarity and word association [39]. Word similarity captures semantic closeness in meaning (e.g., synonyms), whereas word association reflects conceptual or functional relatedness.

Figures 11(a) and 11(c) show the yearly changes in similarity and association scores. On average, the similarity score is 0.43 and the association score is 0.47. We also find that yearly similarity and association scores are strongly correlated with the vocabulary size of the corpus (Figures 11(b) and 11(d)), suggesting that vocabulary size had a substantial impact on the quality of the yearly word embedding models.

(a) The similarity score across the years
(b) The correlation between similarity score and vocabulary size for each year
(c) The association score across the years
(d) The correlation between association score and vocabulary size for each year
Figure 11: The quality of each year’s word embedding models measured by similarity and association scores, and their correlation with vocabulary size.

References

  • Matsui and Ferrara [2024] Matsui, A., Ferrara, E.: Word embedding for social sciences: An interdisciplinary survey. PeerJ Computer Science 10, 2562 (2024)
  • Bolukbasi et al. [2016] Bolukbasi, T., Chang, K.-W., Zou, J.Y., Saligrama, V., Kalai, A.T.: Man is to computer programmer as woman is to homemaker? debiasing word embeddings. Advances in neural information processing systems 29 (2016)
  • Caliskan et al. [2017] Caliskan, A., Bryson, J.J., Narayanan, A.: Semantics derived automatically from language corpora contain human-like biases. Science 356(6334), 183–186 (2017)
  • Garg et al. [2018] Garg, N., Schiebinger, L., Jurafsky, D., Zou, J.: Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences 115(16), 3635–3644 (2018)
  • Jones et al. [2020] Jones, J.J., Amin, M.R., Kim, J., Skiena, S.: Stereotypical gender associations in language have decreased over time. Sociological Science 7, 1–35 (2020)
  • Kozlowski et al. [2019] Kozlowski, A.C., Taddy, M., Evans, J.A.: The geometry of culture: Analyzing the meanings of class through word embeddings. American Sociological Review 84(5), 905–949 (2019)
  • Charlesworth et al. [2022] Charlesworth, T.E., Caliskan, A., Banaji, M.R.: Historical representations of social groups across 200 years of word embeddings from google books. Proceedings of the National Academy of Sciences 119(28), 2121798119 (2022)
  • Bhatia [2017] Bhatia, S.: The semantic representation of prejudice and stereotypes. Cognition 164, 46–60 (2017)
  • Lewis and Lupyan [2020] Lewis, M., Lupyan, G.: Gender stereotypes are reflected in the distributional structure of 25 languages. Nature human behaviour 4(10), 1021–1028 (2020)
  • Charlesworth et al. [2021] Charlesworth, T.E., Yang, V., Mann, T.C., Kurdi, B., Banaji, M.R.: Gender stereotypes in natural language: Word embeddings show robust consistency across child and adult language corpora of more than 65 million words. Psychological Science 32(2), 218–240 (2021)
  • Katz and Braly [1933] Katz, D., Braly, K.: Racial stereotypes of one hundred college students. The Journal of Abnormal and Social Psychology 28(3), 280 (1933)
  • Gilbert [1951] Gilbert, G.M.: Stereotype persistence and change among college students. The Journal of Abnormal and Social Psychology 46(2), 245 (1951)
  • Greenwald et al. [1998] Greenwald, A.G., McGhee, D.E., Schwartz, J.L.: Measuring individual differences in implicit cognition: the implicit association test. Journal of personality and social psychology 74(6), 1464 (1998)
  • Bergsieker et al. [2012] Bergsieker, H.B., Leslie, L.M., Constantine, V.S., Fiske, S.T.: Stereotyping by omission: eliminate the negative, accentuate the positive. Journal of personality and social psychology 102(6), 1214 (2012)
  • Bhatia and Bhatia [2021] Bhatia, N., Bhatia, S.: Changes in gender stereotypes over time: A computational analysis. Psychology of Women Quarterly 45(1), 106–125 (2021)
  • Pharr [1981] Pharr, S.J.: Political women in japan: The search for a place in political life. Univ of California Press (1981)
  • Molony [2018] Molony, B.: Feminism in japan. In: Oxford Research Encyclopedia of Asian History, (2018)
  • Okuyama [2021] Okuyama, Y.: Empowering women through radio: Evidence from occupied japan. Unpublished, Uppsala University (2021)
  • van den Berg [2022] Berg, P.: Equality of men and women in article 24 of the japanese constitution (1947): The role of beate sirota (1923-2012) and beyond. Osaka University Law Review 69(1), 23–49 (2022)
  • Garon [1998] Garon, S.: Molding japanese minds: The state in everyday life. Princeton University Press (1998)
  • Yamaguchi [2019] Yamaguchi, K.: Impediments to the advancement of women in the japanese employment system: Theoretical overview and the purpose of this book. Gender Inequalities in the Japanese Workplace and Employment: Theories and Empirical Evidence, 1–45 (2019)
  • Yamaguchi [2016] Yamaguchi, K.: Determinants of the gender gap in the proportion of managers among white-collar regular workers in japan. Japan Labor Review 13(3), 7–31 (2016)
  • Yamaguchi [2019] Yamaguchi, K.: Causes and effects of gender occupational segregation: Overlooked obstacles to gender equality. In: Gender Inequalities in the Japanese Workplace and Employment: Theories and Empirical Evidence, pp. 83–110. Springer, ??? (2019)
  • Eto [2020] Eto, M.: Women and political inequality in japan: Gender imbalanced democracy. Routledge (2020)
  • Kudo [2005] Kudo, T.: Mecab: Yet another part-of-speech and morphological analyzer. http://mecab.sourceforge.net/ (2005)
  • Ogiso et al. [2013] Ogiso, T., Komachi, M., Matsumoto, Y.: Morphological analysis of historical japanese text. Journal of Natural Language Processing 20(5), 727–748 (2013)
  • Mikolov et al. [2013] Mikolov, T., Yih, W.-t., Zweig, G.: Linguistic regularities in continuous space word representations. In: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 746–751 (2013)
  • Hamilton et al. [2016] Hamilton, W.L., Leskovec, J., Jurafsky, D.: Diachronic word embeddings reveal statistical laws of semantic change. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1489–1501 (2016)
  • Zhao et al. [2017] Zhao, Z., Liu, T., Li, S., Li, B., Du, X.: Ngram2vec: Learning improved word representations from ngram co-occurrence statistics. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 244–253 (2017)
  • Igarashi et al. [2022] Igarashi, T., Okuda, S., Sasahara, K.: Development of the japanese version of the linguistic inquiry and word count dictionary 2015. Frontiers in psychology 13, 841534 (2022)
  • Pennebaker et al. [2015] Pennebaker, J.W., Boyd, R.L., Jordan, K., Blackburn, K.: The development and psychometric properties of liwc2015 (2015)
  • Tausczik and Pennebaker [2010] Tausczik, Y.R., Pennebaker, J.W.: The psychological meaning of words: Liwc and computerized text analysis methods. Journal of language and social psychology 29(1), 24–54 (2010)
  • Fast et al. [2016] Fast, E., Chen, B., Bernstein, M.S.: Empath: Understanding topic signals in large-scale text. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 4647–4657 (2016)
  • Ethayarajh et al. [2019] Ethayarajh, K., Duvenaud, D., Hirst, G.: Understanding undesirable word embedding associations. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 1696–1705 (2019)
  • Antoniak and Mimno [2021] Antoniak, M., Mimno, D.: Bad seeds: Evaluating lexical methods for bias measurement. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 1889–1904 (2021)
  • Inoue [2002] Inoue, M.: Gender, language, and modernity: Toward an effective history of japanese women’s language. American Ethnologist 29(2), 392–422 (2002)
  • Saito [2016] Saito, M.: The power of translated literature in japan: The introduction of new expressions through translation in the meiji era (1868–1912). Perspectives 24(3), 417–430 (2016)
  • Ishiyama [2019] Ishiyama, O.: Diachrony of personal pronouns in japanese: A functional and cross-linguistic perspective. John Benjamins Publishing Company 344 (2019)
  • Inohara and Utsumi [2022] Inohara, K., Utsumi, A.: Jwsan: Japanese word similarity and association norm. Language Resources and Evaluation 56(1), 109–137 (2022)
  • Kurdi et al. [2019] Kurdi, B., Mann, T.C., Charlesworth, T.E., Banaji, M.R.: The relationship between implicit intergroup attitudes and beliefs. Proceedings of the National Academy of Sciences 116(13), 5862–5871 (2019)
  • Diekman and Eagly [2000] Diekman, A.B., Eagly, A.H.: Stereotypes as dynamic constructs: Women and men of the past, present, and future. Personality and social psychology bulletin 26(10), 1171–1188 (2000)
  • Donnelly and Twenge [2017] Donnelly, K., Twenge, J.M.: Masculine and feminine traits on the bem sex-role inventory, 1993–2012: A cross-temporal meta-analysis. Sex roles 76, 556–565 (2017)
  • Eagly et al. [2020] Eagly, A.H., Nater, C., Miller, D.I., Kaufmann, M., Sczesny, S.: Gender stereotypes have changed: A cross-temporal meta-analysis of us public opinion polls from 1946 to 2018. American psychologist 75(3), 301 (2020)
  • Zhao et al. [2018a] Zhao, J., Wang, T., Yatskar, M., Ordonez, V., Chang, K.-W.: Gender bias in coreference resolution: Evaluation and debiasing methods. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pp. 15–20 (2018)
  • Zhao et al. [2018b] Zhao, J., Zhou, Y., Li, Z., Wang, W., Chang, K.-W.: Learning gender-neutral word embeddings. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 4847–4853 (2018)
  • Firth [1957] Firth, J.: A synopsis of linguistic theory, 1930-1955. Studies in linguistic analysis, 10–32 (1957)
  • Karlins et al. [1969] Karlins, M., Coffman, T.L., Walters, G.: On the fading of social stereotypes: studies in three generations of college students. Journal of personality and social psychology 13(1), 1 (1969)
  • Nosek et al. [2009] Nosek, B.A., Smyth, F.L., Sriram, N., Lindner, N.M., Devos, T., Ayala, A., Bar-Anan, Y., Bergh, R., Cai, H., Gonsalkorale, K., et al.: National differences in gender–science stereotypes predict national sex differences in science and math achievement. Proceedings of the National Academy of Sciences 106(26), 10593–10597 (2009)
  • Prates et al. [2020] Prates, M.O., Avelar, P.H., Lamb, L.C.: Assessing gender bias in machine translation: a case study with google translate. Neural Computing and Applications 32, 6363–6381 (2020)
  • Black and van Esch [2020] Black, J.S., Esch, P.: Ai-enabled recruiting: What is it and how should a manager use it? Business Horizons 63(2), 215–226 (2020)
  • Devine and Elliot [1995] Devine, P.G., Elliot, A.J.: Are racial stereotypes really fading? the princeton trilogy revisited. Personality and social psychology bulletin 21(11), 1139–1150 (1995)
  • Madon et al. [2001] Madon, S., Guyll, M., Aboufadel, K., Montiel, E., Smith, A., Palumbo, P., Jussim, L.: Ethnic and national stereotypes: The princeton trilogy revisited and revised. Personality and social psychology bulletin 27(8), 996–1010 (2001)
  • Williams and Best [1977] Williams, J.E., Best, D.L.: Sex stereotypes and trait favorability on the adjective check list. Educational and psychological measurement 37(1), 101–110 (1977)
  • Williams and Best [1990] Williams, J.E., Best, D.L.: Measuring sex stereotypes: A multination study, rev. Sage Publications, Inc (1990)
  • Cherlin and Walters [1981] Cherlin, A., Walters, P.B.: Trends in united states men’s and women’s sex-role attitudes: 1972 to 1978. American Sociological Review, 453–460 (1981)
  • Baron and Banaji [2006] Baron, A.S., Banaji, M.R.: The development of implicit attitudes: Evidence of race evaluations from ages 6 and 10 and adulthood. Psychological science 17(1), 53–58 (2006)
  • Ziegert and Hanges [2005] Ziegert, J.C., Hanges, P.J.: Employment discrimination: the role of implicit attitudes, motivation, and a climate for racial bias. Journal of applied psychology 90(3), 553 (2005)
  • Banse et al. [2001] Banse, R., Seise, J., Zerbes, N.: Implicit attitudes towards homosexuality: Reliability, validity, and controllability of the iat. Zeitschrift für experimentelle Psychologie 48(2), 145–160 (2001)
  • Jellison et al. [2004] Jellison, W.A., McConnell, A.R., Gabriel, S.: Implicit and explicit measures of sexual orientation attitudes: In group preferences and related behaviors and beliefs among gay and straight men. Personality and Social Psychology Bulletin 30(5), 629–642 (2004)
  • Kiefer and Sekaquaptewa [2007] Kiefer, A.K., Sekaquaptewa, D.: Implicit stereotypes and women’s math performance: How implicit gender-math stereotypes influence women’s susceptibility to stereotype threat. Journal of experimental social psychology 43(5), 825–832 (2007)
  • Nosek et al. [2002] Nosek, B.A., Banaji, M.R., Greenwald, A.G.: Harvesting implicit group attitudes and beliefs from a demonstration web site. Group Dynamics: Theory, research, and practice 6(1), 101 (2002)
  • Greenwald et al. [2002] Greenwald, A.G., Banaji, M.R., Rudman, L.A., Farnham, S.D., Nosek, B.A., Mellott, D.S.: A unified theory of implicit attitudes, stereotypes, self-esteem, and self-concept. Psychological review 109(1), 3 (2002)
  • Chen [2023] Chen, Z.: Ethics and discrimination in artificial intelligence-enabled recruitment practices. Humanities and Social Sciences Communications 10(1), 1–12 (2023)
  • Font and Costa-jussà [2019] Font, J.E., Costa-jussà, M.R.: Equalizing gender bias in neural machine translation with word embeddings techniques. In: Proceedings of the First Workshop on Gender Bias in Natural Language Processing, pp. 147–154 (2019)
  • Wang et al. [2020] Wang, T., Lin, X.V., Rajani, N.F., McCann, B., Ordonez, V., Xiong, C.: Double-hard debias: Tailoring word embeddings for gender bias mitigation. arXiv preprint arXiv:2005.00965 (2020)
  • Iwanaga [1998] Iwanaga, K.: Women in japanese politics: A comparative perspective. Stockholm University, Center for Pacific Asia Studies (1998)
  • Gordon [2021] Gordon, A.: A modern history of japan from tokugawa times to the present. Oxford University Press (2021)
  • Ward and Sakamoto [2019] Ward, R.E., Sakamoto, Y.: Democratizing japan: the allied occupation. University of Hawaii Press (2019)
  • Moore and Robinson [2004] Moore, R.A., Robinson, D.L.: Partners for democracy: Crafting the new japanese state under macarthur. Oxford University Press (2004)
  • Michel et al. [2011] Michel, J.-B., Shen, Y.K., Aiden, A.P., Veres, A., Gray, M.K., Team, G.B., Pickett, J.P., Hoiberg, D., Clancy, D., Norvig, P., et al.: Quantitative analysis of culture using millions of digitized books. science 331(6014), 176–182 (2011)
  • Davies [2008] Davies, M.: The corpus of contemporary American English: 450 million words, 1990-present (2008)
  • Obana [2003] Obana, Y.: The use of kare/kanojo in japanese society today. New Zealand Journal of Asian Studies 5, 139–155 (2003)