Research highlights
- In our meta-analysis (published in European Economic Review), we synthesise a quasi-exhaustive register of correspondence audit experiments in hiring from across the globe published between 2005 and 2020. We show that hiring discrimination against candidates who have disabilities, are older, or are less physically attractive is as severe as, or more severe than, discrimination against those with distinct racial or ethnic characteristics. Older candidates face more discrimination in Europe than in the United States. We find no clear evidence of a decline in discrimination over time.
- In an experimental study (published in Computers in Human Behavior: Artificial Humans), I explore whether OpenAI’s GPT-3.5 language model displays ethnic or gender bias in job applicant screening using an audit approach. I show that GPT’s evaluations are influenced by ethnic and gender cues, with some evidence for a gender–ethnicity interaction. The chatbot produces output that is stereotyped in much the same way as human judgements, e.g., favouring ethnic minorities for jobs with poor working conditions or women for female-dominated occupations, such as clothing seller.
- In our contextual study of hiring discrimination (published in Labour Economics), we empirically test theoretically linked moderators of discrimination. We observe a distinct role of the organisation in shaping hiring discrimination, with ethnic minorities facing less discrimination when applying to non-profits or large organisations. We also find some evidence that hiring discrimination increases in jobs with greater interaction among colleagues or lower labour shortages.
Selected works
Below is a selection of my research. The full overview is available on my Google Scholar page. More details are also available on my ResearchGate or ORCID pages.
1. Publications in international peer-reviewed journals
1.1. Discrimination
‘How do employers view applicants with and without children?’ with Morien El Haj¹, Axana Dalle, and Stijn Baert (2025). Journal of Marriage and Family. [published version]
This study investigates how parenthood influences employers’ hiring decisions and the underlying signals that drive this discrimination. Prior research has consistently shown a motherhood penalty in hiring, whereas evidence on fatherhood remains less clear. Yet, most studies simplify parenthood into a binary distinction between parents and non-parents, neglecting potential variations based on the number and age of children. Moreover, little research has examined the underlying reasons for these disparate hiring decisions. A state-of-the-art vignette experiment was conducted with 452 real recruiters in Flanders (Belgium). Recruiters evaluated fictitious job applicants, whose parental status varied, on invitation rating and sixteen theoretically relevant stigmas. Mothers received lower invitation ratings than non-mothers, regardless of the number and age of children, which can be understood by a range of negative stigmas, including lower flexibility, higher absenteeism risk, higher career break risk and lower willingness to work overtime. For men, a penalty was found when they had many children, especially older children. Compared to fathers with fewer children, those with three children were seen as less ambitious, less flexible, less likely to work overtime and more likely to have experienced recent loss of skills. The study highlights the persistent motherhood penalty and demonstrates that the fatherhood effect depends on the number and age of children.
‘Nothing really matters: Evaluating demand-side moderators of age discrimination in hiring’ with Axana Dalle¹ and Stijn Baert (2024). Socio-Economic Review. [published version, IZA version]
As age discrimination hampers the OECD’s ambition to extend the working population, an efficient anti-discrimination policy targeted at the right employers is critical. Therefore, the context in which age discrimination is most prevalent must be identified. In this study, we thoroughly review the current theoretical arguments and empirical findings regarding moderators of age discrimination in different demand-side domains (i.e. decision-maker, vacancy, occupation, organization and sector). Our review demonstrates that the current literature is highly fragmented and often lacks field-experimental evidence, raising concerns about its internal and external validity. To address this gap, we conducted a correspondence experiment and systematically linked the resulting data to external data sources. In so doing, we were able to study the previously identified demand-side moderators within a single multi-level analysis and simultaneously control for multiple correlations between potential moderators and discrimination estimates. Having done so, we found no empirical support for any of these moderators.
‘Understanding ethnic hiring discrimination: A contextual analysis of experimental evidence’ with Axana Dalle, Fanny D’hondt, Pieter-Paul Verhaeghe, and Stijn Baert (2023). Labour Economics, 85. [published version]
Previous research has demonstrated that context matters in understanding unequal treatment in hiring; for example, some studies have illustrated that hiring discrimination is low in large organisations or high in public-facing occupations. Following a review of the recent literature on ethnic hiring discrimination, we identified fourteen plausible moderators (i.e. discrimination correlates) from which we derived an equal number of hypotheses related to taste-based and statistical discrimination theories. We empirically tested these hypotheses through a moderation analysis of data from a correspondence experiment supplemented with occupation, organisation, and sector characteristics. Our empirical approach allowed us to simultaneously evaluate and control the interaction effects of multiple contextual factors with ethnic hiring discrimination. Overall, we find that minority (non-Flemish) candidates receive significantly fewer positive responses to their job applications than majority (Flemish) candidates. In particular, non-Flemish candidates experience significantly less discrimination when applying to not-for-profit organisations or organisations with a large workforce. We also find partial empirical support for the hypotheses that hiring discrimination is high in occupations requiring much interaction between colleagues and in occupations where labour market tightness is low. Future research avenues include evaluating the rationale behind the discrimination correlates mentioned above and testing the replicability of this study’s findings across different institutional contexts, labour markets, and grounds for discrimination.
‘The state of hiring discrimination: A meta-analysis of (almost) all recent correspondence experiments’ with Siel Vermeiren and Stijn Baert (2023). European Economic Review, 151 (lead article). [published version]
Notwithstanding the improved integration of various minority groups in the workforce, unequal treatment in hiring still hinders many individuals’ access to the labour market. To tackle this inaccessibility, it is essential to know which and to what extent minority groups face hiring discrimination. This meta-analysis synthesises a quasi-exhaustive register of correspondence experiments on hiring discrimination published between 2005 and 2020. Using a random-effects model, we computed pooled discrimination ratios concerning ten discrimination grounds upon which unequal treatment in hiring is forbidden by law. Our meta-analysis shows that hiring discrimination against candidates with disabilities, older candidates, and less physically attractive candidates seems equally severe as the unequal treatment of candidates with salient racial or ethnic characteristics. Moreover, hiring discrimination against older applicants is more prominent in Europe than in the United States. Last, while we initially find a significant decrease in ethnic hiring discrimination in (Western) Europe, we find no structural evidence of recent temporal changes in hiring discrimination when controlling for the minority groups considered, at the country level, or based on the various other grounds within the scope of this review.
‘Is labour market discrimination against ethnic minorities better explained by taste or statistics? A systematic review of the empirical evidence’ with Stijn Baert, Abel Ghekiere, Pieter-Paul Verhaeghe, and Eva Derous (2022). Journal of Ethnic and Migration Studies, 48(17). [published version]
To mitigate ethnic labour market discrimination, it is essential to understand its underlying mechanisms because different mechanisms call for different counteracting measures. To this end, we reviewed the recent literature that confronts the theories of taste-based and statistical discrimination with the empirical reality. Whereas the empirical evidence for both mechanisms is generally mixed, (field) experimental research, which predominantly focuses on hiring outcomes, appears to yield proportionately more evidence in favour of taste-based discrimination vis-à-vis statistical discrimination. This finding suggests that the taste-based mechanism may better explain ethnic discrimination in hiring. However, we also observe that the measurement operationalisations of the mechanisms vary substantially between studies and that alternative theoretical interpretations of some of the evidence are plausible. Taken together, additional research efforts, using clear measurement standards and appropriate synthesis methods, are required to solidify the review’s main finding.
‘Loss aversion in taste-based employee discrimination: Evidence from a choice experiment’ with Stijn Baert and Eva Derous (2021). Economics Letters, 208. [published version]
Using a choice experiment, we test whether taste-based employee discrimination against ethnic minorities is susceptible to loss aversion. In line with empirical evidence from previous research, our results indicate that introducing a hypothetical wage penalty for discriminatory choice behaviour lowers discrimination and that higher penalties have a greater effect. Most notably, we find that the propensity to discriminate is significantly lower when this penalty is loss-framed rather than gain-framed. From a policy perspective, it could therefore be more effective to financially penalise taste-based discriminators than to incentivise them not to discriminate.
1.2. GenAI
‘Computer says “no”: Exploring systemic bias in ChatGPT using an audit approach’ (2024). Computers in Human Behavior: Artificial Humans, 2(1). [published version, arXiv version, GitHub project]
Large language models offer significant potential for increasing labour productivity, such as streamlining personnel selection, but raise concerns about perpetuating systemic biases embedded into their pre-training data. This study explores the potential ethnic and gender bias of ChatGPT, a chatbot producing human-like responses to language tasks, in assessing job applicants. Using the correspondence audit approach from the social sciences, I simulated a CV screening task with 34,560 vacancy-CV combinations where the chatbot had to rate fictitious applicant profiles. Comparing ChatGPT’s ratings of Arab, Asian, Black American, Central African, Dutch, Eastern European, Hispanic, Turkish, and White American male and female applicants, I show that ethnic and gender identity influence the chatbot’s evaluations. Ethnic discrimination is more pronounced than gender discrimination and mainly occurs in jobs with favourable labour conditions or requiring greater language proficiency. In contrast, gender bias emerges in gender-atypical roles. These findings suggest that ChatGPT’s discriminatory output reflects a statistical mechanism echoing societal stereotypes. Policymakers and developers should address systemic bias in language model-driven applications to ensure equitable treatment across demographic groups. Practitioners should exercise caution, given the adverse impact these tools can (re)produce, especially in selection decisions involving humans.
1.3. Work regimes
‘Speeding up on the learning curve: The evaluation of telework following a surge in telework experience’ with Eline Moens¹, Liam D’hert, and Stijn Baert (2025). Applied Economics Letters. [published version, IZA version]
This research letter adds to the literature on the importance of telework experience in employee evaluation by leveraging the telework experience accumulated during the COVID-19 crisis. We conducted a follow-up survey on the evaluation of telework exactly 3 years after our initial data collection in 2020. We find evidence of a learning curve regarding self-reported i) efficiency in performing tasks, ii) work-life balance, and iii) concentration during work, characterized by a more positive evaluation as telework experience increased.
‘Time Tetris: A longitudinal study on compressed schedules and workplace well-being at IKEA’ with Kristen du Bois¹, Stijn Baert, and Eva Derous (2025). BMC Public Health, 25(1). [published version, OSF version]
Compressed schedules, where workers perform longer daily hours to enjoy additional days off, are increasingly promoted as a workplace well-being intervention. Nevertheless, their implications for work-related well-being outcomes, such as recovery from work and burnout risk, are understudied. This gap leaves employers with little evidence on whether and how the arrangement contributes to workplace well-being. IKEA Belgium offered its employees the option to enter compressed schedules in the aftermath of a national labour reform aimed at improving well-being and reducing burnout. We collected data on psychological detachment from work, work-related exhaustion, and burnout risk in four waves before and after implementation. We used mixed-effects growth models to estimate the within-subjects changes in these three domains, and two-way fixed effects models to compare changes with those from a non-treated comparison group. Workers experienced increased psychological detachment from work in compressed schedules, yet we saw no decrease in work-related exhaustion or burnout risk. While between-subjects analyses confirm that the increase in psychological detachment is related to treatment, they also hint that this association may fade out during summer when all workers take more extended breaks from work. While workers in compressed schedules may mentally switch off from work more effectively, this does not translate into decreased burnout risk scores. Consistent with theoretical expectations, policymakers and employers should be cautious in assuming that the arrangements significantly reduce burnout.
1.4. COVID-19
‘The COVID-19 crisis and telework: A research survey on experiences, expectations and hopes’ with Eline Moens¹, Philippe Sterkens, Johannes Weytjens, and Stijn Baert (2022). European Journal of Health Economics, 23. [published version]
While a considerable number of employees across the globe are being forced to work from home due to the COVID-19 crisis, it is a guessing game as to how they are experiencing this current surge in telework. Therefore, we examined employee perceptions of telework on various life and career aspects, distinguishing between typical and extended telework during the COVID-19 crisis. To this end, we conducted a state-of-the-art web survey among Flemish employees. Notwithstanding this exceptional time of sudden, obligatory and high-intensity telework, our respondents mainly attribute positive characteristics to telework, such as increased efficiency and a lower risk of burnout. The results also suggest that the overwhelming majority of the surveyed employees believe that telework (85%) and digital conferencing (81%) are here to stay. In contrast, some fear that telework diminishes their promotion opportunities and weakens ties with their colleagues and employer.
‘How do employees think the COVID-19 crisis will affect their careers?’ with Eline Moens, Philippe Sterkens, Johannes Weytjens, and Stijn Baert (2021). PLOS One, 23. [published version]
This study is the first in the world to investigate the expected impact of the COVID-19 crisis on career outcomes and career aspirations. To this end, high-quality survey research with a relevant sample of Flemish (Belgian) employees was conducted. About 21% of them fear losing their jobs due to the crisis, and 14% are concerned that they will even lose their jobs in the near future. In addition, 26% expect to miss out on promotions that they would have received had the COVID-19 crisis not occurred. This fear of a negative impact is higher in vulnerable groups, such as migrants. In addition, we observe that many respondents believe they will look at the labour market differently and will have different work-related priorities in the future. In this respect, more than half of the respondents indicate that they have attached more importance to working conditions and work-life balance since the COVID-19 crisis.
2. Policy papers
‘Hiring discrimination across vulnerable groups’ with Stijn Baert and Brecht Neyt (2025). IZA World of Labor. [published version, OSF version]
Over the past decades, academics worldwide have conducted experiments with fictitious job applications to measure discrimination in hiring. This discrimination leads to underutilization of labor market potential and higher unemployment rates for individuals from vulnerable groups. Collectively, the insights from the published research suggest that three groups face more discrimination than ethnic minorities: people with disabilities, less physically attractive people, and older people. The discrimination found in Western economies generally persists across countries and is stable over time, although some variation exists.
3. Working papers
‘Ethnocultural identity and hiring decisions: The role of social desirability and employer bias’ with Louise Devos¹, Kristen du Bois, and Stijn Baert (2026). [UGent version]
Hiring discrimination against candidates from ethnocultural minority groups is a persistent concern in contemporary labour markets. This study examines how professional recruiters evaluate fictitious job applicants with profiles that systematically vary in signals that form ethnocultural identity rather than isolated minority markers. Using a preregistered factorial survey experiment true to recruiters’ organisational context, we assess how greater perceived distance from the ethnocultural majority is associated with hiring intentions. Structural equation modelling shows that lower perceived ethnocultural alignment is strongly and negatively associated with the likelihood of a candidate being considered for a job interview. This bias is also reflected in the extent to which recruiters identify with a candidate, as well as in taste-based expectations and competence assessments related to communication, efficiency, and leadership. Methodologically, we reinforce the credibility of the experimental findings by explicitly addressing socially desirable responses using three complementary approaches. First, we used a validated scale that captures socially desirable response tendencies, excluding respondents with a strong tendency to such responding. Second, we implemented the nominative technique, reducing the normative pressure to report personal views. Third, we employed the Bayesian truth serum, weighting responses based on their informativeness and honesty. Across all specifications, perceived alignment with the ethnocultural majority emerges as a robust and consistent correlate of hiring intentions.
‘The impact of regional identity on hiring chances: An experiment examining employer bias’ with Louise Devos¹, Dagmar Claus, and Stijn Baert (2025). [IZA version]
Regional mobility is crucial for addressing labour shortages, as jobseekers from one region may fill vacancies in another region with few local candidates. However, this requires a willingness amongst employers to consider candidates from across regional borders. This study examines the influence of regional identity on hiring decisions in the Belgian labour market, focusing on perceptions of Flemish recruiters towards Flemish and Walloon candidates. Through a state-of-the-art vignette experiment, genuine Flemish recruiters evaluated fictitious resumes of school leavers that signalled regional identity through their name, place of birth, residential address, secondary school location, and/or language proficiency. Walloon candidates consistently score lower on key hiring metrics. Structural equation modelling reveals that Flemish employers hold negative perceptions of Walloon candidates, particularly regarding availability, interpersonal competency, attitude, and willingness of employers, employees, and clients to cooperate with them. These findings highlight the persistent role of regional identity stereotypes in reinforcing labour market inequalities and impeding mobility as a strategy to mitigate labour market tightness.
‘Too much of a good thing? Telework intensity and workplace experiences’ with Eline Moens¹, Kathleen Vangronsvelt, Ans De Vos, and Stijn Baert (2025). (Revise & resubmit after 1st round, Kyklos) [IZA version]
At a time when numerous organisations are urging a return to the office while many employees prefer to continue teleworking, it is crucial to ascertain the optimal level of telework intensity. In the present study, we determine this ideal level with respect to self-rated employee attitudes, behaviour, well-being, social relations and professional growth. Drawing on a five-wave longitudinal dataset, we apply fixed effects regression analyses to investigate associations between telework intensity and various dimensions of workplace experience. We offer more robust empirical evidence for favouring hybrid work schedules over an office-only or telework-only regime owing to significant advances in causal interpretation of linear and non-linear associations compared to the majority of existing studies that examine linear associations based on cross-sectional data. Our results point toward an inverted U-shaped association between telework intensity and self-rated job satisfaction, work-life balance, relationships with colleagues and professional development, with optimal levels peaking around 50% teleworking. For task efficiency and work concentration, the association appears to be concave with a plateau, stabilising at teleworking levels above 70%. Only between telework intensity and employer connectedness do we observe a slightly negative linear association.
‘Not a lucky break? Why and when a career hiatus hijacks hiring chances’ with Liam D’hert¹ and Stijn Baert (2024). (Revise & resubmit after 1st round, Labour Economics) [IZA version]
Sustaining social security systems amidst an ageing population requires (re)integrating the unemployed and inactive into work. However, stigma surrounding non-employment history can create barriers to finding a job. Whilst unemployment stigma is well-documented, inactivity stigma remains under the radar. To address whether, why, and when inactivity hinders hiring, we employed a vignette experiment where real-life recruiters rated fictitious applicants with varying non-employment breaks on hireability and productivity. Results reveal employers rank candidates by their reason for being out of work: those with training breaks rank highest, followed by former caregivers, the previously ill and the unemployed, and last, the discouraged. Productivity perceptions match this pattern. Trainees score highest for skills, motivation, cognition, discipline, reliability, flexibility, and trainability. Caregivers excel in perceived social skills but fall short on flexibility. The previously ill are seen as more motivated than the unemployed but likely raise health concerns. The discouraged trigger the harshest stigma, particularly for motivation and self-discipline. Longer lapses hurt hiring chances, but not for training breaks.
‘Fertility, pregnancy, and parenthood discrimination in the labour market: A systematic review’ with Morien El Haj¹, Elsy Verhofstadt, Luc Van Ootegem, and Stijn Baert (2024). [IZA version]
Disparities in labour market outcomes between parents and non-parents arise partly from discriminatory practices. Understanding these unfair practices is essential for fostering workplace equity. Our systematic review of the literature summarises employer discrimination based on various manifestations of parenthood in multiple labour market outcomes. Unlike previous studies, our review encompasses not only motherhood but also fatherhood and the stages preceding parenthood, namely fertility and pregnancy. In terms of labour market outcomes, we consider discrimination in hiring, remuneration, promotion, and dismissal. We also focus exclusively on experimental research, enabling causal conclusions about discrimination and its underlying mechanisms. Our synthesis suggests that employers consistently penalise women in the labour market when they have children, during pregnancy, and during their fertile years. In contrast, men often experience no adverse effects or even a premium when they have children. Researchers frequently find evidence of statistical discrimination as the primary explanation for their findings. Employers appear to rely predominantly on information based on norms and stereotypes to make decisions about parents in the labour market. We offer a roadmap for academics, policymakers, and employers to map and mitigate this phenomenon in the long term. In particular, we highlight fruitful directions for future research, including (i) more broadly assessing the effects of fertility, (ii) more effectively manipulating parenthood in experiments, (iii) more frequently investigating dismissal as a labour market outcome, and (iv) more profoundly examining the mechanisms of parenthood discrimination.
‘Unemployment, inactivity, and hiring chances: A systematic review and meta-analysis’ with Liam D’hert¹ and Stijn Baert (2024). (Revise & resubmit after 1st round, Socio-Economic Review) [IZA version, IZA opinion]
Policymakers’ push for higher employment rates requires the activation of long-term unemployed jobseekers and inactive persons. However, stigma related to unemployment or inactivity can hinder their hiring chances when applying for a job. This systematic literature review investigates whether, when, and why periods of not working are penalised in hiring. Our review confirms that employers generally treat the unemployed and inactive less favourably than their employed counterparts. A meta-regression analysis of transnational experimental data points to heterogeneity by the duration of being out of work: short-term unemployment of up to six months positively affects hiring prospects, while the adverse effects of unemployment scarring become noticeable after about twelve months. We highlight evidence for signalling mechanisms underlying this pattern: immediate availability offsets the negative signals in short spells, whereas expectations about reduced productivity primarily drive the negative impact of longer spells. The latter negative signal is more pronounced when unemployment rates are low.
4. Work in progress
‘The cyclicality of hiring discrimination’. [Draft]
I reanalyse over two decades’ worth of correspondence experiments, representing over 1.4 million applications, to test whether hiring discrimination varies over the labour market cycle, for which groups, and where. To strengthen causal identification, I introduce the meta-analytic event study method, which integrates dynamic treatment effect estimation within a meta-regression framework. The results suggest that discrimination rises when unemployment is high. By group, meta-analytic empirical evidence of countercyclicality is consistent for racial and ethnic minorities in Western Europe and for older workers, but mixed for sexual and gender minorities. In contrast, I detect no systematic cyclicality in discrimination for racial and gender minorities in North America. The results reconcile mixed findings from the present empirical literature, often covering single countries, single groups, and restricted timeframes. From a policy perspective, I argue that anti-discrimination efforts are most needed in slack labour markets, when discriminatory hiring typically intensifies.
‘Humans vs GPTs: Bias and validity in hiring decisions’. [OSF draft]
The advent of large language models (LLMs) may reshape hiring in the labour market. This paper investigates how generative pre-trained transformers (GPTs), i.e. OpenAI’s GPT-3.5, GPT-4, and GPT-4o, can aid hiring decisions. In a direct comparison between humans and GPTs on an identical hiring task, I show that GPTs tend to select candidates more liberally than humans but exhibit less ethnic bias. GPT-4 even slightly favours certain ethnic minorities. While LLMs may complement humans in hiring by making a (relatively extensive) pre-selection of job candidates, the findings suggest that they may mis-select due to a lack of contextual understanding and may reproduce pre-trained human bias at scale.
‘The state of rental discrimination: A meta-analysis across five groups in the housing market’ with Pieter-Paul Verhaeghe¹, Abel Ghekiere, Louise Devos, and Stijn Baert.
More than seventy-five years after the Universal Declaration of Human Rights affirmed equal access to adequate housing, evidence continues to show that prospective tenants are treated unequally on protected grounds. Such rental discrimination, i.e., behavioural unequal treatment linked to wider systems of exclusion, imposes individual burdens (e.g., time, cost, stress) and produces societal harms, including deepening inequality and spatial segregation. Building on the “gold standard” of correspondence experiments, this study meta-analytically synthesises field evidence from 2002-2024 to compare discrimination across five grounds, i.e., ethno-racial origin, gender, disability, sexual orientation, and social origin, and to assess how rates vary by country, time, and provider and application type.
‘A global comparison of hiring and housing discrimination across five groups’ with Louise Devos, Stijn Baert, and Pieter-Paul Verhaeghe.
While extensive empirical research documents discrimination in labour and housing markets, comparative insights across these domains remain limited. This meta-study addresses this gap by juxtaposing levels of discrimination across five legally protected grounds (race and ethnicity, sex and gender, disability, sexual orientation, and social class and wealth) in both markets. We apply hierarchical Bayesian and unrestricted weighted least squares meta-regression to global data from correspondence audit studies conducted from the 2000s to the 2020s. Doing so, we account for the metadata’s multilevel structure, including temporal and geographical factors, and control for potential publication bias. Our meta-analytic results uncover structural differences in discrimination, with racial and ethnic discrimination being greater in the labour market and discrimination based on wealth and social class being higher in the housing market. We attribute these discrepancies to a heightened focus on productivity-related features of racialised and ethnic applicants in the labour context and on the financial solvency of tenants in the housing context.
‘Hiring discrimination and ethnic penalties’ with Dries Lens¹.
No abstract yet.
‘Age bias in ChatGPT’s vs. recruiters’ assessment of resumes: The role of job age-type’ with Maaike Schellaert¹, Janneke K. Oostrom, and Eva Derous. (Proposal accepted, Journal of Business and Psychology)
Artificial intelligence (AI), including large language models such as ChatGPT, combined with an aging workforce, poses challenges for organizations in designing selection procedures that are both efficient and fair. While AI can streamline tasks such as resume screening, it also raises concerns about potential biases embedded in the training data. Drawing on impression formation theories and prototype matching theory, two experimental studies (Study 1: within-subjects; Study 2: between-subjects) investigated the effects of applicant age, applicant gender, and job age-type on job suitability ratings, comparing evaluations by HR professionals with those generated by ChatGPT. Results from both studies showed that HR professionals rated older applicants less favorably than equally qualified younger applicants, with job age-type moderating this effect only in the within-subjects design. Notably, ChatGPT model updates appear to have shifted bias patterns, with the latest version (ChatGPT-5) showing age bias only in the between-subjects design and no interaction between applicant age and job age-type in either the between- or within-subjects design, the latter in contrast to earlier versions. Moreover, in both studies, ChatGPT-5 was significantly less biased than HR professionals, emphasizing its promise for fairer resume screening and the need to further investigate the responsible use of generative AI in selection contexts.
‘Measuring and remedying hiring bias in large language models’ with Stefanie Sprong, Pieter-Paul Verhaeghe, and Valentina Di Stasio.
No abstract yet.
¹ First author.
This page was last updated on 19 January 2026.