Selecting the best… from the best - Part I
In the 1970s, the success rate for the EMBO Long-Term Fellowship scheme (now called EMBO Postdoctoral Fellowships) was roughly 50%. It dropped to 23% in 2001 and to 16% in 2015. The number of applications to MSCA postdoctoral fellowships increased from about 7,400 in 2014 to nearly 11,500 in 2020; the 2025 call received close to 17,000 applications, and the success rate dropped below 10% for the first time. The success rate for applicants to the 2025 Human Frontier Science Program postdoctoral fellowships was 7.4%. Last year we witnessed success rates of about 2% for the European Innovation Council Pathfinder Grants and an astonishing 0.93% for the Gen AI for Africa call within the Horizon Europe framework. Even if we consider these latest figures anomalous, and in part a consequence of the recent turmoil in research funding in the United States, it is undeniable that the selection pressure for fellowships and grants has greatly increased over the past decades.
The decline in success rates results from an increase in the number of applicants combined with a non-equivalent growth in funding resources: more people, less money per capita. Between 1998 and 2017, the number of doctorates awarded per year in scientific disciplines in OECD countries nearly doubled. Gross domestic spending on R&D, by comparison, increased by only 18% in the two decades from 2000 to 2020. And this not only intensifies the competition for grants, fellowships, and tenured academic positions (now available to fewer than 20% of doctorate holders globally), but also increases the selection pressure for jobs in related non-academic sectors (pharma, scientific publishing, tech companies).
The Problem of Evaluating Scientists
1) So Many Scientists, So Few Grants
One can argue that a 50% success rate guaranteed funding for all the best candidates, and then some. Given the uncertainty associated with the scientific enterprise, being able to fund all deserving candidates, plus some who are not so clearly deserving on paper, is a relief to decision committees, funding agencies and the scientific community. It is nice to walk out of the meeting room knowing that nobody with a great application was rejected, and that you even gave an opportunity to applications that may positively surprise you. But we are now, in general, very far from this comfortable situation, and have been for quite some time, especially if we focus on highly prestigious grants, where success rates below 15% have been the rule for at least a decade.
These rates force evaluators to discard a significant fraction of the deserving candidates, and to take fewer (or no) risks on candidates who have strayed from a linear path of academic success. In practice, we must conclude that the person in position 101 of the ranking is worse than their fellow applicant in position 100, and that therefore only the latter will get the grant. And this despite knowing that there is no objective difference in quality between many of the awardees and many of the rejected candidates. Yet we continue pretending there is.
2) Subjectivity and Bias Translated into Decimal Points
The cut-off score for the 2025 call of the MSCA European Postdoctoral Fellowships for life sciences was 96.8 out of 100. This means that candidates with a score of 96.7 (or even some with 96.8) did not get the fellowship, which would not be a problem if we were ranking candidates according to their height, their weight or how far they can throw a pipette from the lab. There would be no possible counterargument: numbers are numbers. But we are ranking candidates, and this is true for essentially all scientific funding programs, according to unquantifiable concepts such as the “quality and pertinence” of the project to be developed, how “ambitious” it is, or the “quality” of the applicant. No wonder research, including my own study on applicants to the EMBO Long Term Fellowship Programme, has shown time and again that peer review, the usual selection mechanism, has limited utility.
While perfectly capable of distinguishing very good from very bad applicants, it fails at more granular decisions, where very good candidates must be picked from… also very good candidates.
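To make the point concrete, here is a minimal back-of-the-envelope simulation (a sketch, not drawn from any real fellowship data; the number of applicants, the reviewer noise level, and the 15% success rate are assumptions chosen purely for illustration). When the true differences among top applicants are small relative to the noise in their scores, two equally competent panels ranking the same pool end up funding noticeably different sets of people:

```python
import random

# Illustrative sketch: how reviewer noise scrambles rankings near a funding cut-off.
# All numbers below (pool size, noise level, 15% success rate) are assumptions.
random.seed(1)

N_APPLICANTS = 1000
N_FUNDED = 150        # ~15% success rate
TRUE_SPREAD = 1.0     # spread of "true quality" across applicants
REVIEW_NOISE = 0.5    # subjectivity / disagreement added by each evaluation

# Each applicant has a latent "true quality"; a panel only sees a noisy score.
true_quality = [random.gauss(0, TRUE_SPREAD) for _ in range(N_APPLICANTS)]

def funded_set(scores):
    """Return the indices of the top N_FUNDED applicants by panel score."""
    ranked = sorted(range(N_APPLICANTS), key=lambda i: scores[i], reverse=True)
    return set(ranked[:N_FUNDED])

# Two independent, equally competent panels score the same applicants.
panel_a = funded_set([q + random.gauss(0, REVIEW_NOISE) for q in true_quality])
panel_b = funded_set([q + random.gauss(0, REVIEW_NOISE) for q in true_quality])

overlap = len(panel_a & panel_b) / N_FUNDED
print(f"Agreement between the two panels: {overlap:.0%}")
# Typically well below 100%: applicants near the cut-off swap places freely.
```

Under these toy assumptions, the applicants who switch between the two funded sets are precisely the ones clustered around the cut-off, where a 96.7 and a 96.8 are, for all practical purposes, the same score.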
And the problem does not stop there: how do you evaluate a concept like quality in a researcher or a project? The obvious answer is to read the researcher’s scientific output, consult experts on the topic, consult the applicant’s colleagues, interview the candidates, devise some tests of scientific or other skills… You see the point, right? It would take a decade to evaluate the thousands of candidates applying in a single calendar year. But we have found a good solution for this: impact factors, h-indices, plain citation counts, number of papers… These are objective measurements. Maybe not of quality, but they are objective measurements of something, and we have been using them for quite some time.
It gets worse: when evaluators run out of rational, apparently objective differences between very close candidates (same number of papers, same impact factors, similarly ambitious projects, same 96.8 score), they still need to make a decision. Decision-making under uncertainty, as in this case, allows personal preferences and biases to take a prominent role in the process. It may be conscious or unconscious: your closeness to the topic, your empathy towards candidates who remind you of yourself, the linguistic ability of the candidate, their gender, their ethnic group, their PhD supervisor… More than 100 sources of cognitive bias have been described in humans.
I know, I know: OTHER humans, not you, of course.
Do we have a solution?
The answer is no, BUT…
To be continued: tune in for Part II next week…