There is a replication crisis in the social sciences. How much can we trust the research findings published in psychological journals? We can’t be certain about the exact answer to this question, and perceptions might vary depending on who you ask, but there is a growing consensus that within psychology, many of the published findings are not as trustworthy as they should be. For example, in a recent survey of scientists in Nature, 80% of researchers agreed that there was either a slight or significant crisis in the field (whatever a slight crisis is!?).
Why do people doubt the veracity of psychological research and think the field is in a crisis? There are multiple reasons. First, there have been repeated failures to replicate many published findings in psychology (Camerer et al., 2018; Open Science Collaboration, 2015). In these efforts, multiple labs set out to replicate the methods of hundreds of published studies and retested these experiments in a new (and typically much larger) sample of participants. As an observer, what you might reasonably expect to see here is that the majority of findings would demonstrate the same effects as were reported in the original study and, due to chance, a small number of these studies wouldn’t replicate. But the rather earth-shattering outcome of these replication efforts is that only a minority of studies actually replicated the original research findings. Notably, several high-profile effects have been brought into question. For example, the concept of power posing was promoted in a TED talk by Amy Cuddy that has been viewed more than 60 million times, yet the research that forms the basis of this effect does not always replicate. Social priming research has emphasised how subtle cues in the environment can affect our behaviour, often in dramatic ways. For example, unscrambling sentences that include concepts related to elderly people can make us walk slower, and holding a hot drink can make us think more warmly of others. This research has featured in many bestselling pop psychology books, such as Daniel Kahneman’s Thinking, Fast and Slow, yet social priming is another area that has major replication issues.
Thus, the low replicability rate in psychology presents a challenge to the credibility of the field and has resulted in much attention being directed to understanding both the causes of this replication “crisis” and how researchers can reform practices to increase the replicability, robustness, and credibility of the research.
The cause of the replication crisis: questionable research practices. So what underpins this credibility crisis? Why are many original effects not as trustworthy as they should be? Well, it turns out that many of the norms of the research process favoured the publication of false-positive findings (in other words, reporting effects when no such effect exists). Among the most basic principles that guide scientists is a commitment to objectivity, honesty, and openness (Committee on Science & Panel on Scientific Responsibility and the Conduct of Research, 1992; p.36). We would expect these principles to guide best scientific practice during the design of experiments, analysis of data, and communication of knowledge. In other words, researchers should, in practice, be as honest and as objective as possible when designing and reporting their findings. However, individual biases, norms, and systemic pressures can cause these principles to be corrupted during the research process.
For example, researchers are primarily incentivised for publications, yet the probability of a paper being accepted for publication in a journal is traditionally associated with the novelty (is it new and surprising) and statistical significance of the reported effects. The consequence of this publication bias is that researchers can be (implicitly or explicitly) motivated to exploit degrees of freedom in the way data is collected, analysed, and reported, that favour producing a publication with statistically significant (as opposed to null) effects (Simmons et al., 2011).
Questionable Research Practices (QRPs) is a term used to refer to the broad range of decisions in the design, analysis, and reporting of research that all increase the likelihood of achieving a statistically significant result, and therefore a positive response from journal editors and reviewers; for example, p-hacking, cherry picking, and Hypothesizing After the Results are Known (HARKing). P-hacking refers to researchers exploiting flexibility in the analysis; for example, by running statistical analyses with and without covariates, or deciding whether to include or exclude outliers on the basis of their impact on the effect of interest (Simonsohn et al., 2014; Wicherts et al., 2016). Cherry picking refers to the practice of failing to disclose all the dependent variables and conditions in a study and instead only reporting the outcome measures and/or conditions that reach statistical significance. HARKing refers to the practice of presenting findings that were unexpected or unplanned as if they had been originally predicted and were tested using confirmatory hypothesis testing (Kerr, 1998). In summary, one explanation for the low replication rates is that many original findings are based, at least partly, on studies that involve the use of questionable research practices.
Widespread prevalence of QRPs in the West. To what extent are researchers actually engaging in QRPs? To answer this question, John, Loewenstein and Prelec asked 2000 psychologists in 2012 about their involvement in QRPs. Rather remarkably, a high percentage of US psychologists openly admitted to having engaged in QRPs, and they believed that their colleagues engaged in them even more often than they themselves admitted to!
Similar admissions of QRP use have been reported in an Italian sample of psychologists (Agnoli et al., 2017), and across other fields such as education (Makel et al., 2019) and ecology and evolution (Fraser et al., 2018).
When researchers flexibly use combinations of these QRPs, the likelihood of being able to find a statistically significant effect can dramatically increase, irrespective of whether there was actually an effect in what you were testing. For an illustration of how easy it is to find significant effects with subtle (and often justifiable) analysis decisions see: https://fivethirtyeight.com/features/science-isnt-broken/#part4
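To make the point above concrete, here is a minimal simulation sketch in Python. It is my own illustration and is not taken from any of the studies cited here. It assumes two groups drawn from the same population (so there is no true effect), a simple z-test with a known standard deviation of 1, and two hypothetical QRPs: measuring two outcomes and reporting whichever one "works", and adding more participants when the first look at the data is not significant. All function names and parameter values are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05


def z_test_p(a, b):
    """Two-sided p-value for a difference in means (equal n, known SD of 1)."""
    n = len(a)
    z = (fmean(a) - fmean(b)) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def draw(n, rng):
    """Draw n scores from a standard normal: both groups share one population."""
    return [rng.gauss(0, 1) for _ in range(n)]


def honest_study(rng, n=20):
    """One outcome, one test, sample size fixed in advance."""
    return z_test_p(draw(n, rng), draw(n, rng)) < ALPHA


def flexible_study(rng, n=20, extra=10, n_outcomes=2):
    """Two outcomes plus optional stopping: report anything that reaches p < .05."""
    outcomes = [(draw(n, rng), draw(n, rng)) for _ in range(n_outcomes)]
    if any(z_test_p(a, b) < ALPHA for a, b in outcomes):
        return True
    # Not significant yet, so collect more participants and test again.
    outcomes = [(a + draw(extra, rng), b + draw(extra, rng)) for a, b in outcomes]
    return any(z_test_p(a, b) < ALPHA for a, b in outcomes)


def false_positive_rate(study, sims=5000, seed=1):
    """Share of simulated null studies that end up 'significant'."""
    rng = random.Random(seed)
    return sum(study(rng) for _ in range(sims)) / sims


if __name__ == "__main__":
    print("honest:", false_positive_rate(honest_study))
    print("flexible:", false_positive_rate(flexible_study))
```

Under these assumptions the honest study should produce false positives at close to the nominal 5% rate, while the flexible study should do so noticeably more often, even though each individual decision might look defensible on its own.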
The credibility revolution and reforms to research practices. So what is being done to improve the way research is conducted? In response to the replication crisis and growing awareness of the problematic nature of QRPs, scientists have proposed several reforms to research practices aimed at improving the transparency, and therefore the credibility, of research. These reforms include practices designed to increase transparency, namely (1) preregistration and (2) open data, and a reform to improve research design: (3) power analyses.
1. Preregistration refers to the practice of explicitly stating one’s hypotheses and analysis plans prior to data collection. For example, a researcher would make clear what their primary outcome variable is, how they will define outliers, when they will stop collecting data, etc. Specifying these details in advance reduces researcher degrees of freedom in the analysis and makes it clear to other researchers what parts of your results are based on confirmatory hypothesis testing versus exploratory tests. A more advanced form of preregistration – which is becoming increasingly common – is to submit your introduction and proposed methods and planned analysis directly to a journal for peer review. In a registered report, the manuscript is evaluated on the basis of the introduction, methods, planned analysis (everything but the results!). If accepted, researchers are given a guarantee that provided they conduct their research according to the prespecified methods, the journal will publish their research, irrespective of the results! There are two key benefits to this approach to research. First, researchers get peer review feedback before the research is conducted which can allow improvements to the study design. Second, it reduces publication bias and encourages objective reporting of results.
2. Open data is the practice of sharing primary research data and materials (e.g., stimuli, code, etc.) in a publicly accessible online repository. When psychology (and other fields) began there was no medium through which researchers could share their raw data alongside publications. This is no longer the case and there is a steady shift towards encouraging, or in many cases requiring, researchers to make their data available. Sharing data has multiple benefits for both the original researcher and for science more broadly. For the researcher, sharing data obviously makes it much more accessible for others, but a key benefit is that it also makes it accessible for your future self, if you want to return to the original data and conduct any reanalysis. Sharing data also makes us more accountable: we are more likely to avoid confirmation bias and to check our analysis and data rigorously when we are aware that others may check them too! Data sharing accelerates scientific knowledge by allowing other researchers to build on your findings, avoid duplicate data collection efforts, and include evidence in meta-analyses, and it also helps detect fraud or errors in reporting.
3. Power analysis refers to the practice of conducting a power calculation, before collecting any data, to estimate and justify the sample size required in order to detect a specified effect. This is important because running underpowered research designs is unethical, a waste of resources, and a central cause of untrustworthy research findings in the published literature. Thus, researchers need to design studies that are powered to detect the effects they expect to observe.
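As a rough illustration of what a power calculation involves, the sketch below estimates the sample size needed per group for a simple two-group comparison. It uses the standard normal approximation, n per group ≈ 2 × ((z for alpha + z for power) / d)², where d is the expected standardized effect size (Cohen's d). The function names are hypothetical, and the defaults of alpha = .05 and 80% power are common conventions that I am assuming here. Dedicated tools that work with the t-distribution will give slightly larger numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided, two-group test."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = _norm.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


def approx_power(d, n, alpha=0.05):
    """Approximate power with n per group (ignores the negligible opposite tail)."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    return _norm.cdf(d * sqrt(n / 2) - z_alpha)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {n_per_group(d)} participants per group")
    print(f"power with n = 20 per group, d = 0.5: {approx_power(0.5, 20):.2f}")
```

The main thing to notice is how quickly the required sample grows as the expected effect shrinks: under this approximation, halving the effect size roughly quadruples the number of participants needed.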
Adoption of Research Practice Reforms. These research practice reforms are designed to increase the transparency and robustness of the research process, thereby reducing opportunities for researchers to use QRPs, reducing the chance of false-positive findings, and ultimately improving the replicability of research. Within Western academia (North America, Europe, Australasia), there is a growing recognition of the problematic nature of QRPs and researchers are responding to this with lower (reported) use of QRPs and a willingness to adopt reformed research practices (Motyl et al., 2017). This increased engagement in research practice reforms (i.e., preregistration, data sharing, and power analyses) is driven by several factors. At one level, researchers are engaging with research practice reforms because they recognize that these practices are good for science and lead to more reliable, robust, and trustworthy research. At another level, research practice reforms are also driven by editorial policies, education, cultural norms, and incentives. For example, editorial policies now often require authors (as a condition of acceptance) to report a power analysis when justifying their sample size. Societies dedicated to discussing best research practices have emerged (e.g., Society for the Improvement of Psychological Science), and many psychology departments have open science clubs (e.g., ReproducibiliTea) that facilitate training and sharing knowledge of research practice reforms (e.g., where to go and how to write an effective preregistration). Journals are incentivising preregistration and data sharing (e.g., through the use of badges at Psychological Science), and many journals have adopted registered reports as a format to encourage researchers to submit preregistered research that can be offered in-principle acceptance prior to data collection. In summary, researchers are rapidly adopting reforms to their research practices to make science more transparent and credible.
A local challenge. Although many of the normative practices of researchers based in the West (i.e., North America, Europe and Australia) are reforming, few efforts have been made to examine how research practices are changing more locally in Thailand and South East Asia. To date, there is little evidence on how local researchers (1) perceive there to be an issue with the credibility of psychological research, (2) are motivated to adopt research practice reforms, and (3) face different barriers in their adoption of these reforms.
Concerning perceptions of credibility, there are far fewer open discussions around metascience in South East Asia, which may result in less awareness and recognition of some of the systemic issues in research practices. Similarly, there is little knowledge of the degree to which researchers in the region are motivated to adopt new practices. As an illustration of these combined challenges of having open discussion and understanding local attitudes, a recent study attempted to include the attitudes of members of the Asian Association of Social Psychology (AASP) in its international survey on the use of QRPs and research practice reforms. Unfortunately, AASP declined to distribute the survey to members “… out of fear that our survey would give its membership cues that questionable research practices are normative, increasing the likelihood that its members would use those practices” (Motyl et al. 2017, p. 38). Understanding the local barriers to research reforms is also an important endeavour because many of the proposed reforms have been developed by researchers who are based in Western regions, and focus on how these reforms apply within their own Western research contexts. Although the basic principles behind conducting and reporting robust research are universal, the practices that researchers adopt may differ across regions, and there are inequalities in the incentives and barriers experienced by researchers in different contexts. For example, the barriers to engaging in practices such as data sharing are reducing in the West through education, training, and access to resources. Likewise, the incentives (both tangible and intangible) for data sharing are growing, with transparent (i.e., open) research practices becoming a desirable (or required) criterion for academic employment in some departments and kudos attached to research (and researchers) that shares open data.
In contrast, far fewer training opportunities exist for raising awareness of these practices within the SE Asia region and there are limited incentives for their uptake. Thus, an important direction for future work is to focus on local challenges to research practice reforms in Thailand and South East Asia. In other words, what are the barriers that local researchers face when implementing research practices such as preregistration, sharing open data, etc.? In addition to the absence of incentives, there may exist many culturally grounded barriers to reform. As an example of how culture may affect reform, many of the research practice reforms in Western cultures have been driven by graduate students and early career researchers challenging the status quo (and often the approach of more senior academics). However, the relatively tighter cultures in South East Asia typically favour conforming to the values, norms, and behaviours adopted by others and sanctioning deviance and the promotion of reforms. Thus, there might exist much stronger resistance to the adoption of counternormative behaviour in the region.
In summary, the research practice reforms adopted by most psychologists mean that the field is producing higher quality and more replicable effects now compared to ten years ago. However, there remains much to do and there is currently a need for trying to understand the context-specific challenges perceived by researchers in the SE Asia region and to develop solutions to tackle barriers to engaging in best scientific practices.
To stay updated with issues related to open science in the region follow the South East Asian Network for Open Science (SEANOS) on twitter here: https://twitter.com/opensciencesea
There is evidence for low replicability across many other research fields; e.g., economics (Camerer et al., 2016; Ioannidis et al., 2017), pharmacology (Lancee et al., 2017).
Other examples of QRPs not stated here include: selectively reporting significant studies while leaving null-effect studies in the file drawer; conducting low-powered studies.
 This is not true across the entire region, i.e., open science is prominent in Indonesia.
By Dr. Harry Manley
About the Author.
Harry Manley is a lecturer in the Faculty of Psychology at Chulalongkorn University. Twitter @harrisonmanley