Psychology as a Science
Today’s lecture aims to provide you with information about the lab report and some of the motivations behind its design.
Today’s lecture is in two parts:
Part I
Part II
Replication and Reproducibility? What’s the difference?
Reproducibility refers to the idea of taking a dataset (which another researcher may have collected) and running the same analysis as that researcher and getting the same results.
This might sound like it’s trivial, but it turns out that it isn’t! One of the reasons you’re learning R and Quarto in this course is so that you can learn how to do reproducible science.
Replicability refers to the idea of taking the methods (research design, stimuli, etc) from a previously run study, re-running the study, and getting the same results.
The lecture will mainly focus on replicability/replication, but I’ll also touch on reproducibility.
The spectre of failed replications.
Several large-scale replication attempts have shown that many classic findings in the psychology literature cannot be replicated
Some estimates suggest that more than 50% of findings aren’t replicable
This has prompted some to claim that psychology is in a state of crisis!
There are likely to be several causes of this crisis. These might include:
How statistics and statistical procedures are used and abused in psychology
Bias in which studies get published and which do not
The typical use of small sample sizes in psychology
Lack of clearly defined theories in psychological science
These causes probably aren’t independent but are likely to be interconnected and related to each other.
When we designed the psychology methods courses at Sussex, many of these issues were at the forefront of our minds.
In this lecture, I’ll focus on the causes that are most relevant in motivating the design of the lab report.
If we look specifically at the psychology literature we’ll notice something odd
The vast majority of published papers in psychology journals report findings that support the tested hypothesis
But how is this possible?
One source of bias in publishing of psychology studies is that journal editors and peer reviewers might not want to publish studies when they don’t support the tested hypotheses
This might especially be the case when new studies don’t show support for a famous or influential theory
Editors/reviewers might be more likely to suspect there’s some kind of a problem with the new study
Researchers might also choose not to submit studies for publication if they don’t support the tested hypothesis
It is very easy for researchers to engage in certain practices that invalidate their results
These practices make it so that researchers are more likely to find results that support a tested theory even if that theory isn’t true
Some examples include:
Running a statistical test, looking at the result, collecting more data, re-running the statistical test… rinse and repeat until you find the desired result
Collecting data under many different conditions and only reporting the conditions that produce the desired result
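To get a feel for why the first of these practices is a problem, here is a minimal simulation sketch. It is written in Python using only the standard library (the practicals use R, so treat this purely as an illustration). It assumes a true effect of exactly zero, a simple z-test with a known standard deviation of 1, and hypothetical settings (start with 20 participants, add 10 at a time, stop at 100). None of these numbers come from a real study.

```python
import random
from statistics import NormalDist, fmean


def p_value(sample):
    """Two-sided z-test of 'mean = 0', assuming a known SD of 1."""
    z = fmean(sample) * len(sample) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def run_study(rng, peek, start_n=20, step=10, max_n=100):
    """Simulate one study with NO true effect.

    Returns True if the study ends with p < .05 (a false positive).
    All sample-size settings are hypothetical choices for illustration.
    """
    data = [rng.gauss(0, 1) for _ in range(start_n)]
    if not peek:
        # Honest version: test once, at the planned sample size.
        return p_value(data) < 0.05
    while True:
        # Peeking version: test, and stop as soon as p < .05.
        if p_value(data) < 0.05:
            return True
        if len(data) >= max_n:
            return False
        # Otherwise collect more data and test again.
        data.extend(rng.gauss(0, 1) for _ in range(step))


def false_positive_rate(peek, n_studies=2000, seed=1):
    """Proportion of simulated null studies that reach p < .05."""
    rng = random.Random(seed)
    hits = sum(run_study(rng, peek) for _ in range(n_studies))
    return hits / n_studies


fixed_rate = false_positive_rate(peek=False)
peeking_rate = false_positive_rate(peek=True)
print(f"Test once at a fixed sample size: {fixed_rate:.3f}")
print(f"Test, add data, re-test:          {peeking_rate:.3f}")
```

Under these assumptions, the fixed-sample rate should land close to the nominal 5%, while the test-and-top-up rate should come out clearly higher, even though there is no real effect to find.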
But if these are problems, then what is the solution?
One solution that has been proposed is pre-registration
The idea of pre-registration has been covered in popular media. For example, it’s been written about in The Guardian on several occasions (see the handout for more details)
Pre-registration can get around publication bias by allowing editors and reviewers to judge whether a study is likely to produce reliable results before the results are known
Pre-registration can also get around certain kinds of experimenter and statistical biases by making researchers specify their statistical and study methods in advance
Pre-registration means that before conducting a study, researchers plan their study in detail
This means they can’t change their hypothesis to make it fit whatever their data happened to show (think about falsification and infinitely flexible theories!)
They can’t cherry-pick their data or engage in subtle procedures to make the data fit their hypotheses
Because researchers outline their plans in detail, reviewers can judge
Whether the methods are scientifically rigorous
Whether the study is likely to produce clear (rather than ambiguous) results
And they have to do this all before seeing the results, which might otherwise bias their decision
In a special form of pre-registration known as a registered report, a journal actually agrees to publish a study before the data are collected.
This is possible because the pre-registration plan gives enough detail for editors/ reviewers to judge whether the study is scientifically sound
To see how a registered report works in practice I’ll take you through an example from my own research…
In 2003 a paper was published claiming to show that merely looking at numbers would cause a shift in attention to either the left or right side of space.
This finding was very influential, with more than 700 subsequent studies citing it or building on it.
Some published studies tried to replicate it. Most showed successful replications and very few failed replications.
If you spoke to people at scientific conferences then many researchers would tell you that they couldn’t successfully replicate the effect…
But this wasn’t reflected in the scientific literature where most published papers on the effect showed that it could be replicated and where scientists continued to cite the original finding believing it to be true
The original finding was published in an extremely prestigious journal (Nature Neuroscience) and it quickly became influential…
This means it probably got accepted as something like an established fact
Once a finding is accepted as an established fact then journal editors and reviewers might be reluctant to publish studies that don’t support the original finding…
Note that it can sometimes, but not always, be very reasonable to not believe the results of a new study…
For example, if I did a study that showed that gravity doesn’t exist, then what is more likely?
That gravity doesn’t exist, or that my study is wrong?
But for other examples, it might be that the established theory is wrong
It’s best to try and judge studies based on their methods rather than being influenced by what the results are
In 2017 I put together a registered report that involved a replication attempt of the original 2003 attentional cuing finding and some additional experiments to attempt to understand the mechanism that produced the effect (that is, if I could replicate it!)
I then approached a journal with this plan to see if they were willing to publish the study if I did it according to the plan
The plan was sent out for review to be checked and then the journal agreed that they would publish it if I did it according to the plan
I then gathered together 30+ psychological scientists from 17 universities around the world and we ran the experiment on over 1300 participants (nearly 100 times the original sample size!)
We found absolutely no evidence for the original finding…
No evidence that the additional manipulations modulate the size of the effect…
Now scientists can move on from this finding, but a lot of resources have already been wasted studying it
This finding is not a unique case! There are likely many zombie findings in psychology
This lecture is primarily about the replication crisis but there might also be a reproducibility crisis on the way!
Reproducibility is one of the reasons you’re learning about R, RStudio and Quarto in the practical sessions.
We can say a study is reproducible if:
We can take a dataset (from a published journal article) and re-run the analysis described in that journal article and get the same numbers
It’s difficult to test because researchers don’t typically share their data (so you can’t re-analyze it)
But data sharing is becoming more common, which means we might be able to test it!
In 2019, the journal Psychological Science published an issue where all 14 papers shared their data!
So we decided to see whether we could re-analyze the data and get the same numbers as the published papers
Of the 14, we found that only 1 was exactly reproducible
For 3 we could reproduce the numbers with only minor differences
For a further 6 we could reproduce some but not all of the analyses
That leaves 4 where we could only reproduce a fraction of the results or could not reproduce any results at all!
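As a rough illustration of what checking for "the same numbers" can look like, here is a minimal sketch in Python (standard library only). It is not the procedure used in the audit described above: the scores, the "reported" values, the rounding to 2 decimal places and the 0.05 tolerance are all hypothetical choices made up for this example.

```python
from statistics import fmean, stdev


def classify(reported, recomputed, decimals=2, minor_tolerance=0.05):
    """Compare a reported statistic with a freshly recomputed one.

    'decimals' and 'minor_tolerance' are illustrative assumptions,
    not an established standard.
    """
    if round(recomputed, decimals) == round(reported, decimals):
        return "exact"
    if abs(recomputed - reported) <= minor_tolerance:
        return "minor difference"
    return "not reproduced"


# Hypothetical shared dataset (made up for this example).
shared_scores = [4.1, 5.3, 3.8, 6.0, 5.2, 4.7]

# Hypothetical values as they might appear in a published paper.
reported = {"n": 7, "mean": 4.85, "sd": 0.81}

# Re-run the described analysis on the shared data.
recomputed = {
    "n": len(shared_scores),
    "mean": fmean(shared_scores),
    "sd": stdev(shared_scores),  # sample SD (n - 1 in the denominator)
}

outcome = {name: classify(reported[name], recomputed[name]) for name in reported}
for name, result in outcome.items():
    print(f"{name}: reported {reported[name]}, "
          f"recomputed {recomputed[name]:.2f} -> {result}")
```

In this made-up case the mean matches exactly, the standard deviation differs only slightly after rounding, and the sample size does not match the shared data at all, which mirrors the kinds of outcomes listed above.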
So what went wrong?
Some researchers didn’t share the correct/appropriate data
Key parts of the data were missing
Data weren’t appropriately labeled
We want you to be better, so as part of your research methods courses you’ll also learn how to organize data appropriately
But the major issue was that often the analyses weren’t appropriately described. But why?
One reason might be that it’s difficult to describe analyses
This is especially true when analyses are complex
Code to improve reproducibility
Instead of only verbally describing analyses, researchers can include code with the shared data
But then psychological scientists need to know how to write code!
Unfortunately, training in coding is still not typical in undergraduate psychology programs!
But things are changing, and Sussex (together with e.g., Cambridge, KCL, QUB, Edinburgh, Glasgow) is one of the universities changing this!
In fact, in our audit the 1 paper that was exactly reproducible was reproducible because
They shared the R code
The manuscript was written using Quarto
Over to Dr. Terry…
The lab report is designed to be part of your training to do better science by introducing you to the idea of pre-registration!
The lab report will present a research plan for an experiment
The expected length will be around 1000–1500 words (with a maximum allowable length of 2000 words)
The research plan will address one of two questions
Is buying “green” (i.e., environmentally friendly) products driven by status motives?
Do women find men more attractive in conjunction with the colour red?
Links to two studies that have addressed these questions can be found on Canvas
The lab report will have a similar structure to a journal article, but without a results section
Introduction
Methods
Discussion (strengths and limitations only)
References
Your introduction should include the following information:
Thesis statement — What is the main research question/area you are considering? Think of this as an introduction to your introduction! Tell us broadly what the topic area is and why it’s important.
Background — What is the context for your research question, and what do we already know?
Introduce important previous research including previous ideas/theories/hypotheses that were tested
How did previous studies investigate these questions? Briefly explain the experimental designs (participants, methods, materials) that have been previously used
Critically evaluate whether these were appropriate or sufficient to address the research question. You should cite additional evidence to support any claims you make
Suggest a new experiment
Hypotheses — based on this background, what do you expect to happen in your experiment?
Given the questions and/or issues identified in your intro, propose a new experiment. E.g., you might suggest:
A new experimental design or paradigm. E.g., different test design, measuring the outcome in a different way, or changing the way the conditions are presented
A new variable or group manipulation to include
A new population to test. You might suggest that studying or contrasting with a particular population may be able to provide further insight into the effect
You must propose at least one improvement/modification to previous experiments. You can suggest several at once, but try not to over-complicate it.
You must justify your modification with previous literature
You should have a good, evidence-based reason to believe that the modification you are proposing would be an interesting and meaningful improvement on the previous design
Explain your reasoning clearly and thoroughly, so try to avoid modifications if you don’t really understand them (e.g., brain imaging)
Participants — Who will take part in the research?
From which population will you draw your sample?
You don’t need to specify exactly how many participants you’ll include. Instead, give the characteristics of the participants you’ll recruit
Materials — what kind of tests will be administered, and how do they work?
Design — What variables will be included? Will it be a between-groups or a within-participants design?
Specify your dependent and independent variables
Consider confounds and controls
Procedure — What instructions will be given to participants, what will participants do, and will the tasks be administered in a specific order?
What are the strengths of your design? For example, will it be able to tell you something about causation?
What will the results not be able to tell you about your research question? Why?
Will this study need a follow-up study? Why?
There is a tendency to over-complicate things!
Don’t suggest changes that you don’t fully understand
Focus on doing the simple things well
Suggesting a complex experimental design is not impressive if you get it all wrong
But don’t go the other way and suggest no changes at all!
And don’t worry if you find the lab report difficult. Everyone will find it difficult!
For most, this will be your first experience doing something like this, but you’ll only learn how to do it by doing it!
You’ll be discussing the report more in practical classes this week, so make sure you go to them!