“It is the theory that decides what we may observe”
—Einstein (as quoted by Heisenberg)
Many of the initiatives proposed to improve the social and life sciences focus on improving methodology and statistics. This is understandable: it is where errors are easily made (and discovered), and it allows for relatively simple interventions, e.g. more stringent control by journals on the appropriate use of statistics. However, we generate empirical facts because we ultimately want to find out which scientific claim about the structure of reality best explains why those facts were observed.
The quote attributed to Einstein refers to an important and grossly underestimated phenomenon one might call theoretical tunnel vision. It is best explained by an example that is commonly encountered in the literature in psychological science and goes something like this:
- A study tries to find independent causes (predictors) of a certain disease-entity, a pathological state or behavioural mode people can ‘get stuck in’.
- Typically, a statistical model fitted on a large, representative sample of individuals in which many different predictors were measured will yield associations between predictor and disease-entity that are significant but small (on average \(r \approx 0.3\), or \(\approx 9\%\) explained variance).
- Often, if other known (non-clinical) covariates are included in the model, or if the multivariate nature of the phenomenon is taken seriously by including repeated measurements and/or multiple dependent variables, these predictors will no longer explain any unique variance in the outcome measures.
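The pattern described in the list above can be reproduced with a small simulation. This is only a sketch under made-up assumptions: the variables `x` (a 'predictor'), `z` (a covariate) and `y` (the outcome), the path weight of 0.55 and the sample size are all hypothetical and not taken from any study. The outcome depends only on `z`, while `x` is merely correlated with `z`.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 100_000

# Hypothetical data-generating process (an assumption for illustration):
# the outcome y depends only on the covariate z; the predictor x is
# correlated with z but has no effect of its own on y.
z = rng.standard_normal(n)
x = 0.55 * z + np.sqrt(1 - 0.55**2) * rng.standard_normal(n)
y = 0.55 * z + np.sqrt(1 - 0.55**2) * rng.standard_normal(n)


def ols(outcome, *predictors):
    """Least-squares coefficients (intercept first) of outcome on predictors."""
    design = np.column_stack([np.ones_like(outcome), *predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef


r = np.corrcoef(x, y)[0, 1]       # zero-order correlation, about 0.55 * 0.55
b_alone = ols(y, x)[1]            # slope of x when it is the only predictor
b_adjusted = ols(y, x, z)[1]      # slope of x once the covariate z is included

print(f"r = {r:.3f}, explained variance r^2 = {r**2:.3f}")
print(f"slope of x alone:          {b_alone:.3f}")
print(f"slope of x adjusted for z: {b_adjusted:.3f}")
```

In this toy setup the zero-order association comes out close to \(r \approx 0.3\) (so \(r^2 \approx 0.09\)), yet the coefficient of `x` shrinks to roughly zero as soon as `z` enters the model, because `x` never explained any unique variance to begin with.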
Here’s an example: a ‘predictor’ study (Walker & Druss, 2015) that set out to find predictors of the persistence of Major Depressive Disorder (MDD) over the course of 10 years in a representative sample of 331 individuals who suffered from MDD 10 years earlier:
“Clinical variables in this analysis were not strongly associated with persistence of MDD over the course of 10 years. Comorbid generalized anxiety disorder, baseline depression severity, and taking a prescription for nerves, anxiety, or depression were significantly associated with persistent depression in the unadjusted logistic regression models, but the associations became non-significant when in the multivariate model. These findings are in contrast to the results from several other studies.”
The study concludes by discussing three factors that play a statistically significant role in the persistence of MDD (text between brackets not in original):
- “having two or more chronic medical conditions [in 1995-1996] contributes to experiencing depression ten years later. [2.89 more likely] However, only having one chronic medical condition did not increase the odds of being classified as having MDD in 2004–2006.”
- “days of activity limitation in 1995–1996 were significantly associated with a greater risk of depression ten years later, [2.19 more likely] independent of the number of chronic medical conditions a person had.”
- “Individuals who were in contact with family less than once a week [in 1995-1996] were more likely to have MDD in 2004–2006. [2.07 more likely] Likewise, people who were married were less likely to have persistent depression compared to those who have never married [never married 2.42 more likely]”
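The bracketed numbers are odds ratios, and how much 'more likely' an outcome becomes in terms of probability depends on the baseline. The sketch below illustrates this for the odds ratio of 2.89 reported above; the baseline probabilities are hypothetical values chosen for illustration and are not taken from the study, and `apply_odds_ratio` is a helper name introduced here.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Probability obtained after multiplying the baseline odds by odds_ratio."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


# Hypothetical baseline probabilities of persistent MDD (assumptions only).
for p0 in (0.05, 0.20, 0.50):
    p1 = apply_odds_ratio(p0, 2.89)
    print(f"baseline {p0:.2f} -> {p1:.2f} (probability ratio {p1 / p0:.2f})")
```

The ratio of probabilities is always smaller than the odds ratio, and the gap widens as the baseline probability grows, so an odds ratio of 2.89 should not be read as 'almost three times the risk' without knowing the baseline.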
So what’s wrong with these inferences? The study shows that some previous assumptions about the relevance of clinical predictors should be reconsidered, and it adds to the scientific record some facts about risk factors that might have eluded scientists, clinicians and health professionals. Let’s look at the main conclusion of the study. In addition to a plea for more attention for people with two or more chronic medical conditions, Walker & Druss (2015) end the article with:
“Future research should continue to examine the complex nature of the relationship between chronic medical disorders and comorbid psychiatric conditions. Addressing these conditions and strengthening social support systems could be important strategies for reduc[ing] the burden of depression.”
Here’s what is odd from the perspective of rigorous science:
If clinical predictors play no role in explaining why some people remain depressed for such long periods of time, why isn’t the main conclusion of the study that we must re-appraise the scientific theories that lay explanatory claim to the aetiology of MDD? It is from these theories that the diagnostic tools and the medical and psychological interventions to which these patients have been exposed were derived.
Even though the authors acknowledge, and indeed show, that the propagation of a pathological state like MDD over many years is a very complex multivariate phenomenon, their suggestion for future research is still based on an implicit assumption about causation that is extremely simple. The idea is that there is a chain of unique (efficient) causes, each contributing independently to the emergence and the persistence over time of the MDD state. The authors basically suggest that some component causes have to be added to the aetiology. The metaphor is that of a machine whose function as a whole equals the summed output of its constituent components. Should a component fail, it can be repaired or replaced, as long as the replacement performs the same function as the defective part, thereby restoring the function of the machine as a whole. This is why the authors suggest that strengthening social support systems could be an intervention to reduce the burden of depression: the absence of a partner or of visits by family members were predictors that explained some unique variance in the data on the persistence of MDD. On this view, restoring the defective social support component should obviously restore, or at least facilitate, the escape from the MDD state. Meanwhile, they seem to forget that they convincingly argued that MDD is a very complex phenomenon that cannot be dissected into neat, independent component causes.
Very much related to the previous point: the authors mention three important factors in the discussion and conclusion section. However, the results section contains another factor that was omitted, even though it is in fact the second most important predictor of the persistence of MDD:
“Women had 2.48 the odds of remaining depressed compared to men”
Why did they ignore this predictor in the discussion? This is speculation, but could it be that this factor is not mentioned because it would have to be considered a ‘deficient’ component, and suggesting any kind of ‘treatment’ intended to ‘repair’ it is of course beyond the realm of sane things to suggest? Nevertheless, it does seem rather important to figure out why women have roughly 2.5 times the odds of men of still being depressed after 10 years. Perhaps not considering gender to be a unique causal component in a chain of independent predictors might help. Instead, gender could be considered a complex aggregate or contextual variable that is associated with the dependent variable through a vast network of interdependent facts, events and states of affairs. An obvious factor of importance is that effect studies of medical interventions are mainly conducted on white, male, right-handed subjects aged 20 to 30 with above-average SES. Also, it is likely that, on average, mood is more variable over longer periods of time in women than in men due to fluctuations of hormone levels, but also due to antenatal and postnatal depression (World Health Organization, 2002). It does not seem unreasonable to suggest this poses extra challenges for women who want to escape the MDD state.
The analytical tool selected by the researchers (a generalized linear statistical model) restricts the kinds of associations we might observe in the data. In the present case all associations will, after transformation, be linear compositions of independent components.[1] One never reads this equally valid conclusion: “We conclude that the linear model is inadequate to describe the complexity of this phenomenon.” The reason is that the implicit assumptions about causality underlying scientific claims never enter the empirical cycle. They therefore escape falsification by the repeated application of the scientific method, even though those causality assumptions are themselves based on a scientific theory about the structure of reality that is in principle falsifiable.
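How the choice of model restricts what can be observed may be illustrated with a toy simulation. Everything in it is an assumption made for illustration: two hypothetical factors `x1` and `x2` influence the outcome `y` only through their product, i.e. they are interdependent rather than independent component causes. A model that adds up independent contributions then finds almost nothing, even though the outcome is largely determined by the two factors.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n = 50_000

# Hypothetical data: the outcome depends only on the interplay of x1 and x2.
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)
y = x1 * x2 + 0.5 * rng.standard_normal(n)


def r_squared(outcome, *predictors):
    """Proportion of variance in outcome explained by a least-squares fit."""
    design = np.column_stack([np.ones_like(outcome), *predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    residuals = outcome - design @ coef
    return 1 - residuals.var() / outcome.var()


r2_additive = r_squared(y, x1, x2)              # independent components only
r2_interdependent = r_squared(y, x1, x2, x1 * x2)  # allows x1 and x2 to interact

print(f"additive model:       R^2 = {r2_additive:.3f}")
print(f"with interdependence: R^2 = {r2_interdependent:.3f}")
```

Adding a product term is of course itself a very simple repair that stays within the linear framework; the sketch only shows that an analysis built on independent components can report 'no association' for factors that matter a great deal in combination.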
[1] Naturally, if one were to use mixed models, one could account for dependencies in the data, but the models would still be limited to linear associations.