It may be obvious to many that an association between two factors does not necessarily imply causation. We are frequently exposed to scientific reports that factor X is associated with disease Y, which are then altered to ‘X plays a key role in Y’, and we are led to believe that X causes Y. We also find reports on social media of individual exposures that are believed to have caused disease simply because the two occurred at around the same time. Such reports have fuelled false claims that novel associations are themselves causes of disease.
For example, a positive relationship between the amount of damage caused by fires and the number of firemen at the scene does not mean that sending more firemen to a fire causes more damage. Also, high coffee consumption may be associated with a decreased risk of skin cancer, probably because high coffee consumption is associated with indoor lifestyles and activities, and therefore less exposure to the sun. But coffee itself does not protect against skin cancer; the association is not causal.
Food has been a prime candidate in the search for causes and cures of cancer as far back as the 18th century. In an innovative study, Schoenfeld and Ioannidis addressed the question “Is everything we eat associated with cancer?” The authors selected 50 common and familiar ingredients from random recipes in a popular cookbook, and queried them on PubMed for an association with cancer risk. For 40 of these ingredients they found articles whose authors claimed evidence of either an increased or a decreased risk of cancer. Some ingredients had claimed effects in both directions, and many of the reported effects were implausibly large and tended to shrink in meta-analyses.
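The multiple-testing problem behind such findings can be sketched with a short simulation. Under the null hypothesis of no real effect, a p-value is uniformly distributed, so each test still has a 5% chance of crossing the conventional 0.05 threshold by chance alone; screening 50 ingredients therefore yields a couple of spurious ‘associations’ on average. This is an illustrative sketch, not a reconstruction of the Schoenfeld and Ioannidis analysis:

```python
import random

def false_positives(n_tests: int, alpha: float = 0.05, rng: random.Random = None) -> int:
    """Count null-hypothesis tests that look 'significant' by chance.

    Under the null, each p-value is uniform on [0, 1], so each test
    falls below alpha with probability alpha.
    """
    rng = rng or random.Random()
    return sum(1 for _ in range(n_tests) if rng.random() < alpha)

rng = random.Random(42)
n_ingredients = 50   # mirrors the 50 cookbook ingredients
trials = 10_000      # repeat the hypothetical 'screening study' many times
avg = sum(false_positives(n_ingredients, rng=rng) for _ in range(trials)) / trials
print(f"Average spurious 'associations' per 50 null tests: {avg:.2f}")
```

The long-run average lands near 50 × 0.05 = 2.5 false positives per screen, even though no ingredient has any real effect.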
So how do we decide whether an observed association is evidence for causation or not? Students of epidemiology or public health are taught to differentiate between association and causation, but may be tempted to exaggerate the implications of association studies when they enter the ‘publish or perish’ world of academic research. An inherent weakness of observational association studies is that their findings may not be corroborated by experimental studies. Inferring causation from a single association study may therefore be misleading, and could potentially cause harm to the public. This is a major reason why preliminary results from association studies should be interpreted with caution and, if publicized, should be carefully presented, keeping in mind the aims of the study and the ‘real world’ implications as opposed to statistical significance alone.
“All scientific work is incomplete – whether it be observational or experimental. All scientific work is liable to be upset or modified by advancing knowledge. That does not confer upon us a freedom to ignore the knowledge we already have, or to postpone the action that it appears to demand at a given time.” (Sir Austin Bradford Hill 1965)
In 1965, at a meeting of the Royal Society of Medicine, Sir Austin Bradford Hill outlined nine tenets to consider when deciding whether causation was a factor in an observed association. He clearly stated that he had “no wish, nor the skill, to embark upon a philosophical discussion of the meaning of ‘causation'”. The starting point in assessing a causal relationship is generally an observation of an association or correlation between an exposure and an outcome that may or may not be attributed to the play of chance. These tenets are as follows:
- Strength of association. This refers to the magnitude of the effect of the exposure on the disease compared to its absence, often called the effect size. It is typically reported as an odds ratio with a confidence interval and p-value. These measures should be considered together when deciding how strong, and how real, an association is.
- Consistency. The association persists even when other factors change, e.g. across different times, places, ethnic groups, age groups, genders, etc.
- Specificity. The causal factor is quite specific to the outcome. For example, if coal mine workers exposed to coal dust develop black lung disease, whereas those not exposed to coal dust do not, then coal dust specifically causes black lung disease. Working in a coal mine may not be the causal factor, as a person may be exposed to coal dust outside a coal mine. In an association study, it is important to isolate what is specific to the disease process to determine if the association is causal.
- Temporality. This answers the question: Which came first? You would expect that if an exposure causes a disease, then the exposure should necessarily precede the disease development.
- Dose-response or Gradient. Evidence that an exposure causes a disease may be related to a certain quantity or dose of the exposure, in which case you may see varying degrees of disease depending on the extent of the exposure.
- Biological Plausibility. Plausibility asks the question: Could the observed results fit into a biological theory, if one existed? This helps argue for causation, but it is not absolutely necessary, as the understanding of the disease biology may still be immature.
- Coherence. Assessment of causation should not conflict with existing knowledge of disease biology. Coherence asks the question: If the association is indeed causal, would it fit into an existing biological theory? The difference between ‘plausibility’ and ‘coherence’ is subtle. Coherence assumes there is an existing biological theory, and rejects the result if it does not fit into that theory, while plausibility at least allows for it in the absence of mature science.
- Experimental evidence. This has also been called ‘challenge–dechallenge–rechallenge,’ meaning, if we prevent the exposure, is it likely to prevent disease, and if we re-introduce the exposure, does the risk of disease return? The best experimental evidence for causation comes from randomized controlled trials, although in some circumstances this may be unethical.
- Analogy. Causation by analogy implies that if an exposure is known to cause disease, then it is highly likely that a similar exposure under similar circumstances will also cause disease.
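The summary measures mentioned under ‘strength of association’ can be computed from a simple 2×2 table of exposure versus disease. The sketch below uses hypothetical counts chosen purely for illustration, and the standard large-sample formula for the confidence interval of a log odds ratio:

```python
from math import log, sqrt, exp, erfc

def odds_ratio_summary(a: int, b: int, c: int, d: int):
    """Summarize a 2x2 table: a = exposed & diseased, b = exposed & healthy,
    c = unexposed & diseased, d = unexposed & healthy."""
    or_ = (a * d) / (b * c)
    se = sqrt(1/a + 1/b + 1/c + 1/d)               # standard error of log(OR)
    lo = exp(log(or_) - 1.96 * se)                 # 95% confidence interval
    hi = exp(log(or_) + 1.96 * se)
    z = log(or_) / se
    p = erfc(abs(z) / sqrt(2))                     # two-sided p-value (normal approx.)
    return or_, (lo, hi), p

# Hypothetical counts for illustration only
or_, ci, p = odds_ratio_summary(a=40, b=60, c=20, d=80)
print(f"OR = {or_:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), p = {p:.4f}")
```

Reading these together, as the tenet suggests: an odds ratio well above 1, a confidence interval that excludes 1, and a small p-value jointly indicate a strong association, though, as the rest of this article argues, still not causation.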
The Bradford Hill tenets are not meant to be a checklist for assessing causation, nor are they intended to be adhered to pedantically, but should serve as a guide when evaluating whether an exposure might be causally linked to a disease. It is unlikely that any single association study will satisfy all criteria for causation, but any given study may address some of the nine tenets, or none at all.
In fact, inferring causality may not require association studies or even a significant p-value. A classic example of this was the drug thalidomide, approved in Europe in 1957 for combating morning sickness among pregnant women. Subsequently, an explosion in the incidence of neonatal deaths and congenital birth defects, of a type that can only be described as horrific and extremely rare, occurred almost simultaneously in 46 countries where this drug was approved¹. Clearly, any study attempting to associate thalidomide with birth defects would be unethical. Also, the extremely low prevalence of this type of birth defect in the general population, coupled with the striking increase in its prevalence in countries where thalidomide was prescribed, requires no statistical measure of association to infer causation.
The Bradford Hill tenets are neither irrelevant nor outdated, and they provide useful principles for establishing causation. With new technologies and advances, various scientific disciplines may contribute to a better overall understanding of the disease process that can enhance the application of these criteria, and provide a stronger argument for or against causation.
In the absence of indisputable and compelling evidence that an association is causal, it is important to consider the entire body of knowledge and think through all sources of evidence when determining whether an exposure causes an outcome or disease. As a well-respected statistics professor of mine frequently reminded us, causality is a ‘thinking person’s business’, i.e. don’t let your computer or statistics program, or for that matter, anecdotal or biased reports, decide on the evidence.
As researchers we experience a eureka moment when the output of our statistics program generates a p-value with many zeros after the decimal point, i.e. a negligible probability that our finding was due to chance. We then become crestfallen when a validation study generates a p-value suggesting that our earlier result was well and truly a chance finding. This does not mean that the study should be filed away in the endless repository of unpublished science, nor should it be spun into something it is not. Negative results, assuming they were based on sound methodology, are not failed research, but are an important part of ultimately assessing causality and obtaining definitive answers to research questions, as highlighted in my previous blog on systematic reviews.
In a future article I will go into more detail on some of the common statistical measures that are reported in the scientific literature and their implications for ‘real world’ evidence.
¹More on the Thalidomide tragedy can be found in the book ‘Suffer the Children: The Story of Thalidomide’ that chronicles this disaster.
At SugarApple Communications we can help you find the best way to communicate with your intended audience and assist with writing, editing and statistics. Get in touch today and let’s talk.