Abstract
Clinical risk prediction models can support decision making in emergency medicine, but directing intervention towards high-risk patients may involve a flawed assumption. This concepts paper examines prognostic clinical risk prediction and specifically describes the potential impact of treatment effects in model development studies. Treatment effects may lead to models failing to achieve the aim of identifying the patients most likely to benefit from intervention, and may instead identify patients who are unlikely to benefit from intervention. The paper provides practical advice to help clinicians who wish to use clinical prediction scores to assist clinical judgement rather than dictate clinical decision making.
- clinical assessment
- research
- statistics
Using clinical risk models to predict outcomes: what are we predicting and why?
Clinical risk prediction uses patient and disease characteristics to estimate the probability of a current diagnosis (diagnostic), future outcome (prognostic) or treatment effect (therapeutic), and thus assist clinical decision making.1 This paper focuses on prognostic clinical risk prediction of a future outcome. Increasing availability of large data sets linking clinical features to outcome data is creating huge opportunities to develop clinical prediction models. Data analytical methods, such as machine learning, increase the sophistication of modelling techniques but may limit the clinician’s ability to interpret the risk that the model is predicting. If we want to use clinical risk prediction to improve care, we need to understand how prediction works and what it is predicting.
How can we use clinical risk prediction?
Clinical risk prediction typically uses a mathematical model to predict the risk of an adverse outcome based on clinical measurements or the presence of clinical features.1 Clinicians can then use the predicted risk to direct or assist decision making.2 The model can be simplified into a score, with a higher score predicting a higher risk of adverse outcome. Table 1 lists some commonly used scores in emergency medicine. Patients with higher risk are typically directed towards intervention (admission, investigation or specialist referral) or monitoring to determine the need for intervention, while those with lower risk may avoid intervention. This is presumably based on the assumption that intervention will reduce the risk of adverse outcome. However, as emergency care systems increasingly manage people with more severe frailty and multiple long-term conditions, we should question this assumption. If the risk of adverse outcome relates to underlying frailty or long-term conditions, then intervention directed at the acute pathology is unlikely to be effective. We therefore need clinical risk prediction to identify who will benefit from intervention, not just who is at risk of adverse outcome.
It may be argued that the role of emergency medicine is simply to identify risk of adverse outcome and leave others to determine whether or how the risk can be reduced. This rather impoverished view of the specialty neglects the harm associated with futile intervention, raising unrealistic expectations, loss of independence and social support following unnecessary admission, and the knock-on problems of access block and ED crowding.
How are clinical prediction models developed?
Understanding how clinical prediction models are developed can help us understand how they work. The process of developing a clinical prediction model is relatively simple and involves sampling the relevant patient group, recording potential predictor variables, and then recording outcomes over a specified follow-up period. Statistical methods are then used to identify the strongest independent predictors of outcome and combine them into a model.1
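As a purely illustrative sketch of these steps (the variable names, simulated data and library choices below are assumptions, not those of any published score), a cohort with candidate predictors and a binary outcome recorded over follow-up can be used to fit a logistic regression model whose discrimination is summarised with a c-statistic:

```python
# Minimal sketch of prognostic model development on simulated data.
# All variable names, distributions and coefficients are illustrative
# assumptions, not any published score.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 5000
cohort = pd.DataFrame({
    "age": rng.normal(70, 15, n),
    "respiratory_rate": rng.normal(20, 5, n),
    "systolic_bp": rng.normal(120, 20, n),
})
# Simulated 30-day adverse outcome, loosely driven by the predictors.
logit = (-8 + 0.05 * cohort["age"] + 0.1 * cohort["respiratory_rate"]
         - 0.01 * cohort["systolic_bp"])
cohort["adverse_outcome"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

predictors = ["age", "respiratory_rate", "systolic_bp"]
X_train, X_test, y_train, y_test = train_test_split(
    cohort[predictors], cohort["adverse_outcome"], test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# Discrimination (c-statistic) on held-out data.
c_stat = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"c-statistic: {c_stat:.2f}")
# Coefficients indicate each predictor's independent contribution.
print(dict(zip(predictors, model.coef_[0].round(3))))
```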
Measuring the association between predictors and outcome is simple in principle, but encounters a substantial practical problem: patients in the study receive treatment to reduce their risk of adverse outcome. If treatment is effective, those who benefited from it will avoid the adverse outcome, so the model will not learn to identify them as being at risk. This means that predictive performance for avoidable adverse outcome may be underestimated, important predictors for guiding treatment may be missed and the model may fail to predict cases that will benefit from treatment. Conversely, if adverse outcomes occurred despite treatment, the model may be developed to predict unavoidable adverse outcomes where treatment is futile.
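This mechanism can be shown with a toy simulation (all risks, treatment rates and effect sizes below are invented for illustration): if clinicians preferentially treat patients with a particular predictor, and treatment works, the observed association between that predictor and the adverse outcome is weaker than it would have been in an untreated population.

```python
# Toy simulation of how an effective treatment attenuates a predictor's
# apparent association with the adverse outcome. All risks, treatment
# rates and effect sizes are invented for illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 20_000
predictor = rng.random(n) < 0.3                # binary high-risk feature
base_risk = np.where(predictor, 0.20, 0.05)    # untreated risk of adverse outcome

# Clinicians preferentially treat patients with the predictor,
# and treatment halves the risk of the adverse outcome.
treated = rng.random(n) < np.where(predictor, 0.8, 0.1)
outcome = rng.random(n) < np.where(treated, base_risk * 0.5, base_risk)

# Fit the usual prognostic model to the (partly treated) cohort.
X = sm.add_constant(predictor.astype(float))
fit = sm.Logit(outcome.astype(float), X).fit(disp=0)
print(f"Observed odds ratio for the predictor: {np.exp(fit.params[1]):.2f}")

# Odds ratio that would be seen if nobody were treated, from the base risks.
or_untreated = (0.20 / 0.80) / (0.05 / 0.95)
print(f"Odds ratio with no treatment effect:   {or_untreated:.2f}")
```

In this contrived example the observed odds ratio is substantially smaller than the untreated odds ratio, simply because treatment averted outcomes in the very patients the predictor identifies.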
Box 1 describes examples where treatment effects may undermine the value of clinical prediction models for identifying patients who will benefit from intervention. Evidence that treatment effects undermine clinical prediction models may be limited but logic suggests that using clinical prediction scores developed on a treated population to guide treatment decisions must be problematic. Either treatment works, in which case treatment effects cause the problems outlined above, or it does not, in which case there is little rationale for treating those at highest risk.
Examples of the potential implications of treatment effects
Predictive accuracy may be underestimated
Uffen et al showed5 that the quick Sequential Organ Failure Assessment (qSOFA) Score had poor prediction of 30-day mortality (c-statistic 0.68) but patients with a qSOFA Score of 2 or more had higher odds of receiving aggressive fluid resuscitation (OR 8.8), early antibiotics (OR 8.5) or vasopressors (OR 17.3). If treatment prevented deaths in those with higher qSOFA Scores, then this may explain the poor prediction of death.
Important predictors may be missed
Cheong-See et al6 showed that hypertension did not predict adverse outcome in the Pre-eclampsia Integrated Estimate of Risk (PIERS) model for women with pre-eclampsia. The authors noted that women with hypertension are more likely to receive intervention, which may explain why hypertension did not predict adverse outcome.
Failure to predict cases that benefit from treatment
Researchers who developed the 4C mortality score for COVID-19 suggested that patients with a low score could be managed in the community.7 However, a subsequent editorial8 noted that low-risk patients were managed in hospital in the study, where they may have received interventions to reduce mortality. So it should not be inferred that mortality would be low if they were discharged.
Prediction of cases that may not benefit from treatment
The PRIEST clinical score was developed to predict a composite outcome of mortality or organ support in patients with suspected COVID-19.3 Patients with a higher National Early Warning Score (V.2), older age, male sex or lower performance score had a higher risk of adverse outcome. The intention was that the score would direct patients with a higher score to hospital admission or critical care referral. However, secondary analysis of the PRIEST cohort9 showed that older age, limited performance status and abnormal physiology were associated with increased recording of an early do not attempt cardiopulmonary resuscitation decision.
This issue is specific to prognostic risk prediction of future outcomes (the focus of this paper). In diagnostic risk prediction, the reference standard diagnosis is already present. However, developers and users of clinical risk prediction must be clear whether prediction is diagnostic or prognostic. For example, we could use the Emergency Department Assessment of Chest Pain Score (EDACS, see table 1) to diagnose an acute coronary syndrome or predict future adverse cardiac events, but using it to do both risks confusion.
How can we address this problem?
If we understand a clinical prediction model and the research used to develop it, then we can make better use of the model to improve patient care. The following examples show how we can examine the model or score to gain insights about how it works in practice.
How well does the model predict intervention?
Models often have better prediction for adverse events, such as mortality, than for interventions, such as critical care admission. For example, the PRIEST clinical score showed better prediction for mortality without organ support (c-statistic 0.83) than for organ support (c-statistic 0.68).3 There are various reasons why a model developed to predict adverse outcome (or a composite outcome) may not predict use of an intervention, but if the model has poor prediction for the intervention we intend to use, it seems inappropriate to use the model to direct patients towards that intervention.
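One simple check is to calculate the c-statistic of the same score separately for the adverse outcome and for the intervention. The sketch below assumes a cohort data set with a score column and binary columns for death and organ support; the file and column names are hypothetical, not the PRIEST data set.

```python
# Sketch of checking one score's discrimination separately for an adverse
# outcome and for an intervention. Column names are hypothetical.
import pandas as pd
from sklearn.metrics import roc_auc_score

def discrimination_by_endpoint(cohort: pd.DataFrame, score_col: str,
                               outcome_col: str, intervention_col: str) -> dict:
    """c-statistic of one risk score for an adverse outcome and for an intervention."""
    return {
        "outcome": roc_auc_score(cohort[outcome_col], cohort[score_col]),
        "intervention": roc_auc_score(cohort[intervention_col], cohort[score_col]),
    }

# Hypothetical usage, assuming a cohort file with these columns:
# cohort = pd.read_csv("cohort.csv")
# print(discrimination_by_endpoint(cohort, "risk_score", "died", "organ_support"))
```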
How is each predictor variable associated with intervention and outcome?
Examining each individual predictor variable can explain why models predict adverse outcome better than intervention. Secondary analysis of the PRIEST data showed4 that age over 80 years, lower functional status, and reduced consciousness level were associated with increased mortality and reduced use of organ support. These variables predict higher risk in the score, explaining the poor prediction of need for organ support. In the study, clinicians may have used older age, lower functional status and reduced consciousness level to identify patients who were unlikely to benefit from intervention (see box 1) and then withheld potentially futile intervention, although we should also consider the possibility that intervention could have prevented adverse outcome.
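A simple way to present this kind of predictor-level check is to tabulate, for each predictor, its unadjusted odds ratio for the adverse outcome alongside its odds ratio for the intervention. The sketch below is illustrative only; the column names are hypothetical, not the PRIEST variables themselves.

```python
# Sketch of tabulating each predictor's unadjusted odds ratio for the
# adverse outcome and for the intervention. Column names are hypothetical.
import pandas as pd

def odds_ratio(exposure: pd.Series, event: pd.Series) -> float:
    """Unadjusted odds ratio from the 2x2 table of two binary series."""
    a = ((exposure == 1) & (event == 1)).sum()
    b = ((exposure == 1) & (event == 0)).sum()
    c = ((exposure == 0) & (event == 1)).sum()
    d = ((exposure == 0) & (event == 0)).sum()
    return (a * d) / (b * c)

def predictor_table(cohort: pd.DataFrame, predictors: list[str],
                    outcome_col: str, intervention_col: str) -> pd.DataFrame:
    """One row per predictor: OR for the outcome and OR for the intervention."""
    return pd.DataFrame([{
        "predictor": p,
        "OR_outcome": odds_ratio(cohort[p], cohort[outcome_col]),
        "OR_intervention": odds_ratio(cohort[p], cohort[intervention_col]),
    } for p in predictors])

# Hypothetical usage:
# cohort = pd.read_csv("cohort.csv")
# print(predictor_table(cohort, ["age_over_80", "reduced_consciousness"],
#                       "died", "organ_support"))
```

A predictor with an odds ratio above 1 for the outcome but below 1 for the intervention shows the pattern described above: it flags patients at risk of the adverse outcome who are nevertheless unlikely to receive the intervention.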
Do the predictors reflect acute illness or underlying comorbidity and frailty?
Clinical knowledge can be used to judge which predictors in a tool are reflecting acute illness or underlying comorbidity or frailty. Tables 2 and 3 show how the predictors in the Pulmonary Embolism Severity Index and EDACS include characteristics that may reflect acute presentation or pre-existing conditions. Determining which characteristics contribute to the high score may allow more appropriate use of these scores in supporting clinical decision making, rather than simply admitting or referring those with a high score. This is not to suggest that treatment may not be beneficial in the context of comorbidity and frailty, or that treatment cannot influence underlying comorbidity or frailty, but that treatment should focus on the likely cause of the risk.
How can we make better use of clinical prediction scores?
Clinical prediction scores can support but should not dictate decision making. This means that guidelines should not use clinical prediction scores to direct patient management but may recommend calculating a score to inform clinical judgement. If we use a score to estimate the risk of an adverse outcome, then we should recognise that this probably represents the risk for a treated patient, so we should not give false reassurance that intervention will eliminate the risk. If the tool suggests a significant risk of adverse outcome, then we should use clinical judgement to determine whether the risk is related to acute illness or underlying frailty and long-term conditions. We should then be honest and realistic about the potential for intervention to reduce the risk.
Making better use of clinical prediction tools requires better evidence. Studies of clinical prediction models should report prediction for interventions and outcomes separately, so readers can judge how well the model predicts intervention and how each predictor is associated with intervention and outcome. Studies could use expert adjudication to determine whether outcomes were avoidable or were avoided through intervention. Finally, big data and data analytics can produce models that predict adverse outcomes with impressive accuracy and narrow CIs, but we still need to understand how prediction works and whether treatment can prevent adverse outcomes. This may be difficult if data analytics researchers do not describe their complex models in a transparent manner. Box 2 summarises the key messages for users and developers of clinical prediction scores.
Key messages
For users of clinical prediction scores
Do not use clinical prediction scores to dictate decision making—always use clinical judgement.
Think about what the clinical prediction score is predicting—is it predicting adverse events that occurred despite treatment?
Consider whether the risk of adverse outcome is related to acute illness or underlying frailty and long-term conditions.
Consider whether intervention, particularly hospital admission, can reduce the risk of adverse outcome.
Beware of using a clinical prediction model if you do not understand how it works.
For developers of clinical prediction scores
Record interventions that may prevent adverse outcomes.
Report prediction (of the model and individual predictors) for interventions and outcomes.
Consider using expert adjudication to determine whether outcomes were avoidable or whether intervention prevented adverse outcome.
Draw on clinical expertise when developing a score or rule from the model.
Present the findings as information to support, rather than dictate, decision making.
Ethics statements
Patient consent for publication
Ethics approval
Not applicable.
Footnotes
Handling editor Richard Body
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests SG was the Chief Investigator for the PRIEST Study.
Provenance and peer review Not commissioned; externally peer reviewed.