Objectives Designed to detect early deterioration of the hospitalised child, paediatric early warning scores (PEWS) are less well validated in the emergency department (ED). We aimed to evaluate the sensitivity and specificity of two commonly used PEWS (Brighton and COAST) in predicting hospital admission and, for the first time, significant illness.
Methods Retrospective analysis of PEWS data for paediatric ED attendances at St Mary's Hospital, London, UK, in November 2012. Patients with missing data were excluded. Diagnoses were grouped: medical and surgical. To classify diagnoses as significant, established guidelines were used and, where not available, common agreement between three acute paediatricians.
Results 1921 patients were analysed. There were 211 admissions (11%). 1630 attendances were medical (85%) and 273 (14%) surgical. Brighton and COAST PEWS performed similarly. Hospital admission: PEWS of ≥3 was specific (93%) but poorly sensitive (32%). The area under the receiver operating curve (AUC) was low at 0.690. Significant illness: for medical illness, PEWS ≥3 was highly specific (96%) but poorly sensitive (44%). The AUC was 0.754 and 0.755 for Brighton and COAST PEWS, respectively. Both scores performed poorly for predicting significant surgical illness (AUC 0.642). PEWS ≥3 performed well in predicting significant respiratory illness: sensitivity 75%, specificity 91%.
Conclusions Both Brighton and COAST PEWS scores performed similarly. A score of ≥3 has good specificity but poor sensitivity for predicting hospital admission and significant illness. Therefore, a high PEWS should be taken seriously but a low score is poor at ruling out the requirement for admission or serious underlying illness. PEWS was better at detecting significant medical illness compared with detecting the need for admission. PEWS performed poorly in detecting significant surgical illness. PEWS may be particularly useful in evaluating respiratory illness in a paediatric ED.
- emergency department
- clinical assessment, effectiveness
- clinical assessment, education
What is already known on this subject?
Previous studies have shown paediatric early warning scores (PEWS) to be specific but not sensitive in predicting admission to hospital from the paediatric emergency department (ED). No study has yet looked at PEWS validity in predicting significant illness in children presenting to the ED.
What might this study add?
Our study suggests that PEWS is specific but not sensitive for predicting significant illness in the ED and is better at predicting significant medical illness, especially respiratory illness, compared with significant surgical illness. PEWS should, therefore, be taken seriously if abnormal but a low PEWS should not falsely reassure.
Since the Brighton score was validated in 2005,1 various paediatric early warning scores (PEWS) have been used; they were designed to detect the early deterioration of the hospitalised child.
The UK National Patient Safety Agency and the National Institute for Health and Care Excellence, together with the UK Confidential Enquiry into Maternal and Childhood report ‘Why Children Die’, recommended early warning scores to identify children in hospital with developing critical illness.2 ,3 Despite their widespread implementation, studies examining their efficacy vary in quality, making validation variable for different scores.4 ,5 With the introduction of the 4 h target in UK emergency departments (EDs), there is increasing pressure to make decisions on patient management.6 PEWS, which include the commonly used Brighton and COAST systems, were designed to reflect trends in physiological condition, thus allowing the early detection of deterioration and hence prompt timely intervention for the hospitalised child. However, whether a one-off ‘snapshot’ of physiological parameters in the ED can predict the need for hospital admission or illness severity has not been established. With PEWS use in the ED increasing,7 it is necessary to know how well this tool predicts (1) hospital admission and (2) significant illness. Only two studies6 ,8 have specifically examined the ability of PEWS to predict hospital admission from the children's ED, using 424 and 1223 study patients, respectively. In both studies, the Brighton score was used. PEWS was found to be specific but not sensitive. No studies to our knowledge have examined their ability to predict serious illness in the children's ED. Two commonly used scoring systems that are advocated in the UK by NHS Improving Quality (previously known as the NHS Institute for Innovation and Improvement)9 are Brighton and COAST. Both use an abnormal HR, RR, work of breathing, level of consciousness, the need for supplemental oxygen and parental/medical staff concern to create an illness score.
COAST was adapted from Brighton PEWS for use specifically in the ED whereby hypoxia, thought to be a more useful marker of illness severity than supplemental oxygen, is used to trigger a score. This study aims to estimate the diagnostic accuracy of two commonly used PEWS scores (Brighton and COAST) (1) to predict hospital admission and (2) to detect significant illnesses among unselected paediatric patients presenting to the ED.
This was a retrospective analysis of routinely collected clinical observations recorded on arrival to the paediatric ED at St Mary's Hospital, London (Imperial College NHS Trust, UK). The patients studied attended in the month of November 2012, during which there were 2261 attendances. November was chosen because it was a busy month for paediatric ED attendances, likely owing to it being winter, which allowed a larger sample to be studied. Previous studies have used study populations between 87 and 1336;6 ,8 ,10–13 therefore, we regarded our sample to be of sufficient size. Data to calculate PEWS were retrieved from the ED Ascribe Symphony system and subsequently anonymised. The data for every attendance in the ED were considered for inclusion. Where data were missing, the patient's electronically scanned notes were examined, and if data were still missing these patients were excluded. All PEWS included in the analysis were therefore complete; patients with incomplete scores were among the 340 patients excluded.
Table 1 shows how PEWS were calculated for both Brighton and COAST scores. Each abnormal parameter scores 1 point with a maximum score possible of 6. The data used to calculate PEWS came from the first assessment when the child arrived in the ED.
Both Brighton and COAST score 1 point for parental concern. We automatically assigned a score of ‘1’ for every patient presenting to the ED on the basis that parents/carers, by bringing their child to the ED, were by default concerned.
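The scoring rule described above can be sketched in a few lines. This is a minimal illustration only: the age-specific cut-offs that define an "abnormal" parameter are in Table 1 and are not reproduced here, so the sketch assumes each parameter has already been judged abnormal or not. The class and field names are hypothetical, and the split between the oxygen items follows the description of the two scores in the introduction.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """Abnormality flags from the first ED assessment (hypothetical field names)."""
    abnormal_heart_rate: bool
    abnormal_respiratory_rate: bool
    increased_work_of_breathing: bool
    reduced_consciousness: bool
    on_supplemental_oxygen: bool  # oxygen item as described for Brighton
    hypoxic: bool                 # oxygen item as described for COAST


def pews(obs: Observations, system: str = "brighton") -> int:
    """Return a score from 1 to 6: one point per abnormal parameter."""
    if system == "brighton":
        oxygen_item = obs.on_supplemental_oxygen
    elif system == "coast":
        oxygen_item = obs.hypoxic
    else:
        raise ValueError(f"unknown scoring system: {system}")

    # Study convention: every attendance scores 1 for parental concern.
    parental_concern = True

    items = [
        obs.abnormal_heart_rate,
        obs.abnormal_respiratory_rate,
        obs.increased_work_of_breathing,
        obs.reduced_consciousness,
        oxygen_item,
        parental_concern,
    ]
    return sum(items)


# Hypothetical child: tachycardic and hypoxic, but not yet on oxygen.
child = Observations(True, False, False, False, False, True)
print(pews(child, "brighton"), pews(child, "coast"))
```

Because parental concern is always scored, the lowest attainable score under this convention is 1 rather than 0.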
Each patient outcome, that is, admitted, not admitted or sent to another hospital for further assessment, was recorded. If a parent/carer left the department before being seen by a clinician, this was classified as a non-admission. If a child was transferred to another facility urgently, this counted as an admission.
Diagnoses recorded on Ascribe Symphony were extracted for each patient. Owing to the large heterogeneity in diagnoses, diagnoses were grouped into two broad categories of illness: medical and surgical. Surgical included trauma, with medical encompassing all of the other diagnoses. Subdivision of diagnoses involved a ‘systems’ approach, for example, infection and respiratory. Each illness was then grouped into ‘minor’ and ‘significant’. ‘Significant’ was defined as a condition of sufficient potential severity that could result in acute morbidity/mortality. Where possible, national guidelines were used to classify diagnoses as minor or significant. We first reviewed the recorded diagnoses before us in the data set, and then searched for guidelines that graded severity. An example of a severity guideline was the British Thoracic Society guideline 2011 on the management of acute asthma.14 Where there was not an established guideline, common agreement based on the clinical opinion between the three authors, two of whom were consultants in paediatric emergency medicine (IM and GH) and one who was a paediatric registrar (PL), was used to create an illness severity classification. See online supplementary material for the details of the diagnostic grouping and classification. PL and GH each separately assigned 50% of patients to illness severity groups. To see whether PL and GH interpreted the illness classifications similarly, that is, had good agreement, a kappa calculation was performed on a random sample of 200 patients taken from the whole data set (10.4% of the total sample) after illness allocation was completed. PL and GH were blinded as to which patients they had analysed initially from this sample. Kappa was 0.74 (95% CI 0.57 to 0.91), which is regarded as good statistical agreement. 
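For readers unfamiliar with the agreement statistic, the following sketch shows how a two-rater kappa can be computed. It assumes Cohen's kappa (the text says only "a kappa calculation"), and the ratings are invented for illustration; they are not the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty sample")
    n = len(rater_a)

    # Proportion of patients on whom the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings for 10 patients (not study data).
rater_1 = ["minor"] * 6 + ["significant"] * 4
rater_2 = ["minor"] * 5 + ["significant"] * 4 + ["minor"]
print(round(cohens_kappa(rater_1, rater_2), 2))
```

In this toy sample the raters agree on 8 of 10 patients, but chance alone would produce 52% agreement, so kappa is lower than the raw agreement.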
Diagnoses were assigned to minor and significant groups before PEWS was calculated from the observational data; therefore, the authors were blinded to the relationship between each diagnosis and its PEWS.
Methods of analysis
For both Brighton and COAST PEWS, the sensitivity, specificity and positive and negative likelihood ratios for PEWS of ≥2, ≥3 and ≥4 were calculated in relation to predicting (1) admission and (2) significant medical and surgical illness. Receiver operating curves (ROCs) and the area under the curves (AUCs) were generated for Brighton and COAST PEWS in relation to admission and illness severity. Data were analysed using Microsoft Excel 2010 and IBM SPSS Statistics V.22.0.
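As an illustration of how sensitivity and specificity follow from a score threshold, the sketch below treats a PEWS at or above the threshold as a positive test and tallies the resulting 2×2 table. The study itself used Excel and SPSS; the function name and the ten-patient data set are hypothetical.

```python
def sensitivity_specificity(scores, outcomes, threshold):
    """Treat score >= threshold as a positive test and compare with outcomes.

    scores: PEWS value per patient.
    outcomes: True if the outcome (e.g. admission) occurred.
    """
    tp = fp = tn = fn = 0
    for score, outcome in zip(scores, outcomes):
        positive = score >= threshold
        if positive and outcome:
            tp += 1
        elif positive and not outcome:
            fp += 1
        elif not positive and outcome:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and without the outcome")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores and outcomes for 10 patients (not study data).
scores = [1, 1, 2, 2, 3, 3, 4, 1, 2, 5]
admitted = [False, False, False, True, True, False, True, False, False, True]

for threshold in (2, 3, 4):
    sens, spec = sensitivity_specificity(scores, admitted, threshold)
    print(f"PEWS >= {threshold}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Raising the threshold trades sensitivity for specificity, which is the pattern the results below report for thresholds of ≥2, ≥3 and ≥4.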
There were 2261 attendances to the paediatric ED in November 2012. Data to calculate PEWS were missing from the Symphony system in 565 cases. After examination of the electronic notes, missing data on 225 patients were retrieved and so these patients were included. The remaining 340 patients were excluded. The final data analysis was on 1921 patients (figure 1). Age range was 2 days to 17 years. All patients included in the analysis had their observations recorded upon arrival to the ED triage or within the ED resuscitation area (if taken straight to the resuscitation area). Baseline characteristics are summarised in table 2.
Figure 2 summarises the spread of all significant and minor illness, the breakdown of significant and minor medical and surgical illness, and further breakdown into illness subcategory. There were 211 admissions constituting 11% of attendances. Five patients transferred to another facility for urgent subspecialty care were classified as admissions. Attendances owing to all medical illness totalled 1630 (85%) and 273 (14%) had surgical conditions. The remaining 1% comprised deliberate self-harm with minor cuts, a safe-guarding case, and neonates with minor feeding difficulties. Of the total medical attendances, 268 (16.4%) were significant; and of the total surgical illnesses, 30 (11%) were significant.
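The cohort flow and proportions reported in the two paragraphs above can be checked with simple arithmetic. The sketch below uses only counts stated in the text; the variable and helper names are illustrative.

```python
# Counts as reported in the text.
attendances = 2261
missing_initially = 565
retrieved_from_notes = 225

excluded = missing_initially - retrieved_from_notes  # 340
analysed = attendances - excluded                    # 1921


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


summary = {
    "excluded, % of attendances": pct(excluded, attendances),  # 15.0
    "admissions, % of analysed": pct(211, analysed),           # 11.0
    "medical, % of analysed": pct(1630, analysed),             # 84.9
    "surgical, % of analysed": pct(273, analysed),             # 14.2
    "significant, % of medical": pct(268, 1630),               # 16.4
    "significant, % of surgical": pct(30, 273),                # 11.0
}

for label, value in summary.items():
    print(f"{label}: {value}")
```

The 18 attendances that are neither medical nor surgical (1921 − 1630 − 273) correspond to the remaining 1% described in the text.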
AUCs for PEWS in relation to hospital admission and predicting significant illness
Table 3 summarises the AUCs for Brighton and COAST PEWS in relation to predicting hospital admission and significant medical and surgical illness. An illness subclassification is also displayed. The AUC (0.690) was identical for both Brighton and COAST PEWS with regard to admission, and so the Brighton ROC is displayed (figure 3). For hospital admission, PEWS would be regarded as a ‘poor’ test for diagnostic accuracy based on this AUC. With respect to significant medical illness, again Brighton and COAST performed very similarly with AUCs of 0.754 and 0.755, respectively. This would be regarded as a ‘fair’ test for diagnostic accuracy. As the ROCs were almost identical, only the Brighton ROC is displayed in figure 4. Overall, both Brighton and COAST scores performed poorly as a predictive tool for detecting significant surgical illness, with an identical AUC of 0.642. The Brighton ROC is displayed in figure 5. The AUCs for significant illness subcategories are displayed in table 3. The only significant illness subcategory in which PEWS performed as a ‘good’ test was respiratory illness with the AUC being 0.900 and 0.866 for Brighton and COAST scores, respectively. The Brighton ROC for respiratory illness is displayed in figure 6.
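To make the AUC concrete: for an ordinal score such as PEWS, the empirical AUC equals the probability that a randomly chosen patient with the outcome has a higher score than a randomly chosen patient without it, counting ties as one half. The sketch below computes this directly. It is not the SPSS procedure used in the study, whose settings are not described, and the scores are invented for illustration.

```python
def auc(scores_with_outcome, scores_without_outcome):
    """Empirical AUC via pairwise comparison (ties count as 0.5)."""
    if not scores_with_outcome or not scores_without_outcome:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in scores_with_outcome:
        for control in scores_without_outcome:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(scores_with_outcome) * len(scores_without_outcome))


# Hypothetical PEWS values (not study data).
admitted_scores = [2, 3, 4, 5]
discharged_scores = [1, 1, 2, 3, 1, 2]
print(round(auc(admitted_scores, discharged_scores), 3))
```

An AUC of 0.5 means the score discriminates no better than chance, and 1.0 means every patient with the outcome scored higher than every patient without it.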
Sensitivity, specificity and likelihood ratio results for admission and significant illness
A summary of values is given for Brighton and COAST scores in tables 4 and 5, respectively. Brighton and COAST performed very similarly in all measures tested. A PEWS score of ≥3 was highly specific (93%) for admission, but only 32% sensitive. For significant medical illness, a PEWS of ≥3 was 96% specific but only 44% sensitive and with relation to surgical illness PEWS ≥3 was 100% specific and 10% sensitive. No patient had a PEWS ≥4 in the COAST group among those with surgical illness. For significant respiratory illness, Brighton was 74% sensitive and 90% specific and COAST was 75% sensitive and 91% specific for a PEWS ≥3. For all outcomes measured, positive likelihood ratios rose exponentially as the PEWS increased though negative likelihood ratios stayed close to 1.
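Likelihood ratios follow directly from sensitivity and specificity: LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The sketch below applies these formulas to the rounded percentages quoted above for PEWS ≥3. Because the inputs are rounded, the outputs are approximations and may differ from the values in tables 4 and 5.

```python
import math


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    # With perfect specificity there are no false positives, so LR+ is unbounded.
    lr_pos = math.inf if specificity == 1.0 else sensitivity / (1 - specificity)
    # With zero specificity LR- is likewise unbounded.
    lr_neg = math.inf if specificity == 0.0 else (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Rounded values quoted in the text for PEWS >= 3: (sensitivity, specificity).
reported = {
    "admission": (0.32, 0.93),
    "significant medical illness": (0.44, 0.96),
    "significant surgical illness": (0.10, 1.00),
}

for outcome, (sens, spec) in reported.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{outcome}: LR+ {lr_pos:.2f}, LR- {lr_neg:.2f}")
```

A large LR+ alongside an LR− not far below 1 is the numerical form of the paper's message: a high score raises suspicion considerably, whereas a low score lowers it only modestly.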
PEWS are specific but not sensitive in predicting hospital admission and significant illness. ROCs demonstrated PEWS was a poor tool for predicting hospital admission and significant surgical illness, and fair at predicting significant medical illness. Based on these data, for all the outcomes measured (admission and significant medical and surgical illness) a PEWS threshold of ≥3 is the point closest to the top-left corner of all the plotted ROCs, hence representing the optimal trade-off between sensitivity and specificity, with a specificity of >90% for admission and significant medical and surgical illness, and a sensitivity of 32% for admission, 44% for medical illness and 10% for surgical illness. A PEWS threshold of <3 has poor specificity, while a threshold of ≥4 offers little increase in specificity and a decline in sensitivity because fewer patients have such high PEWS scores.
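The "closest to the top-left corner" criterion can be expressed as the Euclidean distance from each ROC point to the ideal point (sensitivity 1, specificity 1). The sketch below shows the calculation; the three operating points are hypothetical values chosen for illustration and are not taken from the study's tables.

```python
import math


def distance_to_top_left(sensitivity, specificity):
    """Distance from an ROC point to the ideal corner (sensitivity=1, specificity=1)."""
    return math.hypot(1 - sensitivity, 1 - specificity)


def closest_threshold(points):
    """points maps threshold -> (sensitivity, specificity); return the best threshold."""
    return min(points, key=lambda t: distance_to_top_left(*points[t]))


# Hypothetical operating points (not study data).
points = {
    2: (0.60, 0.50),
    3: (0.55, 0.90),
    4: (0.20, 0.98),
}

for threshold, (sens, spec) in points.items():
    print(threshold, round(distance_to_top_left(sens, spec), 3))
print("closest to top-left:", closest_threshold(points))
```

This criterion weights sensitivity and specificity equally; a different weighting might be preferred where missing a sick child is considered costlier than an unnecessary admission.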
Previous work has examined illness severity in already hospitalised children in relation to PEWS, and a recent review of 10 studies encompassing a range of PEWS scoring systems included a total of 17 943 children.1 ,11–13 ,15–21 Area under the ROCs for predicting hospitalisation was poor to moderate (range 0.56–0.68) with sensitivity and specificity being 36.4–85.7% and 27.1–90.5%, respectively. Only two of the studies included ED patients: 46 patients admitted from the ED to paediatric intensive care department (PICU),17 and another study that examined the incidence of cardiorespiratory arrests before and after the introduction of a medical emergency team and thus did not examine PEWS’ direct relationship to general hospital admission.18 Scoring systems are used frequently within the paediatric ED with regard to triage such as the Manchester Scoring System.22 ,23 The Pediatric Risk of Admission Score exists to predict the risk of hospitalisation in children and has a sensitivity >80%,24–26 but is not widely used in the UK.
Specifically focusing on Brighton PEWS, Bradman6 in 2008 examined 424 children with any medical problem attending a paediatric ED and found the Brighton PEWS sensitivity and specificity for hospital admission to be 24% and 96%, respectively, for a score ≥4. In 2012, Bradman and colleagues8 prospectively studied 1223 children attending the ED. In addition to assessing Brighton PEWS validity in predicting hospital admission, they also examined triage nurse (TN) opinion on whether the child needed admission. TN opinion most accurately predicted hospital admission having a prediction accuracy of 87.7%, followed by an elevated PEWS with a prediction accuracy of 82.9%. PEWS ≥4 had poor sensitivity (14%) but good specificity (98%). This study included a relatively even split of medical, surgical and injured patients, but subgroup analyses of these groups and diagnostic severity were not examined. The recently completed Paediatric Observation Priority Score (POPS) study evaluated a novel aggregate scoring system designed for use in the ED.27 Examining 2068 patients under 16 years, the AUC for predicting hospital admission was 0.73 for medical patients and 0.69 for trauma patients, with sensitivity and specificity for a POPS ≥3 being 36% and 93%, respectively. With regard to predicting admission, our study showed similar results to the above studies, that is, good specificity but poor sensitivity. Therefore, as a screening tool for admission, the test performs increasingly poorly as the measure of illness severity increases. The worsening of sensitivity with a rising PEWS may be owing to diminishing population size in ascending PEWS categories. Unlike Egdell and Chaiyakulsil,10 ,17 our study did not examine the association between PEWS and admission to PICU from the ED because the number of PICU admissions from a single centre in a 1-month period would be too small to yield informative data.
Similar to hospital admission, we demonstrated that PEWS was specific for predicting significant illness, and for a PEWS of ≥3 suspicion of underlying significant illness should be taken seriously. Illness subgroup analysis demonstrated that PEWS performed well as a tool for detecting significant respiratory illness, with ROCs for Brighton and COAST scores yielding AUCs of 0.900 and 0.866, respectively. The robustness of this result may be owing to this illness group having the largest population size. The time of year (winter) probably accounts for the high proportion of respiratory illnesses. Interestingly, this AUC was similar to the finding by Breslin et al,28 who found that Brighton PEWS had an AUC of 0.80 for those who were admitted to hospital with respiratory illnesses, but the AUC was only 0.63 (95% CI 0.57 to 0.69) for those admitted with all other illnesses. The Breslin study calculated an ROC specifically for respiratory illness. A collective ROC was done for all other illnesses combined, and a classification of illness severity was not done. Nonetheless, it is interesting that PEWS seems to have a relationship with respiratory illness and perhaps future work should examine tailoring early warning scores to individual illnesses.
Our study has shown that two commonly used PEWS (Brighton and COAST) perform similarly. If this finding is replicated in future studies, then there may be an advantage in using a single type of PEWS among paediatric patients in the ED. This would help reduce the confusion caused by having different scoring systems in different hospitals.
Study limitations include this work being a single-centre inner-city retrospective analysis of patient data in a paediatric ED over a 1-month period during the winter where respiratory illnesses predominate. We calculated PEWS based on the data recorded at the first assessment in the ED. Therefore, the potential for evolving physiological data over time was not accounted for. Patients with missing data tended to be either relatively well with minor injuries or were judged to be very ill at triage and so taken straight to the resuscitation area. Therefore, a bias was created by excluding a number of very well or very sick patients from our study. Nonetheless, complete data were retrieved from 225 out of the 565 with missing data fields. Owing to the difficulty in retrospectively delineating parental/carer concern, we automatically assigned a PEWS of 1 to every patient on the basis that parents/carers, by bringing their child to the ED, were by default ‘concerned’ about their child. This scoring approach, however, may not be universal in clinical practice. Owing to the heterogeneity in diagnoses presenting to the ED, a classification system was devised. To classify diagnoses as minor or significant, we used existing guidelines where possible to minimise subjectivity bias. Though disease management guidelines exist for many of the other conditions, illness severity classification into significant and non-significant is not a component of the guidance as the majority of these conditions such as meningococcal sepsis, meningitis, sickle crisis and status epilepticus are by definition significant illnesses. Therefore, we classified such diseases as significant. Where grey areas existed regarding what was significant in the absence of defined guidelines, explanation is given in the online supplementary table next to the diagnosis about our definition.
After completing the allocations of illness severity for the whole sample, we then randomly selected 200 patients for PL and GH to perform an analysis of investigator agreement. Though the statistical agreement was good (kappa 0.74), we acknowledge that this is still imperfect agreement and we did not extend our agreement analysis beyond the random sample of 200. Imperfect agreement could affect sensitivity/specificity especially in the surgical group where the number of patients was small. We also did not perform age subgroup analysis. All patients with missing observational data were excluded; therefore, we did not compare the ROC of PEWS calculated from complete and missing observational data.
In keeping with previous work, this study has found that a PEWS score of ≥3 is specific but not sensitive in predicting hospital admission and, for the first time, has found this PEWS score to be specific but not sensitive in predicting significant illness in the children's ED. There was no difference in the performance of Brighton and COAST scores in all parameters studied.
The implication from this study is that a high PEWS (we propose ≥3) has few false positives and must prompt thought for hospital admission and the investigation of significant illness, but a low PEWS should not be taken to exclude significant illness or the need for admission. Those using Brighton and COAST PEWS within the ED should also note that they were not designed for use within the ED and instead were validated to detect early deterioration in patients who are hospitalised. Given that 15% of our initial sample had irretrievable missing observational data and were subsequently excluded (leading to exclusion bias), future work should include prospective multicentre studies and these should take place over a year to reduce seasonal variability in pathology and increase the heterogeneity of the study population. Age subgroup analysis should also be performed. A comparison of PEWS against TN opinion and established ED triage systems would be useful as suggested by Seiger et al,21 as would comparison with the new ED-specific POPS scoring system that has recently been demonstrated to be a better predictor for hospital admission compared with the Manchester children's early warning system.29 Correlation with patient disposition and length of stay in the hospital would also be useful to incorporate into future analysis. Finally, we propose that future work should examine tailoring early warning scores to individual illnesses based on our findings with respiratory illness.
The authors thank Dr Jill Warner and Karen Harrison White for their help in the review of the manuscript.
Funding NIHR Imperial BRC funding, P46153.
Competing interests IM is in receipt of a grant from National Institute for Health Research (NIHR) Biomedical Research Centre based at Imperial College Healthcare NHS Trust and Imperial College London.
Ethics approval The local research compliance body classified this study as audit since we were assessing established illness scoring systems within their remit, that is, to score illness severity, using anonymised retrospective data. Therefore, ethical approval was not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Unpublished raw data on PEWS in relation to age subgroups are available and in possession of the main author PJL.