Objectives: To determine the sensitivity and specificity of paediatric major incident triage scores. The Paediatric Triage Tape (PTT), Careflight, Simple Triage and Rapid Treatment (START), and JumpSTART systems were tested.
Methods: In total, 3461 children presenting to a South African emergency department with trauma were scored using the four different methods. The sensitivity and specificity of the four scores were calculated against the Injury Severity Score (ISS), New ISS (NISS), and a modification of the Garner criteria (a measure of need for urgent clinical intervention). We also performed a Bayesian analysis of the scores against three different types of major incident.
Results: None of the tools showed high sensitivity and specificity. Overall, the Careflight score had the best performance in terms of sensitivity and specificity. The performance of the PTT was very similar. In contrast, the JumpSTART and START scores had very low sensitivities, which meant that they failed to identify patients with serious injury, and would have missed the majority of seriously injured casualties in the models of major incidents.
Conclusion: The Careflight or PTT methods of triage should be used in paediatric major incidents in preference to the JumpSTART or START methods.
- ISS, Injury Severity Score
- NISS, New Injury Severity Score
- PTT, Paediatric Triage Tape
- START, Simple Triage and Rapid Treatment
Although major incidents are relatively uncommon events,1 they can seriously test the responses of emergency medical services and hospitals.2 All major incidents are characterised by a period of time when the casualty load exceeds the available resources. It is therefore vital that medical resources are effectively directed towards those patients who are most likely to benefit. A key step in facilitating a smooth response is effective triage, which occurs in two phases. At the scene of an incident, primary triage is a rapid “once over” to quickly identify those patients in most urgent need of medical intervention and those who can wait for further assessment. Secondary triage usually occurs at the location of the incident’s main treatment centre, where time and resources allow for a more in depth triage process.
Children are commonly involved in major incidents, either as a significant proportion of the casualties or as the total patient load.3 If children are involved, a number of factors influence and complicate triage decisions. Firstly, children have different physiological norms. Such differences mean that using adult scores on children will often lead to an inappropriately high triage category.4 Secondly, there is often an emotional desire among rescuers to accord children, and especially young children, a higher priority. Both these factors may mean that resources will be directed away from more seriously injured adults (in a mixed adult/child incident) or that the score may fail to discriminate priorities at all (in a child only incident). In order to try to minimise these predictable problems, specific paediatric primary triage algorithms have been devised. These include: (a) the Paediatric Triage Tape (PTT),4 used in the UK, and parts of Europe, India, Australia, and South Africa; (b) CareFlight,5 in use in parts of Australia; (c) Simple Triage and Rapid Treatment (START),6 in use in the USA for children aged older than 8 years; and (d) JumpSTART,7 in use in the USA for children aged 1–8 years.
For practical and ethical reasons, primary triage algorithms are highly unlikely ever to be validated in real incidents. Computer modelling and major incident registries may help future work in this area, although there are obvious potential problems with the validity of such data. Typically, triage algorithms have been compared against the gold standard of the Injury Severity Score (ISS),8 although some authors have suggested that the New ISS (NISS)9 may be better.10 However, the use of anatomical measures of injury such as the ISS has been questioned, as it fails to predict the requirement for medical intervention accurately.11 Neither ISS nor NISS gives any indication of the requirement for medical intervention at the scene of a major incident, which must surely be the most important outcome of any primary triage score. Garner et al12 proposed the use of clinical interventions in place of ISS in the validation of adult major incident primary triage tools: the requirement for any of these interventions was taken as indicating a T1 (immediate priority) patient. These interventions are presented in table 1, and are easily modifiable to be applicable to the paediatric setting.
In this study, our aim was to determine the sensitivity and specificity of primary triage scores in the assessment of paediatric casualties.
We prospectively tested paediatric triage scores on paediatric attendees at the Trauma Unit of the Red Cross Children’s Hospital, Cape Town. This unit sees children up to 12 years of age and is the major tertiary referral centre for the Cape Town area, receiving approximately 9000 injured children each year.
We prospectively collected data on all attendees meeting the following criteria: age <13 years, and presentation within 12 hours of an acute injury. The physiological, anatomical, and demographic information needed to complete the different scores was collected at triage using standardised printed material (for the PTT, CareFlight, and START or JumpSTART, depending on the child’s age). All children were prospectively followed through to death or discharge, when the ISS and NISS scores were calculated. In addition, the case notes were examined for evidence of any of the modified Garner criteria.
We defined the performance of the scores against their ability to discriminate between T1 (immediate priority) and not-T1 (urgent or delayed priority). For comparison against ISS, children were considered to be seriously injured (and therefore rated as T1) if they had a total ISS >15. Children with an ISS ⩽15 were considered to be not-T1. The same cutoff was applied against the NISS. For analysis against the modified Garner criteria, the requirement for one or more of these interventions was considered an indicator that the child was T1.
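The reference-standard rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the `ChildOutcome` record, its field names, and the `reference_t1` function are hypothetical, and only the cutoffs (ISS or NISS above 15, or at least one modified Garner intervention) come from the text.

```python
from dataclasses import dataclass

# Cutoff taken from the text: a score strictly greater than 15 counts as T1.
SEVERITY_CUTOFF = 15


@dataclass
class ChildOutcome:
    """Hypothetical per-child record assembled at death or discharge."""
    iss: int                   # Injury Severity Score
    niss: int                  # New Injury Severity Score
    garner_interventions: int  # number of modified Garner criteria met


def reference_t1(child: ChildOutcome) -> dict:
    """Label the child T1 (True) or not-T1 (False) under each reference standard."""
    return {
        "ISS": child.iss > SEVERITY_CUTOFF,
        "NISS": child.niss > SEVERITY_CUTOFF,
        "Garner": child.garner_interventions >= 1,
    }
```

Note that the three standards can disagree for the same child, which is why each triage score is assessed against each standard separately.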
The sensitivity and specificity of the PTT, Careflight, and START/JumpSTART were calculated individually against the ISS, NISS, and modified Garner criteria. Sensitivity is the proportion of T1 patients who are correctly identified as T1, while specificity is the proportion of not-T1 patients who are correctly identified as not-T1.
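As an illustration of these two definitions, the following Python sketch computes both measures from paired labels. The function name and the toy data in the usage note are assumptions made for the example and are not taken from the study.

```python
def sensitivity_specificity(predicted_t1, reference_t1):
    """Compute (sensitivity, specificity) from paired boolean labels.

    predicted_t1: triage score result for each child (True = triaged T1)
    reference_t1: reference standard for each child (True = truly T1)
    """
    tp = fn = tn = fp = 0
    for predicted, truth in zip(predicted_t1, reference_t1):
        if truth and predicted:
            tp += 1   # truly T1, triaged T1
        elif truth:
            fn += 1   # truly T1, missed by the score
        elif predicted:
            fp += 1   # not-T1, over-triaged as T1
        else:
            tn += 1   # not-T1, triaged not-T1
    # Guard against an empty class so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

For example, with five made-up children where the score flags the first and fourth and the reference standard flags the first and second, the function returns a sensitivity of 1/2 and a specificity of 2/3.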
To determine how the scores would perform in practice, we calculated the ability of each score to perform in three different types of major incident with varying proportions of seriously injured casualties. The principal outcome was the proportion of children correctly identified as truly T1 and truly not-T1 against falsely T1 and falsely not-T1 (that is, the accuracy of the score for each scenario). The characteristics of the hypothetical incidents were as shown below:
Incident 1: 100 paediatric casualties, 10% T1
Incident 2: 100 paediatric casualties, 30% T1
Incident 3: 100 paediatric casualties, 60% T1.
The results against the hypothetical incidents were rounded to the nearest whole number. The flowcharts for each triage methodology are available online (http://www.emjonline.com/supplemental).
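One way to reproduce this kind of modelling is sketched below in Python, under the assumption that expected counts are obtained by applying a score's sensitivity and specificity to the T1 and not-T1 casualties of each incident. The function name `model_incident` and the example sensitivity and specificity in the usage note are illustrative placeholders, not values from the study.

```python
def model_incident(sensitivity, specificity, n_total=100, n_t1=10):
    """Expected triage outcome for a hypothetical incident.

    n_total: number of paediatric casualties
    n_t1:    number of those casualties who are truly T1
    """
    n_not_t1 = n_total - n_t1
    true_t1 = sensitivity * n_t1            # truly T1, triaged T1
    false_not_t1 = n_t1 - true_t1           # truly T1, missed
    true_not_t1 = specificity * n_not_t1    # not-T1, triaged not-T1
    false_t1 = n_not_t1 - true_not_t1       # not-T1, over-triaged
    return {
        # Counts are rounded to whole casualties. Python's round() sends
        # exact halves to the nearest even integer, which may differ from
        # the rounding convention used for the published tables.
        "true_T1": round(true_t1),
        "false_not_T1": round(false_not_t1),
        "true_not_T1": round(true_not_t1),
        "false_T1": round(false_t1),
        # Accuracy uses the unrounded expected counts.
        "accuracy": (true_t1 + true_not_t1) / n_total,
    }


# The three hypothetical incidents: 10%, 30% and 60% T1 among 100 casualties.
INCIDENT_T1_COUNTS = {"Incident 1": 10, "Incident 2": 30, "Incident 3": 60}
```

Calling `model_incident(0.5, 0.9, n_t1=INCIDENT_T1_COUNTS["Incident 1"])` with these placeholder inputs splits the 10 T1 casualties into identified and missed groups and the 90 not-T1 casualties into correctly and incorrectly triaged groups. Because sensitivity only affects the T1 casualties, a score with low sensitivity loses accuracy fastest as the proportion of T1 casualties rises.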
In the study period, 5508 children presented to the trauma unit within 12 hours of injury. Of these, 3597 children met the entry criteria for the study, and 3461 (96%) children were enrolled. The study population was 63% male, with a median age of 7 years. JumpSTART was used to triage 2441 children (aged 1–8 years); the remaining 1020 were triaged by START methodology in accordance with the algorithms’ instructions.
Of the 3461 patients in this study, 1983 (57.3%) presented within 1 hour of injury, 2476 (71.5%) within 2 hours, and 2910 (84%) within 4 hours. There were 46 patients (1.3%) with penetrating trauma.
There were 188 children (5.4%) with an ISS of >15 and 314 (9.1%) with an NISS >15, and 312 modified Garner criteria were present in 200 (5.8%) children. For each of these three standards, the sensitivity and specificity rates for the different triage algorithms are presented in table 2.
Table 3 shows how each score performs in each type of hypothetical incident with differing proportions of seriously injured casualties. The score with the best performance in each incident is marked in bold. The JumpSTART and START methods were analysed independently and also as a 50:50 split, as they are components of the same triage system, differing only in the age group to which each is applied.
We found that there are significant differences in the performance of the triage scores when analysed against a pool of patients presenting to an emergency department. Analysis of the sensitivity and specificity figures suggests that the performance of the PTT and CareFlight scores is similar, and both are better than the JumpSTART and START scores. The JumpSTART and START scores have worryingly low sensitivities when measured against anatomical injuries, resulting in identification of very few patients with serious injury; in other words, they miss the majority of serious anatomical injuries.
It is our belief that the Garner criteria are probably a better measure of score performance than the anatomical descriptors of injury. In this regard, overall performance of the CareFlight and PTT scores is better than the JumpSTART/START methodologies in all but the most severe incidents. Overall, the CareFlight score appears to be the best performing, although the difference between it and the PTT is probably clinically insignificant.
Strengths and weaknesses of the study
Our study uniquely applied a range of scores simultaneously to the same group of paediatric patients presenting with trauma. This allowed us to determine the performance of each score against interventional and anatomical criteria, and to draw direct comparisons between the scores. Our analysis against hypothetical major incidents shows how a score might actually help triage officers in the field with their triage decisions. In essence, it informs us of how well the score might discriminate between those who need immediate care and those who do not.
Our study does have some weaknesses. The regular recording of the triage score criteria over a period of months may have led to a much greater degree of familiarity with the methods than could be expected in a real incident. Our results therefore probably demonstrate the best performance that the scores could hope to achieve. While this study was designed to prospectively assess the usefulness of the primary triage algorithms, the number of patients classified as T1 by ISS (or NISS/modified Garner criteria) is relatively small. However, as the majority of casualties from a major incident are likely to have minor injuries,1 the patient distribution in this study is representative.
We had to modify the Garner criteria to a paediatric population but believe that the changes made are intuitive and reflect current paediatric resuscitation.13
Comparing developed world algorithms in a developing country may lead to bias in the conclusions, as the physiological parameters used by the tool may be different in that country. However, work undertaken by one of the authors14 shows that the heart rate and respiratory rate of children in the UK and South Africa may be considered the same by age. Hence, direct extrapolation of the results to USA or UK populations should be possible. It should also be remembered that we tested tools in a hospital setting, not in the prehospital environment where they would be used.
Strengths and weaknesses in relation to other studies
Many experts still consider that the ISS is the only appropriate means against which to validate triage algorithms: it has been studied extensively as a summary measure against which day to day triage tools are tested. An ISS of ⩾16 is widely regarded as indicating serious injury, and this cutoff point is used to direct patients to trauma centre care in regionalised systems such as that in the USA.15,16 The use of NISS has been suggested to be a more accurate indicator of severity of injury,10 although it has still to gain wide acceptance.
However, the ISS (and NISS) were not designed to serve as markers of resource requirement, and there is good evidence that the ISS fails to correlate with this measure.11 The NISS is likely to suffer from the same limitation, although it has not been studied in this regard. In primary triage at a major incident, severity of injury is of little relevance; rather, triage is aimed at prioritising the requirement for medical intervention. A patient with a minor head injury but an obstructed airway due to his position is of higher priority than a patient whose airway is intact, regardless of the severity of injury.
The use of the clinical interventions proposed by Garner et al12 as a marker of urgency of requirement for intervention helps to overcome the limitations of the ISS and NISS. Although they chose a limited range of interventions on which to base their analysis, their work is important in opening up this field for future research. The requirement for any of the clinical interventions that they proposed (modified slightly for children to reflect different fluid resuscitation strategies) may be used as a marker to indicate a patient who should be triaged as T1 by any triage algorithm. Although their work allows research in this field to begin to move away from the use of inappropriate scoring systems, the interventions proposed by Garner et al can still only be used to distinguish between those patients who are T1 (immediate) and those who are not. As with the use of ISS and NISS, further analysis of the ability of triage algorithms to identify T2 (urgent) and T3 (delayed) patients is impossible. Development of the use of clinical interventions as markers of T2 and T3 patients should be possible, and we are currently undertaking work in this regard.
Implications of the study
Either the Careflight or PTT should be adopted as the method of choice for the initial pre-hospital triage of paediatric patients in major incidents. Policymakers should decide which method to use, based on current knowledge, exposure, and the practicalities of each method for field use. We have not compared the practicalities or ease of use in this study. However, our experience suggests that there is little difference in terms of time to perform or training.
Unanswered questions and future research
Our study was unable to discriminate between T2 and T3 casualties, which is arguably as important as discriminating T1 casualties at the scene of an incident. In order to do this, additional criteria, such as the Garner criteria but with T2 and T3 outcomes, must be available. We are currently conducting a study to define exactly those criteria.
We have presented a comparison of the most commonly used major incident paediatric primary triage algorithms, and found that none of the tools have good sensitivity (the ability to identify seriously injured children), but all have excellent specificity (the ability to identify less seriously injured children). A more accurately designed triage algorithm is required. In the meantime, the use of START and JumpSTART for children cannot be recommended.
Competing interests: there are no competing interests.