Aim To present a systematic review on the validity of triage systems for paediatric emergency care.
Methods We searched MEDLINE, Cochrane Library, Latin American and Caribbean Health Sciences Literature (LILACS), Scientific Electronic Library Online (SciELO), Nursing Database Index (BDENF) and Spanish Health Sciences Bibliographic Index (IBECS) for articles in English, French, Portuguese or Spanish with no time limit. Validity studies of five-level triage systems for patients 0–18 years old were included. Two reviewers performed data extraction and quality assessment as recommended by the PRISMA statement.
Results We found 25 studies on seven triage systems: Manchester Triage System (MTS); paediatric version of Canadian Triage and Acuity Scale (PedCTAS) and its adaptation for Taiwan (paediatric version of the Taiwan Triage and Acuity System); Emergency Severity Index version 4 (ESI v.4); Soterion Rapid Triage System; and South African Triage Scale and its adaptation for Botswana (Princess Marina Triage Scale). Only studies on the MTS used a reference standard for urgency, while all systems were evaluated using a proxy outcome for urgency such as admission. Almost half of all studies were low quality. The MTS, PedCTAS and ESI v.4 presented the largest number of moderate and high quality studies. The three tools performed better in their countries or near them, showing a consistent association with hospitalisation and resource utilisation. Studies of all three tools found that patients at the lowest urgency levels were hospitalised, reflecting undertriage.
Conclusions There is some evidence to corroborate the validity of the MTS, PedCTAS and ESI v.4 for paediatric emergency care in their own countries or near them. Efforts to improve the sensitivity and to minimise the undertriage rates should continue. Cross-cultural adaptation is necessary when adopting these triage systems in other countries.
The purpose of an emergency triage system is to establish a safe and effective hierarchy of care, based on clinical risk, by prioritising the more urgent cases.1 Paediatric triage is a complex task, which presents many challenges to the triage team due to communication difficulties with young children and their parents and high variability over a wide range of factors within each age group, such as physiological parameters, epidemiology and clinical presentation of various diseases.2
The most widely used triage systems are the Australasian Triage Scale (ATS),3 4 the Canadian Triage and Acuity Scale (CTAS),5 6 the Manchester Triage System (MTS)7 and the Emergency Severity Index (ESI)8 9 developed in the USA. They are complex five-level triage systems, which have demonstrated better validity when compared with three-level systems.10 Studies on these triage systems have been performed predominantly in their respective home countries and in the adult population. The South African Triage Scale (SATS) is a more recent and less complex scale, developed in an emerging country, but there are few studies on its effectiveness, particularly in the paediatric population.11 12
According to the American College of Emergency Physicians (ACEP) and the Emergency Nurses Association (ENA), the ideal triage scale must demonstrate the characteristics of reliability, validity, utility and relevance.13 The triage process must be easily understood, rapidly applied, have high rates of inter-observer agreement, facilitate appropriate placement, correlate with ED resource use requirements and predict clinical outcomes, including severity of illness and mortality rate.13
The validity of triage systems depends on their ability to discriminate different levels of urgency. Criterion validity, that is, comparison with a reference standard, which is the preferred method for validating diagnostic tests, is a challenge as there is no reference standard for ‘urgency’.14 The literature evaluating the validity of triage systems has relied on one of two methods: (1) the comparison of the performance of the triage system with a reference standard developed by experts (an approximation of criterion validity) (see online supplementary files 1A, 1B); (2) the association of levels of urgency with proxy outcome variables for urgency, mainly hospitalisation, resource utilisation and length of stay in the ED.15 The expert-developed reference standard uses data such as clinical picture and vital signs at presentation and outcomes related to diagnostic tests performed, treatment received and patient’s destination, to determine retrospectively the ‘true’ urgency level of the patient, to be compared with the level assigned by the triage system. Several combinations of outcomes may be associated with some levels of urgency.16
Recent reviews suggest that there are many gaps regarding the validity of triage systems, particularly in the paediatric population.15 17–20 The aim of this study was to perform a systematic review on the validity of triage systems for paediatric emergency care, assessed by either an expert developed reference standard or the association with proxy outcomes.
Search of literature
From July 2014 to September 2015, we searched for original articles, systematic reviews, government and medical society documents in several databases (MEDLINE, Cochrane Library, Latin American and Caribbean Health Sciences Literature (LILACS), Scientific Electronic Library Online (SciELO), Nursing Database Index (BDENF), Spanish Health Sciences Bibliographic Index (IBECS)), in the reference lists of selected articles, and in Google and Google Scholar. The search included articles published in English, French, Spanish or Portuguese with no time restriction. The concepts used were emergency department, child, triage and validity or reliability, according to the PICO strategy (PRISMA guidelines).21 We added the name of each triage system found in the first step to broaden the search (online supplementary file 2A).
Selection of studies
Two reviewers (MCMB and APB) selected the articles, based on the inclusion and exclusion criteria. Both researchers are paediatricians with extensive experience in paediatric emergency care. Instances of disagreement were discussed until consensus was reached.
Inclusion and exclusion criteria
We initially selected original articles on the validity and reliability of five-level triage instruments applied to the general paediatric population or paediatric subgroups, aged 0–18 years, who were triaged in a hospital ED. Because of size limitations, we split the resultant material into two groups: studies with validity assessment to be included in the present review and studies with reliability assessment to be included in another review. For this review on validity, we included prospective or retrospective studies with two different designs: (1) those comparing levels of urgency assigned by triage systems to a reference standard developed by experts; (2) those assessing the association of urgency levels with proxy outcomes of urgency, such as resource utilisation, hospitalisation, admission to the paediatric intensive care unit, ED length of stay (LOS) or severe bacterial infection.
We excluded studies of prehospital care, mass casualty events or telephone triage.
Data extraction and quality assessment
Two reviewers (MCMB and JRR), both with a PhD in epidemiology, independently performed data extraction and drew up, by consensus, a list of items to evaluate the methodological quality of the articles (see online supplementary file 2B). The items were based on three tools: Quality Assessment of Diagnostic Accuracy Studies (QUADAS),22 Statement for Reporting of Diagnostic Accuracy (STARD)23 and an instrument developed by Hayden et al for prognostic studies.24 The two reviewers independently classified the risk of bias related to participants, attrition, measurement, outcome and statistical analysis, into high, uncertain or low categories, as recommended by QUADAS. They also rated the quality of the discussion section as good, moderate or poor. Agreement between the two reviewers for each type of bias and the discussion assessment was estimated by the quadratic-weighted kappa (kw2). Instances of disagreement were resolved by consensus. We classified the methodological quality of each study as low, moderate or high according to the amount of risk of bias.
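The agreement statistic described above can be illustrated with a minimal sketch of Cohen's kappa with quadratic weights. This is not the Stata routine used in the review; the function name, the ordering of the categories (low < uncertain < high) and the example ratings are assumptions made for illustration only.

```python
def quadratic_weighted_kappa(rater_a, rater_b, categories):
    """Cohen's kappa with quadratic weights for two raters on an ordinal scale.

    `categories` lists the ordinal categories from lowest to highest.
    The ordering passed in is an assumption of the caller.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same, non-empty set of items")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Cross-tabulate the two raters' classifications.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed and chance-expected disagreement.
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement weight
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical risk-of-bias ratings for four articles (illustrative data only).
scale = ["low", "uncertain", "high"]
reviewer_1 = ["low", "uncertain", "high", "low"]
reviewer_2 = ["low", "uncertain", "high", "high"]
kappa = quadratic_weighted_kappa(reviewer_1, reviewer_2, scale)
```

With quadratic weights, a disagreement of two categories (low versus high) is penalised four times as heavily as a disagreement of one category, which suits ordinal ratings such as risk of bias.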
In this review, we did not try to pool data from studies of the same triage systems, because there was great heterogeneity in the sampling, methods of validation and definitions of outcome variables.
We used the statistical software Stata V.12.0 (Stata, Texas, USA).
The search strategy located 25 articles on five original and two adapted triage systems for paediatric emergency care: MTS (n=9); the paediatric version of CTAS (PedCTAS) (n=8); the paediatric version of the Taiwan Triage and Acuity System (PedTTAS), an adaptation of the PedCTAS (n=1); ESI (n=4); Soterion Rapid Triage System (SRTS) (n=1); SATS (n=1) and Princess Marina Triage Scale (PATS), an adaptation of the SATS (n=1) (see online supplementary file 2A, figure 1, table 1).
Agreement between the two reviewers for the risk of the various types of bias varied from a kw2 of 0.848 (95% CI 0.722 to 0.965) to 0.573 (95% CI 0.242 to 0.667) (see online supplementary file 2C). The validity assessments were rated high quality in two articles, moderate quality in 11 and low quality in 12 (table 2).
There were 14 retrospective and 11 prospective observational studies assessing validity (table 1). Most of them used proportional sampling, including the five levels of urgency with the same frequency as they occurred in the source population. Two studies used disproportionate sampling to ensure a minimum number of patients in the most urgent levels or the same number of patients in all levels of urgency (table 3). The detailed characteristics of these studies are summarised in the online supplementary file 2D.
Table 3 includes only studies that used the reference standard method and reported estimates of sensitivity, specificity, overtriage and undertriage rates. Table 4 presents results for studies that used proxy outcomes and reported estimates of sensitivity, specificity, overtriage and undertriage (for the proxy outcome), while table 5 covers studies that used proxy outcomes but reported the frequency of clinical outcomes at each level of urgency. Two studies were therefore included twice: Roukema et al 16 used both methods of validation (tables 3 and 4) and Travers et al (2009) reported both types of estimate (tables 4 and 5).
Five studies performed in the Netherlands compared the MTS to an expert-developed reference standard to assess the ability of the MTS to detect high urgency cases (levels 1 and 2) (table 3). The original MTS presented a moderate sensitivity of 63% and high overtriage rates (40%–54%).16 37 A modified version of the MTS increased the specificity from 78%–79% to 87%, but not the sensitivity (64%), resulting in a reduction of overtriage (47%) without a parallel increase in undertriage (15%).39 The other two studies assessed subgroups of patients. One showed an undertriage rate of 2% in level 1 and level 2 patients.38 The other showed poorer sensitivity (58% vs 74%) and a higher undertriage rate (17% vs 11%) in patients with symptoms of infection and chronic disease compared with those without chronic disease.42
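To make the four estimates reported for the reference standard studies concrete, the following minimal sketch derives them from paired triage and reference levels. The definitions used here are assumptions for illustration (high urgency = levels 1 and 2; overtriage = assigned level more urgent than the reference; undertriage = assigned level less urgent than the reference); the reviewed studies did not all use the same definitions, and the data are hypothetical.

```python
def triage_accuracy(pairs, high_urgency=(1, 2)):
    """Summarise agreement between assigned and reference urgency levels.

    `pairs` is an iterable of (assigned_level, reference_level) tuples,
    where level 1 is the most urgent and level 5 the least urgent.
    Assumes at least one high urgency and one low urgency reference case.
    """
    tp = fp = fn = tn = over = under = total = 0
    for assigned, reference in pairs:
        total += 1
        assigned_high = assigned in high_urgency
        reference_high = reference in high_urgency
        if reference_high and assigned_high:
            tp += 1
        elif reference_high:
            fn += 1
        elif assigned_high:
            fp += 1
        else:
            tn += 1
        # A lower number means a more urgent level.
        if assigned < reference:
            over += 1
        elif assigned > reference:
            under += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need both high and low urgency reference cases")
    return {
        "sensitivity": tp / (tp + fn),   # high urgency cases detected
        "specificity": tn / (tn + fp),   # low urgency cases correctly not flagged
        "overtriage": over / total,      # assigned more urgent than reference
        "undertriage": under / total,    # assigned less urgent than reference
    }


# Hypothetical (assigned, reference) pairs for eight patients.
example_pairs = [(1, 1), (2, 1), (3, 2), (2, 3), (4, 4), (5, 4), (3, 3), (4, 5)]
metrics = triage_accuracy(example_pairs)
```

Under these assumed definitions, sensitivity and specificity depend only on the high versus low urgency split, whereas overtriage and undertriage count every shift across the five levels, which is why a tool can show moderate specificity and a high overtriage rate at the same time.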
Twenty-one studies assessed the association of levels of urgency with one or more proxy outcomes of urgency (tables 4 and 5). Seven of these studies assessed one outcome, such as hospitalisation or severe bacterial infection (one study) and reported sensitivity, specificity, over/undertriage rates or other estimates to predict the outcome (table 4). These results could not be compared with the results of the MTS studies in table 3, because the definitions they are based on were completely different.
Fifteen of those 21 studies assessed the frequency of at least two of the following three outcomes (hospitalisation rates, resource utilisation and LOS) across the five triage levels (table 5). In the first nine validity studies in table 5, triage systems were assessed in the countries where they were developed; in the last six validity studies, triage systems were assessed outside their own countries. Level 1 (immediate urgency) represented less than 1% of the visits, while levels 3 (urgent) and 4 (low urgency) together contributed around 70%–90%, in most of the studies which used proportionate sampling. The distribution of urgency levels was more similar among the Canadian and US studies than among studies in other countries. In most of these studies, the frequency of hospitalisation decreased from the higher to the lower level of urgency. The decreasing gradient was more evident in studies performed in the countries where the triage systems were developed, such as Canada (PedCTAS) and the USA (ESI v.4). The combined frequency of hospital admission in levels 4 and 5 with the PedCTAS varied from 2.6% to 4% in Canadian studies and from 1.5% to 25% in other countries. In the ESI v.4 studies, it varied from 1.8% to 6% in US studies and was 3.3% in the only study performed in Iran. The two MTS studies showed a combined hospital admission rate in levels 4 and 5 of 0.9% in the Netherlands and 5% in Spain.
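The proxy outcome analysis summarised above can be sketched as follows: compute the outcome frequency at each triage level, check whether it decreases from level 1 to level 5, and pool levels 4 and 5 to obtain the combined admission frequency. The function names and the counts are hypothetical and are not taken from any of the reviewed studies.

```python
def rates_by_level(counts):
    """Outcome frequency per triage level from {level: (events, visits)}."""
    return {level: events / visits for level, (events, visits) in sorted(counts.items())}


def combined_rate(counts, levels=(4, 5)):
    """Pooled outcome frequency across several triage levels."""
    events = sum(counts[level][0] for level in levels)
    visits = sum(counts[level][1] for level in levels)
    return events / visits


def has_decreasing_gradient(rates):
    """True if the frequency never increases from level 1 to level 5."""
    ordered = [rates[level] for level in sorted(rates)]
    return all(a >= b for a, b in zip(ordered, ordered[1:]))


# Hypothetical (admissions, visits) per triage level; illustrative data only.
example_counts = {1: (8, 10), 2: (30, 100), 3: (50, 500), 4: (12, 400), 5: (0, 100)}

admission_rates = rates_by_level(example_counts)
low_urgency_admission = combined_rate(example_counts)       # levels 4 and 5 pooled
gradient_ok = has_decreasing_gradient(admission_rates)
```

Pooling the counts, rather than averaging the two level-specific rates, weights each level by its number of visits, which matters because level 4 usually contributes far more visits than level 5.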
Despite different definitions and cut-off points used for the outcome ‘resource utilisation’ (diagnostic and therapeutic resources or only laboratory or radiological tests or hospital costs), most studies showed decreasing frequency from the highest to the lowest level of urgency. Again, this gradient was clearer in the Canadian and US studies (table 5).
LOS did not show a consistently decreasing gradient across the five levels of urgency. The LOS for level 1 was shorter than for level 2 in two PedCTAS studies in Canada and two ESI v.4 studies in the USA, one of which also had a shorter LOS for level 2 than for level 3 (table 5).
The present review found 25 validity studies on five original and two adapted triage systems in paediatric emergency care. Five studies (all MTS) used an expert-developed reference standard, while 21 studies involving all triage systems (MTS, PedCTAS, PedTTAS, ESI v.4, SATS, PATS and SRTS) used proxy outcomes of urgency. The MTS, the PedCTAS and the ESI v.4 were the triage systems with the largest number of moderate or high quality studies. There were no studies on the validity of the ATS in the paediatric population and only a few, low quality studies of the SATS and the SRTS.
The use of a reference standard seems to be advantageous: because it establishes objective criteria to define each level of urgency, it ensures a more robust assessment of the ‘true’ urgency of patients and more consistent comparison between studies. However, there is no scientific evidence of the validity of this reference standard.14 Besides, the criteria used to define the levels of urgency include clinical outcomes similar to the proxy outcomes of urgency used in the other type of study. The difference is that, with a reference standard, some levels of urgency are defined by many possible combinations of these outcomes. These outcomes are satisfactory markers of complexity and severity, but they do not always account for all sets of urgency. Moreover, they can be influenced by variables related to the quality and efficacy of the treatment given.14 15 Twomey et al suggested a Delphi process to achieve consensus among experts to serve as a reference standard in validity studies. This could eliminate the limitations associated with proxies of urgency and the biases inherent to individual groups of specialists.14
The evidence on the validity of triage systems in paediatric patients is better for the PedCTAS, the ESI v.4 and the MTS, but remains insufficient. The three triage systems showed unacceptably high rates of hospital admission in the less urgent levels in several studies, suggesting undertriage. The MTS is the most intensively studied triage system in the paediatric population, but the sensitivity to detect high urgency was modest, despite an elevated overtriage rate and a moderate undertriage rate. Although there are no recommendations about the safe limits of sensitivity, undertriage and overtriage rates for emergency triage systems, an effective screening tool is expected to prioritise sensitivity and a low undertriage rate. On the other hand, a high overtriage rate might affect the flow of patients, ultimately compromising the care of the most urgent patients. An ideal triage system must balance safety and accuracy.9 These findings raise questions about the safety of triage systems, especially if used to divert the least urgent patients to outpatient care.
Although the PedCTAS and ESI v.4 consistently predicted hospital admission and resource utilisation in the countries where they were developed, their performance in other countries such as Spain, Iran and Taiwan was lower. This can be seen in table 5, where the frequencies of the five levels of urgency in the study populations of the Canadian and US studies were very similar, while in other countries these frequencies were more heterogeneous. We could not determine if this heterogeneity was due to actual differences in the characteristics of the study populations or differences in the knowledge and training of the healthcare professionals, which may have contributed to misclassification and lower performance of the instruments in those countries. Indeed, the low methodological quality of some studies may have accounted for the inconsistency of the outcomes observed. Therefore, caution is necessary when applying inferences from studies performed in the countries where the triage systems were developed to other countries with highly diverse healthcare contexts. A myriad of factors, including the morbidity and mortality features of the target population, the quality and amount of technical and human resources, the professional training and skills, sociocultural factors and health policies, among others, may play a role.14
ED LOS did not consistently decrease across the five levels of urgency in the six studies that analysed this outcome. This should not be surprising as less urgent patients have longer waiting times,13 while it is important to promptly admit the most urgent patients. Furthermore, mean LOS may be distorted by aberrant values. The use of this outcome should be avoided in future studies.
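The point about aberrant values can be shown with a short worked example comparing the mean and the median LOS when a single visit is unusually long. The LOS values below are hypothetical, and the median is shown only as one common robust alternative, not as a recommendation drawn from the reviewed studies.

```python
from statistics import mean, median

# Hypothetical ED length of stay in minutes for six patients at one triage level.
# The last patient stayed 24 hours, for example while waiting for an inpatient bed.
los_minutes = [60, 75, 90, 80, 70, 1440]

mean_los = mean(los_minutes)      # pulled upwards by the single long stay
median_los = median(los_minutes)  # reflects the typical patient
```

Here the mean is 302.5 minutes while the median is 77.5 minutes, so one outlying visit is enough to make a level appear to have a much longer LOS than is typical for its patients.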
This review has some limitations. First, almost half of the studies were rated low quality, especially those performed in countries distant from the country where the triage systems were developed. However, as the main triage systems have been adopted in many countries around the world, these studies were purposely included to give an idea of the amount and quality of evidence of the validity of each triage system, and how safe it is to generalise the results to other countries. Second, there was great heterogeneity among the studies, even those that used the same method of validation. Different cut-off points and definitions were used to assess outcomes such as hospitalisation, resource utilisation, ED LOS, sensitivity, specificity, undertriage and overtriage rates. These differences precluded pooling the data to give the reader summary statistics, as well as comparing performance among different triage systems. Third, we could not include the reliability assessment of triage systems because the review would be too extensive. Good reproducibility with high interobserver reliability reinforces the validity of the instruments.13
In conclusion, there is some evidence to corroborate the validity of the MTS, PedCTAS and ESI v.4 for paediatric emergency care, particularly in or near the countries where these instruments were developed. However, further efforts are needed to decrease the undertriage rates of the three tools to ensure safety. Diligent cross-cultural adaptation and rigorous training followed by local validity and reliability studies are necessary when adopting these triage systems for paediatric emergency care in countries with different socioeconomic and cultural contexts. Finally, consensus among experts from different countries on the best methods and outcome definitions for validity studies of triage systems would be very useful to enable comparison of results.
Contributors MCMB conceived and designed the study, contributed to obtaining research funding, participated as the first reviewer in the literature search, the selection of the articles and the extraction and analysis of data, drafted the initial manuscript and approved the final manuscript as submitted. APB obtained research funding, participated as a second reviewer in the selection of the articles, reviewed and revised the manuscript and approved the final manuscript as submitted. JRR contributed to obtaining funding, participated as a second reviewer in the extraction of data, in the design of the quality assessment instruments and in the assessment of the risk of bias of the selected articles, and approved the final manuscript as submitted. CdSL supervised the conduct of the study, reviewed and revised the manuscript and approved the final manuscript as submitted.
Funding Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) – 448855/2014-3; Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ) – E-26-200.991-2015.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Data extraction files and spreadsheets with the quality assessment of the articles analysed may be obtained on request from the corresponding author (firstname.lastname@example.org).