Five-level emergency triage systems: variation in assessment of validity
  1. Akira Kuriyama1,2,
  2. Seigo Urushidani3,
  3. Takeo Nakayama1
  1. 1 Department of Health Informatics, Kyoto University School of Public Health, Kyoto, Japan
  2. 2 Department of General Medicine, Kurashiki Central Hospital, Kurashiki, Okayama, Japan
  3. 3 Department of Emergency Medicine, Kurashiki Central Hospital, Kurashiki, Okayama, Japan
  1. Correspondence to Dr Akira Kuriyama, Department of Health Informatics, Kyoto University School of Public Health, Yoshida-konoe-cho Sakyo-ku Kyoto 606-8501 Japan; akira.kuriyama.jpn{at}


Introduction Triage systems are scales developed to rate the degree of urgency among patients who arrive at EDs. A number of different scales are in use; however, the way in which they have been validated is inconsistent. Also, it is difficult to define a surrogate that accurately predicts urgency. This systematic review described reference standards and measures used in previous validation studies of five-level triage systems.

Methods We searched PubMed, EMBASE and CINAHL to identify studies that had assessed the validity of five-level triage systems and described the reference standards and measures applied in these studies. Studies were divided into those using criterion validity (reference standards developed by expert panels or triage systems already in use) and those using construct validity (prognosis, costs and resource use).

Results A total of 57 studies examined criterion and construct validity of 14 five-level triage systems. Criterion validity was examined by evaluating (1) agreement of the assigned degree of urgency with objective standard criteria (12 studies), (2) overtriage and undertriage (9 studies) and (3) sensitivity and specificity of triage systems (7 studies). Construct validity was examined by looking at (4) the associations between the assigned degree of urgency and measures gauged in EDs (48 studies) and (5) the associations between the assigned degree of urgency and measures gauged after hospitalisation (13 studies). In particular, among 46 validation studies of the most commonly used triage systems (Canadian Triage and Acuity Scale, Emergency Severity Index and Manchester Triage System), 13 and 39 studies examined criterion and construct validity, respectively.

Conclusion Previous studies applied various reference standards and measures to validate five-level triage systems. They either created their own reference standard or used a combination of severity/resource measures.

  • triage
  • triage systems
  • emergency departments
  • reference standards
  • measures
  • validity
  • criterion validity
  • construct validity
  • severity
  • urgency
  • systematic review.


Triage at EDs is a decision-making process that is applied to identify patients who require immediate attention to achieve optimal outcomes.1 Triage systems are scales developed primarily to categorise patients by urgency, separating those who need immediate intervention from those who do not, and to optimise ED resources so that they are directed to those who need immediate care.

Validating triage systems is essential because they can impact the outcomes of patients in need of immediate care. A challenge is to determine the appropriate reference standard and measure for validation that discriminate patients into a ‘true’ category of urgency. Validation studies of triage systems have mainly addressed criterion and construct validity. Criterion validity looks at the correlation of a scale with some external criterion of the disorder under study (reference standard).2 This is often the ‘gold standard’. Construct validity looks for the correlation in assessments obtained from several scales purported to measure the same construct (measures).2 3 In validating triage systems, the reference standard for criterion validity would be the ‘true’ urgency of patients, most likely determined by experts, while the measures for construct validity are severity-related variables such as mortality, admission, and the resources and time spent on patients. Criterion validity should be preferred in validation of scales, but given the lack of a convenient gold standard for ‘urgency’, Moll has suggested that the best proxy reference standard comprises prognostic markers, disease severity and case complexity.4 Furthermore, there is no consensus as to what are acceptable reference standards or measures in validating triage systems.

Therefore, we systematically reviewed the reference standards and measures used in published validation studies of triage systems to provide an understanding of the basis on which triage scales have been validated.


This study proceeded according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement for reporting systematic reviews.5 The protocol for this systematic review is registered in PROSPERO (CRD42015027653).

Search and selection of studies

The American College of Emergency Physicians/Emergency Nurses Association Five-Level Triage Task Force recommended the use of five-level triage scales as they generally showed better reliability than three-level or four-level ones.6 Travers et al showed that a five-level triage system was more reliable and discriminative than a three-level one.7 A growing trend towards the use of five-level triage systems was also noted.8 Among them, the Canadian Triage and Acuity Scale (CTAS), the Emergency Severity Index (ESI), the Manchester Triage System (MTS) and the Australasian Triage System (ATS) were the most frequently studied, and we first focused on these four. We also examined other five-level triage systems separately. We required studies to meet the following conditions: (1) they included patients who presented at EDs from all categories of triage scales, unless there was a pragmatic or ethical need to exclude some urgency categories; (2) they were of any design; (3) they were published in peer-reviewed journals; and (4) they explicitly tested the validity of one or more triage scales. Studies of any age group, and of ambulatory as well as transferred patients, were included. Studies were excluded if they used these scales as characteristics of the participants or explanatory variables, or if they focused on a limited spectrum of diseases and symptoms or populations classified in certain urgency categories without clear rationale. We also excluded reviews, editorials, letters, conference proceedings or abstracts and studies that focused solely on inter-rater reliability.

We searched PubMed, EMBASE and CINAHL for potentially eligible studies. We designed a sensitive search strategy as follows: ‘Canadian Triage and Acuity Scale’ OR ‘Emergency Severity Index’ OR ‘Manchester Triage Scale’ OR ‘Australasian Triage Scale’. Next, we searched PubMed for studies on other five-level triage systems with the following search terms: ‘Emergency Service, Hospital’[Mesh] AND ‘Triage’[Mesh]. There were no language restrictions. The last search date was 29 February 2016.

Two authors (AK and SU) independently screened abstracts to identify potentially eligible studies. The same authors then retrieved the full texts, independently assessed the eligibility of these studies, and screened the reference lists of the included studies. Any uncertainty about the eligibility of a study was resolved through discussion with the third author (TN).

Data extraction

Two authors (AK and SU) independently extracted the study characteristics (study design, country, number of study sites and triage scale applied), participant demographics (patient category by age and sample size) and the reference standards and measures that were used to evaluate the validity of the triage scales.


The reference standards and measures to validate the triage systems and how they were used in evaluating the validity are described. As the goal was to describe the range of standards and measures used to validate triage systems, and not evaluate the accuracy of that validation, assessment of risk for bias in each original study was waived.


Description of studies

Our search for studies on the four most studied triage systems yielded 998 articles (figure 1), of which 46 met inclusion criteria9–54 (table 1). Among them, 21 assessed ESI, 14 CTAS and 14 MTS. None evaluated the validity of ATS. Seven studies compared two or more scales: MTS and ESI,38 MTS, ESI and an informally structured triage system (ISS),34 CTAS and the Taiwan Triage and Acuity Scale (TTS),28 ESI and TTS,48 CTAS and ESI,41 MTS V.1 and 2,33 and MTS and a modified version of MTS.49 Twelve studies were from the USA,9 11 12 15 23 31 35–37 42 43 53 nine from Canada,10 13 18 20–22 41 47 seven from the Netherlands,32 34 38–40 49 54 three from Switzerland,24 25 50 two each from Brazil,30 52 Portugal44 45 and Taiwan,28 48 and one each from Andorra, Germany, Kuwait, Norway, Saudi Arabia, South Korea, Sweden and Spain. One was an international multicentre study that examined MTS.33 The median sample size was 1042 (range 50–550 940) patients, and a median of 1 site (range 1–12 sites) was studied. Adult, paediatric and geriatric populations were examined in 17, 20 and 4 articles, respectively. Six studies evaluated validity in both adult and paediatric populations, and 11 studies did not specify the populations. Eighteen studies were retrospective and 28 prospective observational.

Table 1

Characteristics of included studies on Australasian Triage System, Canadian Triage and Acuity Scale (CTAS), Emergency Severity Index (ESI) and Manchester Triage System (MTS)

Our search for other five-level triage systems yielded 3227 articles. Ten triage systems from 11 articles were finally included for analysis: Clinical GPS,55 Echelle Liégeoise d’Index de Sévérité à l’Admission,56 FRench Emergency Nurses Classification in Hospital scale,57 Hacettepe Emergency Triage System,58 Medical Emergency Triage and Treatment System,59 Netherlands Triage System,60 Pediatric Triage and Acuity Scale,61 Rapid Triage Score,62 Rapid Emergency Triage and Treatment System-Hospital Unit West63 and Soterion Rapid Triage System.64 65 All these systems were examined in single-centre studies, and 4 out of 11 were prospectively conducted55 58 60 62 (table 2).

Table 2

Characteristics of other five-level triage systems

Reference standards and measures

For studies using criterion validity, reference standards included degree of urgency with objective standard criteria developed by expert panels for their studies a priori, and other triage systems that had already been in use as a means of validating a triage system. For studies of construct validity, measures included patient prognosis, costs and resource use that were gauged during the ED stay and after hospitalisation (box).


Reference standards and measures used in the validation studies of five-level triage systems

Reference standards (criterion validity)

  • Objective standard criteria/urgency set by expert panels for their studies a priori

  • Existing emergency triage systems

  • Immediate life-saving interventions

Measures (construct validity)

  • Overall admissions

  • Admissions to intensive care or monitored units

  • ED length of stay

  • Costs in EDs

  • Number of resources used in EDs

  • Mortality in EDs

  • Leaving without being seen

  • Waiting times before examinations by physicians in EDs

  • Referrals to outpatients after the discharge from EDs

  • In-hospital mortality

  • Hospital length of stay

  • Costs after hospitalisation

  • Six-month survival

  • Sixty-day mortality

  • Thirty-day mortality

  • Ninety-day mortality

We identified five main outcomes for the validation of triage systems (table 3). For criterion validity, outcomes included (1) agreement of assigned degree of urgency with objective standard criteria set by expert panels for their studies a priori, (2) overtriage and undertriage and (3) sensitivity and specificity for defined reference standards. For construct validity, outcomes were (4) associations between assigned degree of urgency and patient prognosis, cost and resource use in EDs and (5) associations between assigned degree of urgency and patient prognosis and cost after hospitalisation.

Table 3

Approaches with reference standards and measures in the validation of Australasian Triage System, Canadian Triage and Acuity Scale (CTAS), Emergency Severity Index (ESI) and Manchester Triage System (MTS)

For studies on criterion validity, nine studies used a reference standard that was developed by investigators.2 14 29 32 34 39 40 46 49 Six of these were studies of MTS, and one each was of CTAS and ESI, respectively. One study evaluated MTS, ESI and ISS. The details of reference standards are shown in table 4.

Table 4

Reference standards developed by investigators for validating triage systems

Overtriage and undertriage of patients in ED were measured in eight studies (three examining MTS, two ESI, one CTAS, one comparing MTS and a modified version of MTS, and one comparing MTS, ESI and ISS).29 34 37 39 46 47 49 50 Overtriage and undertriage were defined in four simulation studies of vignettes and three prospective studies as the degree of urgency assigned by nurses being above or below that assigned by experts, respectively.29 34 39 46 47 49 50 Travers et al defined overtriage as patients who were rated ESI 1, 2 or 3 (acuity) but required <2 resources or those who were assessed as ESI 1 but who were not hospitalised, and undertriage as those who were assessed as ESI 4 or 5 (non-acuity) but received ≥2 resources or were hospitalised.37
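
The resource-based definition attributed to Travers et al above can be written as a small decision rule. The following is a minimal sketch, not the authors' implementation: the function name, the argument names and the "appropriate" label for patients who meet neither definition are assumptions introduced here for illustration.

```python
def classify_triage(esi_level: int, resources_used: int, hospitalised: bool) -> str:
    """Label one ED visit using the overtriage/undertriage rule described in the text.

    The label "appropriate" is an assumption of this sketch for visits that
    meet neither definition; the source text does not name that category.
    """
    if esi_level not in (1, 2, 3, 4, 5):
        raise ValueError("ESI level must be an integer from 1 to 5")

    # Overtriage: rated ESI 1, 2 or 3 but required fewer than 2 resources,
    # or rated ESI 1 but not hospitalised.
    if esi_level <= 3 and resources_used < 2:
        return "overtriage"
    if esi_level == 1 and not hospitalised:
        return "overtriage"

    # Undertriage: rated ESI 4 or 5 but received 2 or more resources
    # or was hospitalised.
    if esi_level >= 4 and (resources_used >= 2 or hospitalised):
        return "undertriage"

    return "appropriate"
```

For example, under these assumptions a patient rated ESI 3 who used one resource is labelled as overtriaged, and a patient rated ESI 5 who was hospitalised is labelled as undertriaged.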

The sensitivity and specificity of triage systems against reference standards were measured in seven studies (three each examining MTS and ESI, and one comparing MTS, ESI and ISS).9 31 32 34 39 40 46 47 The reference standards comprised patients receiving immediate life-saving intervention,9 31 each of five degrees of urgency,34 and high and low degrees of urgency32 39 40 46 that were predefined by investigators.
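
To make this calculation concrete, the sketch below computes sensitivity and specificity of a triage level against a binary reference standard, such as receipt of an immediate life-saving intervention. It is illustrative only: the function name, the record layout and the cut-off that treats levels 1 and 2 as "high urgency" are assumptions of this example, and the included studies may have used other cut-offs.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(
    records: Iterable[Tuple[int, bool]],
    high_urgency_max_level: int = 2,  # assumed cut-off, not taken from the source
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for (triage_level, reference_positive) pairs.

    A visit counts as test-positive when its triage level is at or below the
    cut-off (lower numbers mean higher urgency on five-level scales).
    """
    tp = fn = tn = fp = 0
    for level, reference_positive in records:
        test_positive = level <= high_urgency_max_level
        if reference_positive and test_positive:
            tp += 1
        elif reference_positive and not test_positive:
            fn += 1
        elif not reference_positive and not test_positive:
            tn += 1
        else:
            fp += 1

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative visit")

    return tp / (tp + fn), tn / (tn + fp)
```

A binary cut-off is used here because most of the cited reference standards were dichotomous (high versus low urgency, or intervention versus none); a per-level analysis would need one such table for each degree of urgency.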

For studies on construct validity, the association between assigned degree of patient urgency and one or more measures gauged during the ED stay and after hospitalisation was examined. Admissions from the ED were most commonly measured, with 33 studies using this outcome.9 11–15 17–28 32–34 36–38 42–46 48 51 Other measures gauged during the ED stay included admissions to intensive care or monitored units,19 20 22 24 26 27 35 50 mortality in the ED (n=9),26 27 36 38 42–45 50 patients leaving without being attended,16 20 46 duration of wait before being examined by a physician, referrals to outpatients after discharge from the ED, Worthing Physiological Scoring System scores,54 length of stay (LOS) in the ED11–13 15 16 19 20 22–26 28 35 37 42 43 46 48 and costs incurred in EDs. The measure in 17 studies was the amount of resources used in EDs.9 11 13 15 17 21–25 32 34 35 37 41 42 46 51 A few of these studies specifically counted specialist consultations, necessity for monitoring, diagnostic procedures (electrocardiography, laboratory examinations, diagnostic imaging, blood cultures and invasive diagnostic tests)44 and treatment (intravenous fluids, transfusions, mechanical ventilation, inhalers and life-saving interventions). Measures gauged after admission included in-hospital mortality rates,19 24 25 27 36 41 52 hospital LOS,19 24 27 35 50 52 30-day50 and 6-month survival,53 and costs incurred after admission.
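
One simple way to look at such an association is to tabulate the admission rate within each triage level and check whether it falls as urgency decreases. The sketch below does only that; it is an assumption-laden illustration, not a method reported by the included studies, and the function names and the `visits` data are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def admission_rate_by_level(records: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Return the proportion of admitted visits for each triage level present."""
    counts = defaultdict(lambda: [0, 0])  # level -> [admitted, total]
    for level, admitted in records:
        counts[level][0] += int(admitted)
        counts[level][1] += 1
    return {level: adm / total for level, (adm, total) in sorted(counts.items())}


def is_non_increasing(rates: Dict[int, float]) -> bool:
    """True if the rate never rises when moving from level 1 towards level 5."""
    values = [rates[level] for level in sorted(rates)]
    return all(a >= b for a, b in zip(values, values[1:]))


# Invented example data: (triage_level, admitted)
visits = [
    (1, True), (1, True),
    (2, True), (2, False),
    (3, False), (3, True), (3, False),
    (4, False),
    (5, False),
]
rates = admission_rate_by_level(visits)
```

A monotonic check of this kind is only descriptive; studies of this type would normally report a formal test of trend or a regression model, which this sketch does not attempt.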

Other five-level triage systems

Reference standards and measures used to validate triage systems were similar to those for the four triage systems (table 5). Three and nine studies used criterion and construct validity, respectively. Of note, some studies used other triage systems (CTAS and ESI) that had already been in use, as the reference standards.55 58 62

Table 5

Approaches with reference standards and measures in the validation of other five-level triage systems


We found that a variety of reference standards and measures were used across the 57 validation studies. Overall, construct validity (51 studies) was more frequently examined than criterion validity (14 studies). In particular, validation studies of ESI (5 of 21 studies) and MTS (7 of 14 studies) more frequently used a form of criterion validity compared with CTAS (2 of 14 studies). Validation studies of these three triage systems commonly used some form of construct validity: CTAS (13 studies), ESI (20 studies) and MTS (10 studies).

Criterion validity should be preferred in validation of scales. In validating triage systems, true ‘urgency’ of patients should serve as the reference standard, because triage systems rank the speed of care for a patient, namely, urgency. However, our study found that many validation studies focused on severity or construct validity, likely due to the lack of established criterion validity in triage system research.66 For criterion validity, reference standards were mostly the studies’ own criteria, which were either urgency alone or a combination of urgency and severity, determined by investigators. Other triage systems that are already in use could also serve as reference standards for a newly introduced triage system, which reduces the variability of reference standards.28 34

Triage scales could be considered a type of decision rule, in which the goal of the rule is to predict the need for immediate care. However, unlike decision rules, they are consensus-based and lack the typical multistep data gathering and statistical processes resulting in derivation, and then prospective validation on a unique population. Moll suggested a four-step approach to validate a triage system4: consensus-based derivation of decision rules for different degrees of urgency, validation of a system with a reference standard as the best proxy for prognosis in a single setting (internal validation), modification of triage rules and validation in various emergency care settings (external validation). For studies evaluating criterion validity, most reference standards or criteria developed by the investigators for their validation studies followed Moll’s framework of reference standards or the best ‘proxy’ based on the information of urgency and severity. We speculate that the investigators of studies assessing criterion validity needed to establish reference standards based on both urgency and severity because urgency is hard to determine or predict based on a limited amount of information gained during the triage.

All validation studies are currently subject to the limitations described above due to the absence of a perfect reference standard for urgency. Clinicians need to know that the validity of triage systems has not been perfectly determined and that their weaknesses remain obscure. Bearing this in mind, clinicians need to triage patients using the available triage systems.

The present study has some limitations. First, it was designed simply to review and describe the methodologies used in validation studies of emergency triage systems without the intent to suggest the most appropriate reference standard or measure for a validation study. Second, our study focused on five-level triage systems. We might have missed other methodology as well as reference standards and measures used to examine other triage systems. Despite these limitations, we summarised the reference standards and measures used in validation studies of triage systems, and described their drawbacks and advantages. This information should provide an important rationale for future validation studies of triage systems.


The most commonly used triage systems have been validated using both criterion and construct validity. The difficulty in defining a surrogate for urgency means that studies must either create their own reference standard (often an expert panel) or use a combination of severity/resource measures which approximate but are not the same as urgency. Given that the limitations of validation studies are not completely understood and given the potential flaws of triage systems, future studies should attempt to elucidate the weaknesses of triage systems in terms of presenting signs and symptoms and the characteristics of patients.

Supplementary Material

Supplementary Figure 1



  • Contributors AK and TN conceived the study design and interpreted the data. AK and SU acquired the data. AK analysed the data and drafted the manuscript. All authors critically revised and approved the submission of the manuscript.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.
