Objectives To measure the reliability and predictive validity of a four-level triage system (I-4L).
Methods This observational study was conducted in an urban hospital. Five nurses were randomly selected to assign a triage level to 246 paper scenarios, using the I-4L model. The I-4L model is a four-level triage system: urgency category (UC) 1 requires immediate response; UCs 2, 3 and 4 require assessment within 20, 60 and 120 min, respectively. Weighted κ statistics were used to measure the inter-rater and intrarater reliability of the triage tool and the validity of the model was assessed based on the accuracy in predicting admission and in predicting a reference standard's triage code.
Results The I-4L model's inter-rater reliability was κ=0.73 (95% CI 0.67 to 0.79), and the intrarater reliability was κ=0.82 (95% CI 0.67 to 0.96). Its accuracy of triage rating for admission and for prediction of a reference standard's triage code was good: 79% (95% CI 73% to 86%) and 93% (95% CI 89% to 96%), respectively. The percentages of patients admitted per triage level using the I-4L model were: 100% UC 1; 42% UC 2; 6% UC 3; and 2% UC 4.
Conclusions The I-4L triage model shows a good inter-rater and intrarater reliability for rating triage acuity and for accuracy in patient admission and prediction of a reference standard's triage code.
- Clinical assessment
- emergency care systems
- emergency department
- Italian emergency triage
- predictive validity
Triage is the first assessment and sorting process used to prioritise patients arriving in the emergency department (ED). The most common triage systems are traffic director, spot-check and comprehensive triage.1 Most current triage tools are based on a categorical acuity scale with three, four or five levels. The Cape Triage Score (CTS)2 is a four-level triage system. The Australasian Triage Scale,3 the Canadian Triage and Acuity Scale (CTAS),4 the Manchester Triage System (MTS)5 and the Emergency Severity Index (ESI)6–9 are all five-level triage tools. Italian guidelines require a four-level in-hospital triage based on an acuity scale measurement.10 Consequently, we devised a four-level triage system (I-4L) based on 23 flowcharts (contained in a 70-page manual) depending on the patient's complaint. The I-4L triage system has been used in our ED since 2001 but it had not been validated before our study.
Health measurement tools should be valid and reliable.11 To our knowledge, there are no data regarding the validity and reliability of Italian four-level triage systems and very few studies assessing these characteristics in other triage tools.6–8 12–15
The aim of our study was to measure the reliability and predictive validity of the I-4L triage system used in our ED.
Study design and setting
This observational study was performed at a large urban medical centre with ∼65 000 ED visits annually, and an overall ED hospital admission rate of 14%.
In our ED, 15 nurses carry out a comprehensive triage using the I-4L system developed by our Triage Working Group based on Italian guidelines.10 The I-4L has four urgency categories (UCs): UC 1, immediate response; UCs 2, 3 and 4, assessment within 20, 60 and 120 min, respectively.
We created paper triage scenarios with the medical records of patients admitted to our ED during 2 weeks in October 2006. We recorded the following data for 252 patients (18 randomly selected patients each day): demographic and clinical characteristics, nurse triage category, admission status and site, and the data on triage forms completed by the nurse (presenting complaint, mode and time of arrival, past diseases, vital signs and pain score). Each case included the patient's age and gender, presenting complaint, a brief case scenario with mode and time of arrival, past diseases, vital signs and pain score.
Exclusion criteria were: (1) incomplete demographic and clinical data in the triage scenarios (six scenarios were excluded, thus leaving 246 triage scenarios); and (2) absence of code assignment by the nurses (no code assignment was missing). Thus, 246 triage scenarios were included in the final analysis.
Five nurses from our ED were assigned to undergo a 5 h refresher course in the I-4L. They were selected by their managers from among nurses willing to participate in the project. According to previous studies,8 16 a panel of three triage experts—two nurses and one senior clinician, who had emergency teaching triage certification and >15 years emergency nursing and care experience—independently assigned triage scores to the 246 scenarios. They used the I-4L method and participated in refresher training. They were also blinded to the triage category assigned by the original triage nurse and by the nurses involved in this study. Their triage scores were the reference standard for the triage level in this study.
The nurses enrolled in the study completed a questionnaire related to their demographics, education and work experience.
After completion of the refresher course, each nurse independently assigned triage scores to the 246 scenarios, at time zero and 6 months later. To prevent communication between participants, the group assigned triage codes on the same day, with each nurse in a different room, and in the presence of the investigators. The triage scenarios were given randomly to the participants. They could consult the I-4L triage methods (the manual for I-4L) and they had a maximum of 3 h for the rating. The assignment of triage codes was repeated 6 months later, in the same way, without a refresher course. The data were collected and entered into a spreadsheet by an investigator who was blind to the aim of the study. The nurse group remained concealed during data entry and analysis.
The triage scores of the panel of triage experts were the reference standard for the triage level in this study. We tested the inter-rater reliability within the panel of triage experts by measuring the weighted κ (K).
We calculated inter-rater and intrarater reliability in the group of nurses and assessed the validity of the triage model. Reliability was measured with K by comparing the ratings between nurses (inter-rater) and by comparing each nurse's ratings at time 0 and after 6 months (intrarater). We also measured the inter-rater reliability between the group and its reference standard by calculating the K value between the mode of the urgency category assigned by the nurses of the group and the mode of the scores assigned to scenarios by the triage expert panel, our reference standard.
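The weighted κ comparison described above can be sketched for a single pair of raters. This is a minimal illustration, not the study's analysis: the study used STATA, and the text does not state the weighting scheme or how pairwise values were combined across the five nurses. Linear disagreement weights and the function name `weighted_kappa` are assumptions made here for illustration.

```python
def weighted_kappa(rater_a, rater_b, categories=(1, 2, 3, 4)):
    """Weighted kappa for two raters on an ordinal scale.

    Assumption: linear disagreement weights |i - j| / (k - 1);
    the source does not say which weights were used.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of scenarios")

    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(rater_a)

    # Joint proportions: observed[i][j] is the share of scenarios that
    # rater A placed in category i and rater B placed in category j.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_disagreement += weight * observed[i][j]
            # Expected by chance if the two raters were independent.
            expected_disagreement += weight * row_totals[i] * col_totals[j]

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return 1.0 - observed_disagreement / expected_disagreement
```

For intrarater reliability the same function would take one nurse's ratings at time 0 and at 6 months as the two inputs.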
Moreover, we calculated the I-4L's sensitivity, specificity and accuracy to predict the reference standard's triage score.
To analyse the predictive validity for patient admission and for the reference standard's triage score, for each scenario we considered the mode of the UC assigned by the nurses and we used this code in all validity calculations. We evaluated the validity of the I-4L triage system by calculating sensitivity and specificity for prediction of patient admission and of the reference standard's triage score, using the following cut-offs: true codes 1 and 2=patient sick and likely to be admitted; true codes 3 and 4=less urgency and patient likely to be discharged. We calculated sample size according to Worster et al,13 anticipating a K value of ∼0.8 from previous studies and an SE of 0.05. Statistical significance was tested at an α level=0.05. We used the STATA v 9.2 software (StataCorp, College Station, Texas, USA) for statistical analysis. Being a quality assurance investigation, the study was exempt from formal review. The patients and nurses involved in the study gave permission to access their data.
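The validity calculation described above (modal nurse code per scenario, then a two-way split at codes 1 and 2 versus codes 3 and 4) can be sketched as follows. This is an illustrative sketch only. The function names are hypothetical, and the tie-breaking rule for the mode (the more urgent code wins) is an assumption, since the text does not say how ties among the five nurses were resolved.

```python
from collections import Counter


def modal_category(ratings):
    """Most frequent urgency category among the nurses for one scenario.

    Assumption: ties are broken towards the more urgent (lower) code.
    """
    counts = Counter(ratings)
    highest = max(counts.values())
    return min(code for code, count in counts.items() if count == highest)


def diagnostic_summary(modal_codes, outcome_positive):
    """Sensitivity, specificity and accuracy of 'code 1 or 2' as a test.

    outcome_positive holds one boolean per scenario: True if the patient
    was admitted (or if the reference standard assigned code 1 or 2).
    """
    tp = fp = tn = fn = 0
    for code, positive in zip(modal_codes, outcome_positive):
        predicted_positive = code <= 2  # cut-off taken from the text
        if predicted_positive and positive:
            tp += 1
        elif predicted_positive and not positive:
            fp += 1
        elif not predicted_positive and positive:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # None marks a measure that cannot be computed from the data.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }
```

Each scenario's five ratings would first pass through `modal_category`, and the resulting list of codes would then be compared with the admission flags or with the dichotomised reference standard.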
Of the 246 patients included in triage scenarios, 116 (47%) were women and the mean age was 43.7 years (SD ±26.3). The most frequent main symptom was abdominal pain (25/246; 10%). Thirty-seven hospital admissions were recorded: 34 in non-intensive care wards and three in intensive care units. Among the nurses enrolled in the study, the median number of years in nursing practice was 15 (range: 3–15), with a median of 3 years' experience in the ED (range: 1–6) and a median of 3 years' experience in ED triage (range: 1–6).
The UCs assigned to each scenario are shown in figure 1. Complete disagreement (nurses of the same group assigning triage codes that differed by more than two priority levels to the same scenario) occurred in 3% of scenarios evaluated with I-4L, and complete agreement (all five nurses assigning the same triage code) occurred in 52%. Complete agreement was more frequent for UC 1 (80%) and UC 2 (69%) than for UC 3 (50%) and UC 4 (31%) (figure 1). Inter-rater reliability among nurses using I-4L was K=0.73 (95% CI 0.67 to 0.79), and intrarater reliability was K=0.82 (95% CI 0.67 to 0.96). Inter-rater reliability between the nurses using I-4L and the triage score of their reference standard was K=0.76 (95% CI 0.63 to 0.89).
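The two agreement definitions above can be applied per scenario as in the following sketch. It is a hypothetical illustration: the function names and the label "partial agreement" for the remaining scenarios are assumptions introduced here, not terms from the study.

```python
def classify_agreement(ratings):
    """Classify one scenario from the codes the nurses assigned to it."""
    if len(set(ratings)) == 1:
        return "complete agreement"  # every nurse chose the same code
    if max(ratings) - min(ratings) > 2:
        return "complete disagreement"  # codes more than two levels apart
    return "partial agreement"  # hypothetical label for everything else


def agreement_shares(scenarios):
    """Share of scenarios in each class; scenarios is a list of rating lists."""
    shares = {}
    for ratings in scenarios:
        label = classify_agreement(ratings)
        shares[label] = shares.get(label, 0) + 1
    return {label: count / len(scenarios) for label, count in shares.items()}
```

On a four-level scale, codes can differ by more than two levels only when one nurse assigns UC 1 and another assigns UC 4 to the same scenario.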
Inter-rater reliability among the panel of triage experts was good: K=0.79 (95% CI 0.66 to 0.92).
Sensitivity, specificity and accuracy in predicting the reference standard's code were good (table 1). There were no in-hospital deaths among the patients used in the triage scenarios. The rate of hospital admission (evaluated using the triage codes) with respect to each level was: 100% for UC 1, 42% for UC 2, 6% for UC 3 and 2% for UC 4.
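The admission rate per triage level reported above is a grouped proportion, which can be sketched as below. The function name and the input layout (one triage code and one admission flag per scenario) are assumptions for illustration; the study's own data are not reproduced here.

```python
from collections import defaultdict


def admission_rate_by_level(codes, admitted):
    """Proportion of admitted patients within each urgency category.

    codes: triage code per scenario (for example the modal nurse code).
    admitted: matching booleans, True if the patient was admitted.
    """
    totals = defaultdict(int)
    admissions = defaultdict(int)
    for code, was_admitted in zip(codes, admitted):
        totals[code] += 1
        if was_admitted:
            admissions[code] += 1
    # Only levels that actually occur in the data are reported.
    return {code: admissions[code] / totals[code] for code in sorted(totals)}
```

A rate that falls steadily from UC 1 to UC 4, as in the study's results, is the pattern expected of a scale with predictive validity for admission.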
Our triage system, in this study, seems to have a good inter-rater and intrarater reliability for rating triage acuity and for accuracy in predicting patient admission and a reference standard's triage code.
Many studies have evaluated the reliability and validity of acuity ratings by triage nurses,6–8 13–20 probably because a triage scale should meet at least these two criteria to perform accurately as intended.11 21 The inter-rater and intrarater reliability of three-level triage systems has been found to be poor.12 16 22 However, to our knowledge, there is a lack of data on the reliability and validity of four-level triage systems.
In our study, the high inter-rater reliability score for I-4L (K=0.73) was similar to that reported for five-level triage systems, namely K=0.8 for CTAS19 and K=0.76 for ESI.7 The I-4L triage system also has a good inter-rater reliability with its reference standard.
To our knowledge, ours is the first study that measures the intrarater reliability of a four-level triage system. The lack of previous data may reflect the difficulty of testing this feature, which requires the same nurse to rate the same patient more than once over time. This is why we presented the same paper scenarios on two occasions. Our data also support the validity of our triage score. In fact, the rate of hospital admissions increased in relation to higher acuity ratings (figure 2), and our tool had a high accuracy in predicting hospital admission: positive predictive value 96%.
Moreover, the group of nurses who used the I-4L triage system proved accurate in predicting the reference standard's triage code (93%; 95% CI 89% to 96%). Few studies have used a reference standard to test the validity of a triage system.
It is difficult to compare our results on validity with previous studies because of the differences in the setting and in the type of triage system (five levels vs four levels). Nevertheless, our results on the validity of the I-4L triage system are similar to previous studies on ESI v4.7 8 16
Our triage tool has one limitation: it could be difficult to learn, consult and teach. In fact it is complex: it is based on 23 flowcharts (contained in a 70-page manual) depending on the patient's complaint. Our triage course requires 2 days to teach.
The main limitation of our study is that it was conducted with paper scenarios and not with patients. However, this procedure has been used and validated in other studies on inter-rater reliability of triage tools.13 15 19 20 Another limitation of our study is that we cannot exclude the possibility that the performance of the I-4L method was overestimated because of the nurses' previous experience. Lastly, we evaluated the validity of the triage system based on the accuracy in predicting hospital admission, and hospitalisation rates may vary due to factors other than patients' acuity. The hospital admission rate is not the best outcome to test the predictive validity of triage tools because it is a surrogate outcome and there are many confounding variables that could affect it.11 However, it is very difficult to establish validity criteria for triage acuity classification in the absence of a clear reference standard. For this reason we tried to develop a surrogate ‘gold standard’ based on the consensus of a panel of triage experts and we tested the predictive validity of our triage system against this gold standard.
To our knowledge, this is the first study that measures the reliability and predictive validity of an Italian triage system in the ED of an urban hospital. It is also one of the few studies that test the intrarater reliability for rating triage acuity in a four-level triage system. Our data suggest that the four-level triage model (I-4L) has good inter-rater and intrarater reliability for rating triage acuity. It is also accurate in predicting patient admission and a reference standard's triage code.
We thank all the nurses of the Imola Triage Working Group: R Manfredi, U Martini, S Grillini, C Liverani, A Zardi, G Zaza, G Minguzzi, I Dall'Osso, A D'Arrigo, L Monduzzi and R Lauriola.
Competing interests None.
Patient consent Obtained.
Provenance and peer review Not commissioned; externally peer reviewed.