Emergency department crowding: prioritising quantified crowding measures using a Delphi study
Kathleen Beniuk (1), Adrian A Boyle (2), P John Clarkson (1)

  (1) Engineering Design Centre, Department of Engineering, University of Cambridge, Cambridge, UK
  (2) Emergency Department, Addenbrooke's Hospital, Cambridge University Hospitals NHS Foundation Trust, Cambridge, Cambridgeshire, UK

Correspondence to Ms K Beniuk, Engineering Design Centre, Department of Engineering, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, UK; kb424{at}


Aims Emergency department (ED) crowding has been associated with a number of negative health outcomes, including unnecessary deaths, increased waiting times and a decrease in care quality. Despite the seriousness of this issue, there is little agreement on appropriate crowding measures to assess crowding effects on ED operations. The objective of this study was to prioritise a list of quantified crowding measures that would assess the current state of a department.

Methods A three round Delphi study was conducted via email and an Internet based survey tool. The panel consisted of 40 professionals who had exposure to and expertise in crowding. Participants submitted quantified crowding measures which, through three rounds, were evaluated and ranked to assess participant agreement for inclusion.

Results The panel identified 27 measures of which eight (29.6%) reached consensus at the end of the study. These measures comprised: (1) ability of ambulances to offload; (2) patients who leave without being seen or treated; (3) time until triage; (4) ED occupancy rate; (5) patients' total length of stay in the ED; (6) time to see a physician; (7) ED boarding time; and (8) number of patients boarding in the ED.

Conclusions This study resulted in the identification of eight quantified crowding measures, which present a comprehensive view of how crowding is affecting ED operations, and highlighted areas of concern. These quantified measures have the potential to make a considerable contribution to decision making by ED management and to provide a basis for learning across different departments.



Emergency department crowding (EDC) is a serious problem that affects hospitals all over the world. Crowding has been associated with a number of negative health outcomes, including unnecessary deaths, ambulance diversions, increased waiting times, a decrease in care quality and the delayed provision of crucial medical and nursing care.1–3 In the past decade, crowding has become an increasing priority, with many researchers striving to answer the following questions: what is crowding and how can it be measured?

Many definitions of crowding have been proposed in the literature but there is a general lack of agreement as to what constitutes crowding.3–7 Efforts to measure crowding in the literature follow two distinctly different approaches. The first approach has been the development of crowding measurement scales. The most notable are the Emergency Department Work Index (EDWIN), the National Emergency Department Overcrowding Score (NEDOCS) and the Emergency Department Crowding Score.3 ,6 ,7 Crowding measurement scales are designed to calculate whether an ED is crowded but they do not measure the effects of crowding on the operation of the department,3 ,6 ,7 and there are concerns over their transferability across diverse settings.8 The second approach has been the identification of crowding measures. Crowding measures are process measures which provide real time observation of the operation of the department. Many crowding measures have been identified in the literature but few are clearly defined, and even fewer are quantifiable.4 ,6 ,8–10

To progress the literature, there is a need to identify a set of crowding measures that will examine the state of a department, evaluate how the department is coping with the current demands and highlight areas of concern. Hwang et al make this point in a recent paper: “there is growing consensus of the need for quantitative, objective crowding measures that can be used across multiple sites and that are feasible and reproducible”.8 Such measures would not only make considerable contributions to decision making regarding ED operations; they would also provide a basis for learning across different departments and could give rise to ED specific targeted solutions.

The objective of this study was to prioritise a list of quantified crowding measures, with clearly defined terms and international applicability, that would assess the current state of a department, evaluate how the department is coping with the current demands and highlight areas of concern. This paper outlines the development of these key measures by an international group of experts using a formal consensus technique.


Study design

The goal of the study was to develop a list of crowding measures. An international group of experts was invited to participate in a formal consensus technique, a Delphi study. The Delphi method was selected for three reasons. First, the method is used widely in health research because of its ability to achieve ‘consensus in a given area of uncertainty or lack of empirical evidence’.11 Second, the anonymity of the participants allows the Delphi method to overcome disadvantages normally found in group decision making, which is ‘prone to domination by powerful individuals, the biasing effects of personality traits, seniority and the fact that only one person can speak at a time’.11 Third, since the Delphi study could be completed online, it enabled us to include participants from anywhere in the world provided they had Internet access.

Study setting

Participants were selected in recognition of their significant academic expertise in emergency medicine processes. We invited participants from diverse backgrounds, geographical locations, healthcare systems and occupations to ensure as wide a coverage as possible. Participants were chosen in one of two ways. The first group invited to participate were known experts in the field. The second group was selected because of their active involvement in publishing pertinent articles. We identified potential participants through a search of the PubMed online database using the MeSH search terms ‘emergency department’, ‘crowding’ and ‘overcrowding’ within the publication date range from 1 January 2000 to November 2009.12 This search yielded 398 journal articles and 49 review articles. We invited the principal authors of relevant review articles and those who were the principal author of multiple relevant journal articles. In total, 55 experts were invited to participate, of whom 40 agreed (16 known experts, eight authors of review articles and 16 authors of multiple publications).

This study was completed online using the web based survey tool Survey Monkey,13 which was used to collect the survey results. Links to the surveys were distributed via email.

Study protocol

We used a standard three round Delphi methodology.11 ,14–21 The aim of the first round was to identify broad measures to evaluate further in the succeeding rounds. Respondents submitted defining characteristics, defined as, ‘essential features by which a crowded ED can be recognised’,16 ,19 that they felt should be included. For each defining characteristic, respondents were asked to submit an operational definition or a ‘set of directives, activities or procedures that specify how to measure, observe or record the defining characteristic’.16 ,19 At this time, respondents provided demographic information and rated their level of exposure and/or understanding of EDC. In each round there was space provided for participant comments.

In the second round, participants were asked to evaluate each defining characteristic and its operational definitions for clarity, feasibility and appropriateness for inclusion.16 In cases where a defining characteristic had more than one operational definition, participants ranked the definitions in order of preference.

The third round incorporated feedback from the previous rounds. A summary of results from the second round, including the percentage of participant agreement, was included so participants were aware of the disposition of the group.14 ,21 Participants again evaluated each defining characteristic and operational definition and ranked the operational definitions in order of preference.

Data analysis

Data collection and processing took place at completion of each round. At the end of round one the responses were collated, duplicate entries were eliminated and respondents' comments and feedback were evaluated and, when appropriate, incorporated. We considered feedback appropriate when it was relevant, operable and original. Any reference to a specific country's national standard was eliminated and replaced by a generic reference to national standards. Similar defining characteristics were grouped together. The defining characteristics were categorised as ‘input’ measures, ‘throughput’ measures, ‘output’ measures and ‘other’ measures. This categorisation was introduced by Asplin et al in 200322 and has since been used extensively in the literature. Input measures are those that affect the flow of patients into the department, such as the volume and type of care required. Throughput measures are those which affect the flow of patients and their care processes once they are in the department. Output measures are those which affect the flow of patients out of the ED, either discharged into the community or transferred to another care site. Any data that did not fit into one of the three categories above were grouped together and labelled ‘other’ measures. The categorised data were supplied to participants for evaluation in round two.

For inclusion in the following round, statements were evaluated against a predetermined level of participant agreement. The acceptance level was set at 50% participant agreement for inclusion after the second round and 70% after the third round. The 70% inclusion rate was consistent with what had been found in the literature16 and the inclusion rate of 50% after the second round was set to enable greater advancement of statements into the third round. Each defining characteristic and operational definition was evaluated against the inclusion rate. All defining characteristics that exceeded the inclusion rate were included. The operational definitions obtaining the highest respondent agreement and, in the case of a tie, the highest ranking were included. If all operational definitions for a given defining characteristic were eliminated, the defining characteristic was also deleted. In the third round, only the operational definition with the highest respondent agreement was included.
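
The inclusion rule described above can be sketched in a few lines of Python. This is one possible reading of the text, not the authors' actual procedure: the names `Definition`, `mean_rank` and `evaluate_characteristic` are hypothetical, agreement equal to the threshold is assumed to count as passing, and a lower mean rank is assumed to mean a higher preference.

```python
from dataclasses import dataclass

# Thresholds stated in the text: 50% after round two, 70% after round three.
INCLUSION_RATE = {2: 0.50, 3: 0.70}


@dataclass
class Definition:
    text: str
    agreement: float  # fraction of respondents agreeing, 0.0 to 1.0
    mean_rank: float  # assumed tie-breaker: lower means more preferred


def evaluate_characteristic(char_agreement, definitions, round_number):
    """Return the surviving operational definitions, or None if the
    defining characteristic is dropped in this round."""
    threshold = INCLUSION_RATE[round_number]
    # Assumption: meeting the threshold exactly is enough for inclusion.
    if char_agreement < threshold:
        return None
    passed = [d for d in definitions if d.agreement >= threshold]
    if not passed:
        # All definitions eliminated, so the characteristic is deleted too.
        return None
    # Highest agreement first; ties broken by the better (lower) mean rank.
    passed.sort(key=lambda d: (-d.agreement, d.mean_rank))
    # Round three keeps only the single best definition.
    return passed if round_number == 2 else passed[:1]


if __name__ == "__main__":
    candidates = [
        Definition("a", 0.60, 1.5),
        Definition("b", 0.60, 1.2),
        Definition("c", 0.40, 1.0),
    ]
    kept = evaluate_characteristic(0.80, candidates, round_number=2)
    print([d.text for d in kept])
```

With these illustrative figures, definitions `b` and `a` survive round two (with `b` first on the tie-break) and `c` is eliminated; the same candidates would all be eliminated under the 70% rule of round three.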


Characteristics of study subjects

The 40 participants came from six different countries: USA (42.5%), Canada (25%), UK (20%), Australia (7.5%), The Netherlands (2.5%) and Hong Kong (2.5%). The participants were 23% female and they largely identified themselves as academics (83%), clinicians (71%) and researchers (63%). Most respondents had medical degrees (89%). When asked to rank on a 10 point Likert scale, from 1 (low) to 10 (high), their level of exposure to and/or understanding of EDC, the median response was 8.94, indicating a high degree of expertise.

Main results

There was an 87.5% response rate (35 responses) in round one. Respondents identified 27 unique defining characteristics (seven input, nine throughput, three output and eight other measures) and 101 operational definitions (31 input, 36 throughput, 18 output and 16 other measures). Participant consensus throughout the study is illustrated in figure 1. Round two had a 50% response rate (20 complete responses). Using the inclusion criteria, respondents narrowed the list to 16 defining characteristics (five input, five throughput, two output and four other measures) and 35 operational definitions (11 input, 11 throughput, six output and seven other measures). Only one output measure remained after round two and this single defining characteristic had 12 accepted operational definitions. Participant feedback suggested that this measure should be divided into two distinct defining characteristics. For this reason, two output measures were included in round three. There was a 70% response rate in round three (28 responses). Respondents identified three input measures, three throughput measures and two output measures for inclusion in the prioritised list of crowding measures. The measures and their participant consensus rates were: (1) ability of ambulances to offload (70.4%); (2) patients who leave without being seen or treated (77.8%); (3) time until triage (74.1%); (4) ED occupancy rate (100%); (5) patients' total length of stay in the ED (88.9%); (6) time to see a physician (85.2%); (7) ED boarding time (88.9%); and (8) number of patients boarding in the ED (88.9%).
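
As a quick arithmetic check on the figures reported above, the response rates and category totals can be recomputed from the counts given in the text. The snippet is only an illustrative sketch; the variable and function names are hypothetical.

```python
PANEL_SIZE = 40  # experts who agreed to participate


def response_rate(responses, panel_size=PANEL_SIZE):
    """Fraction of the panel that responded in a round."""
    return responses / panel_size


# Counts as reported in the text.
responses_by_round = {1: 35, 2: 20, 3: 28}
round_one_characteristics = {"input": 7, "throughput": 9, "output": 3, "other": 8}
round_one_definitions = {"input": 31, "throughput": 36, "output": 18, "other": 16}
round_two_characteristics = {"input": 5, "throughput": 5, "output": 2, "other": 4}
round_two_definitions = {"input": 11, "throughput": 11, "output": 6, "other": 7}

if __name__ == "__main__":
    for rnd, n in responses_by_round.items():
        print(f"round {rnd}: {n}/{PANEL_SIZE} = {response_rate(n):.1%}")
    print(sum(round_one_characteristics.values()), sum(round_one_definitions.values()))
    print(sum(round_two_characteristics.values()), sum(round_two_definitions.values()))
```

The category counts sum to the reported totals of 27 and 101 after round one and 16 and 35 after round two, and the three response rates come out at 87.5%, 50.0% and 70.0%.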

On completion of round three, it was decided that a consensus had been reached and further rounds were not required. This prioritised list of quantified crowding measures, seen in box 1, includes eight key process measures that are easily identifiable in EDs.

Box 1

Measures of emergency department crowding

Input measures

  • 1. Ability of ambulances to offload

    • Ambulance offload time is the time between ambulance arrival and offload. An emergency department (ED) is crowded when ambulance offload time is greater than 15 min in more than 10% of cases.

  • 2. Patients who leave without being seen or treated (LWBS)

    • An ED is crowded when the number of patients who LWBS is ≥5%.

  • 3. Time until triage

    • An ED is crowded when there is a delay >5 min from a patient's ED presentation to begin their initial triage.

Throughput measures

  • 4. ED occupancy rate

    • An occupancy rate is the total volume of patients in the ED compared with the total number of officially designated ED treatment spaces. An ED is crowded when the occupancy rate is >100%.

  • 5. Patients' total length of stay in the ED

    • An ED is crowded when more than 10% of patients have a total length of stay >4 h.

  • 6. Time to see a physician

    • An ED is crowded when a patient waits longer than 30 min to be seen by a physician.

Output measures

  • 7. ED boarding time

    • An ED is crowded when >10% of patients remain in the ED 2 h after the admission decision.

  • 8. Number of patients boarding in the ED

    • Boarders are admitted patients waiting to be placed in an inpatient bed. An ED is crowded when boarders occupy >10% of the total occupancy.
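
The eight thresholds in box 1 lend themselves to a simple rule check. The following Python sketch shows one way a department might evaluate them against its own data. It is not part of the study: all field names are hypothetical, the box does not say how per-patient triage and physician waits should be summarised (here the caller supplies one representative value for each), and the denominator for measure 7 is assumed to be admitted patients.

```python
from dataclasses import dataclass


@dataclass
class EDSnapshot:
    # Hypothetical data model; box 1 specifies thresholds, not field names.
    frac_offloads_over_15_min: float  # share of ambulance offloads taking >15 min
    frac_lwbs: float                  # share of patients who leave without being seen
    triage_delay_min: float           # representative time from presentation to triage
    patients_in_ed: int               # total patients currently in the ED
    treatment_spaces: int             # officially designated spaces (assumed > 0)
    frac_los_over_4_h: float          # share of patients with total stay >4 h
    physician_wait_min: float         # representative wait to see a physician
    frac_boarding_over_2_h: float     # share still in ED 2 h after admission decision
    boarders: int                     # admitted patients awaiting an inpatient bed


def crowding_flags(s: EDSnapshot) -> dict:
    """Map each box 1 measure to True when its crowding threshold is crossed."""
    occupancy = s.patients_in_ed / s.treatment_spaces
    boarder_share = s.boarders / s.patients_in_ed if s.patients_in_ed else 0.0
    return {
        "1 ambulance offload": s.frac_offloads_over_15_min > 0.10,
        "2 LWBS": s.frac_lwbs >= 0.05,
        "3 time until triage": s.triage_delay_min > 5,
        "4 occupancy rate": occupancy > 1.0,
        "5 total length of stay": s.frac_los_over_4_h > 0.10,
        "6 time to see a physician": s.physician_wait_min > 30,
        "7 boarding time": s.frac_boarding_over_2_h > 0.10,
        "8 boarders in ED": boarder_share > 0.10,
    }


if __name__ == "__main__":
    # Invented example figures, for illustration only.
    snapshot = EDSnapshot(0.12, 0.03, 4, 42, 40, 0.08, 35, 0.05, 6)
    for measure, crowded in crowding_flags(snapshot).items():
        print(f"{measure}: {'crowded' if crowded else 'ok'}")
```

Returning one flag per measure, rather than a single combined score, keeps the result in line with the intent of the list: it shows which part of the patient flow (input, throughput or output) is under strain.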


Using a formal consensus method, this group of 40 experts developed an internationally applicable prioritised list of quantified crowding measures that consists of eight quantifiable and easily measured process measures. In order to understand the contribution of this research, it is pertinent to understand the current state of crowding research and the limitations of both the crowding measurement scales and general crowding measures identified in the literature.

Crowding measurement scales are designed to calculate whether an ED is crowded.3 ,6 ,7 This information may be useful in auditing ED operations, but the scales cannot discern how crowding is affecting the operation of the department or differentiate between units within the ED (eg, minors), and there are concerns over the transferability of these scales across diverse settings.8

Many authors have attempted to generate general crowding measures.6 ,8–10 In a systematic comprehensive review of the literature, Hwang et al identified 71 unique crowding measures.8 These measures have provided a basis for this study, and all eight of our prioritised measures had been previously identified in the literature.8 However, unlike the results of this study, none of the previously identified measures are quantifiable in a manner that would enable comparison between different departments.

It is in the intersection between the crowding scales and the measures that this study contributes. “While there remains no objective criterion standard measure of crowding in the ED, a combination of time intervals and patient counts appears to be emerging as the most promising tools for measures of flow and non-flow (ie, crowding), respectively”.8 This study prioritised eight process measures with assigned quantifiable metrics to give a more detailed picture of how the department is coping with the current demands and to highlight areas of concern. Unlike the crowding scales, these measures can be applied to the various care units within the ED to show how crowding is affecting the units differently. It is well documented that a single high acuity patient may use a majority of ED resources, causing localised crowding; however, a parallel care stream within the department, such as minors, may be operating below capacity and not suffering in the same way. Consequently, this prioritised list of eight quantified crowding measures gives a more comprehensive view of ED operations. While the eight measures are calculated independently, it should be noted that they are intrinsically linked. ED operation is inevitably influenced by external factors as it operates as part of a complex adaptive system23 where ED crowding can be an initial indicator of more systematic hospital crowding.24


The Delphi technique itself poses some limitations. Anonymity, key to the study design, ‘may lead to lack of accountability of views expressed and encourage hasty decisions’,11 although the impact of this would be difficult to measure or eliminate. Modifying the technique to include web based surveys may have posed challenges because people interact with computers in different ways; however, each of the experts invited to participate had a contact email address, suggesting they were relatively adept at using online tools. The method of participant selection may have favoured recent academic activity, high profile work and authors from predominantly Western countries while excluding experts not active in publication. Publication bias may have affected the literature base that was evaluated in this study, thus affecting the selection of study participants. Participant bias may have existed because of the participant demographics. Geographical variation in care provision created difficulties in obtaining consensus. The participants represented healthcare systems in developed countries, which may affect the transferability of the results into systems in developing nations.

The online survey tool governed the question formatting, influencing the design of the surveys. The layout of the online surveys and the time required to complete them may have discouraged completion. Round two was distributed throughout December and January when many people are on holiday, which may have accounted for the decrease in response rate. When interpreting participants' comments and survey responses, assumptions were made as to the intended meaning. Despite careful attention, there is the potential for this interpretation to be incorrect. Participants were not asked to confirm the wording of the final measures, which may have resulted in some disagreement over the exact phrasing. This could have been avoided by including a fourth round.

The inclusion criteria could have influenced the direction of the study. Although most Delphi studies use a consistent consensus rate across rounds, we felt that doing so might eliminate important measures prematurely. After round three, only one measure (measure No 1) was close enough to the inclusion rate that a single additional negative vote would have dropped it below the limit and omitted it from the final list.
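
The sensitivity of measure No 1 to a single vote can be illustrated with a short calculation. The number of counted votes is not stated in the text, which reports 28 responses in round three; the value of 27 used below is an inference, made because every reported consensus rate (70.4%, 74.1%, 77.8%, 85.2%, 88.9%, 100%) matches a whole number of votes out of 27. Treat it as an assumption.

```python
VOTERS = 27       # inferred from the reported percentages, not stated in the text
THRESHOLD = 0.70  # round three inclusion rate


def agreement(yes_votes, voters=VOTERS):
    """Fraction of voters agreeing that a measure should be included."""
    return yes_votes / voters


if __name__ == "__main__":
    for yes in (19, 18):
        rate = agreement(yes)
        print(f"{yes}/{VOTERS} = {rate:.1%}, included: {rate >= THRESHOLD}")
```

Under this assumption, 19 of 27 votes gives the reported 70.4% and passes, whereas 18 of 27 gives 66.7% and would have fallen below the 70% limit.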

One of the greatest challenges for participants was differentiating between the types of care centres. Participants found it difficult to compare different types of hospitals (eg, teaching hospital vs rural community hospital) and those with different processes (eg, the timing or presence of preregistration triage, registration, triage, bedside registration). Developing measures at multiple levels, for EDs in different hospitals in different countries with different healthcare policy strategies, proved challenging and made it difficult to impose a standardised time of service delivery. Further work could confirm whether these measures are appropriate at various types of hospitals, or whether their specific functions lead to the identification of different measures. Regional variations were accommodated by providing clear definitions of terminology and by using percentiles as opposed to absolute numbers to allow for differences in ED volume and demand.

Overall, there were concerns as to whether the Delphi technique was appropriate given the complexity of the issue. We argue that despite this study's limitations, it quite successfully generated a consensus among participants giving us confidence that an internationally applicable consensus on key crowding measures could be identified. In summary, we feel that the measures we have identified have high face validity.


The objective of this study was to prioritise a list of quantified crowding measures, with clearly defined terms and international applicability, that would assess the current state of a department, evaluate how the department is coping with the current demands and highlight areas of concern. This was accomplished by enlisting an international group of experts to participate in a Delphi study. The study participants, after three successive rounds, were able to identify eight quantifiable measures of ED crowding; all are key process measures that are easily identifiable in EDs. To support international transferability, we addressed regional variation in terminology by providing clear definitions and used percentiles as opposed to absolute numbers to accommodate the differences in ED volume and demand.

The principal contribution of this study, which progresses the current crowding literature, was the generation of quantifiable measures. These measures provide a more comprehensive view of ED operations and highlight areas of concern. This knowledge could make a considerable contribution to decision making regarding ED management and provide a basis for learning across different departments. Future work needs to validate these results. Acceptance of these measures will progress this field of research to the point where we can focus on developing targeted methods of management and minimisation of crowding.




  • Funding UK Engineering and Physical Sciences Research Council (EPSRC), grant No EP/E001777/1. The role of the EPSRC was to provide partial funding for KB's PhD training. The researchers were independent of the EPSRC. At no time did the EPSRC attempt to influence the direction of the research undertaken.

  • Competing interests None.

  • Provenance and peer review Not commissioned; externally peer reviewed.
