Analysis of Emergency Room Waiting Time in SAS
Brent Wenerstrom, University of Louisville, Louisville, KY

ABSTRACT
Background: Life and death may be on the line for patients visiting an emergency department (ED). The time it takes a patient to see a doctor can be critical. This is made more difficult by the fact that waiting times have been increasing over the past ten years [2]. Objective: We would like to determine what factors impact current waiting times in emergency departments through the use of SAS Enterprise Guide. Methods: We use survey data from the 2006 Ambulatory Health Care Data survey [1] conducted by the National Center for Health Statistics (NCHS), containing about 36,000 data records. We used linear regression to model our data with the help of PROC GLM. Results: We found that self-reported pain levels did not correlate with waiting time, but that ED prioritization, time of day of visit, arrival by ambulance and previous waiting times from the same emergency department correlated with waiting time.

INTRODUCTION
A study by Nawar et al. [2] shows that emergency department visits increased from an estimated 96.5 million in 1995 to 115.3 million in 2005. This represents an increase of roughly 19 percent in visits. In addition, the number of EDs decreased during that same period from 4,176 to 3,795. Certainly this is going to have an adverse effect on the waiting time of a patient in the ED. Unfortunately, waiting times cannot be compared across this period, because waiting times in the survey were capped through 2004 and are no longer capped.

In this study, we seek to identify those factors that most impact modern-day waiting times. Specifically, we attempt to predict a patient's waiting time based on factors known at the time of their arrival.

METHODS
In this study, we use data collected from 360 EDs in the National Hospital Ambulatory Medical Care Survey (NHAMCS) [1] for the year 2006.
There are 35,849 patients recorded in this sample. Of those, only 28,391 had known waiting times. We used SAS Enterprise Guide to process and model the data.

The original distribution of waiting times can be seen in Figure 1. From this figure, we can see that there are a large number of small waiting times and a few extreme waiting times. The longest waiting time was 1430 minutes, or 23 hours and 50 minutes, while there were 770 cases with a waiting time of zero minutes. Rather than model the actual waiting times, we chose instead to work with the log of the waiting time. A histogram of the log of the waiting time can be seen in Figure 2, where one was added to the waiting time to avoid taking the log of zero. The shape of the data is now closer to normal and will allow more reliable linear regression analysis.

We originally computed a linear regression model on a random sample of 2,000 records. This made it possible to graph the residuals and guarded against parameters appearing significant merely because of a large sample size.

For each patient record in the survey, there was an associated hospital ID (HOSPCODE). We would like to obtain a measure of past waiting times for each hospital. We did not have enough information to order records by date, and instead sampled randomly from each hospital as an approximation to using a historical mean. We did this by sampling from each hospital using PROC SURVEYSELECT with STRATA set to HOSPCODE. We then used PROC SQL to attach the average log waiting time from this sample to each record while at the same time removing the sampled records from our data. The PROC SQL code can be seen in Figure 3. The average log waiting time was computed by hospital ID. We then computed the regression model on the 67% of the records for each hospital not used to obtain the average log waiting time. The regression model itself is shown in the following section.
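The log transform and the per-hospital sampling described above might look roughly like the following. This is a minimal sketch, not the author's code: the data set and variable names SASUSER.QED2006, SASUSER.HOSP_TRAIN_SAMPLE, WAITTIME, LOG_WAITTIME and HOSPCODE are taken from Figure 3, while WORK.QED_LOG, the 33% sampling rate (inferred from the 67% retained for modeling), the seed, and the assumption that unknown waiting times are stored as missing values are illustrative choices.

```sas
/* Sketch only: WORK.QED_LOG, SAMPRATE and SEED are assumed, not from the paper. */
DATA WORK.QED_LOG;
    SET SASUSER.QED2006;
    /* Keep records with a known waiting time (assumes unknowns are missing) */
    IF NOT MISSING(WAITTIME);
    /* Add one before taking the log so zero-minute waits are defined */
    LOG_WAITTIME = LOG(WAITTIME + 1);
RUN;

/* PROC SURVEYSELECT expects the input sorted by the STRATA variable */
PROC SORT DATA=WORK.QED_LOG;
    BY HOSPCODE;
RUN;

/* Simple random sample of about one third of the records within each hospital */
PROC SURVEYSELECT DATA=WORK.QED_LOG
        OUT=SASUSER.HOSP_TRAIN_SAMPLE
        METHOD=SRS
        SAMPRATE=0.33
        SEED=2006;
    STRATA HOSPCODE;
RUN;
```

Fixing SEED makes the sample reproducible, which matters here because the sampled records are later excluded from the modeling data.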

Figure 1: Histogram of known waiting times.
Figure 2: Histogram of the log of known waiting times.

Figure 3: PROC SQL obtaining an average waiting time from a random sample and removing that sample from the data.

    PROC SQL;
    CREATE TABLE SASUSER.TRAIN_WITH_AVG AS
    SELECT LOG_WAITTIME, WAITTIME, IMMED, IMMEDIACY, AMBULANCE, BLACK, TIME, TIME2, HOSP_AVG
    FROM SASUSER.QED2006 Q
    JOIN (SELECT HOSPCODE, AVG(LOG_WAITTIME) AS HOSP_AVG
          FROM SASUSER.HOSP_TRAIN_SAMPLE
          GROUP BY HOSPCODE) AS HOSP_AVG_T
    ON Q.HOSPCODE = HOSP_AVG_T.HOSPCODE
    WHERE NOT EXISTS (SELECT PATCODE
          FROM SASUSER.HOSP_TRAIN_SAMPLE AS TRAIN
          WHERE Q.PATCODE = TRAIN.PATCODE AND Q.HOSPCODE = TRAIN.HOSPCODE);
    QUIT;

RESULTS
Our regression model, obtained through PROC GLM, can be seen in Table 1. All parameters are significant. Additionally, the model achieved an R² value of 0.307. This means that roughly 69% of the variability is still not accounted for, but a linear relationship in the data exists.

Table 1: Regression model for log waiting time.

    Parameter                   t Value    Pr > |t|
    Intercept                    -10.70     <.0001
    AMBULANCE                    -21.47     <.0001
    HOSP_AVG                      71.91     <.0001
    TIME2                          9.34     <.0001
    IMMEDIACY 1 - 14 min          14.86     <.0001
    IMMEDIACY 15 - 60 min         27.48     <.0001
    IMMEDIACY 1 hr - 2 hrs        30.23     <.0001
    IMMEDIACY 2 hrs - 24 hrs      27.78     <.0001
    IMMEDIACY Immediate             .         .
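A PROC GLM call that would fit a model with the parameters listed in Table 1 could look like the sketch below. The paper does not show its PROC GLM code, so the statement layout is an assumption; the data set and variable names come from Figure 3. Treating IMMEDIACY as a CLASS variable is inferred from the per-category rows in the table.

```sas
/* Sketch only: the exact options used in the paper are not shown. */
PROC GLM DATA=SASUSER.TRAIN_WITH_AVG;
    /* IMMEDIACY is categorical, so it gets one parameter per triage level */
    CLASS IMMEDIACY;
    /* SOLUTION prints the parameter estimates, t values and p-values */
    MODEL LOG_WAITTIME = AMBULANCE HOSP_AVG TIME2 IMMEDIACY / SOLUTION;
RUN;
QUIT;
```

With a CLASS variable, PROC GLM uses the last level in sort order as the reference level, which is consistent with "Immediate" appearing in the tables with no t value.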

Figure 4: Residuals of model on sample of 2,000 data points.

The residuals for this model could not be plotted directly in SAS, because there were too many points to keep in memory at once. Instead, we took a random sample of 2,000 points from this data set and fit a duplicate model. The residuals can be seen in Figure 4. For the most part, there is no obvious pattern in the residuals. We do see a couple of lines formed by the residuals on the lower left-hand side of the plot. These come from the large number of patients with a waiting time of 0. The regression model for the sampled 2,000 points can be seen in Table 2. Again, all p-values are significant, and the coefficients for the model based on 28,000 data points and the model based on 2,000 data points are very similar.

Table 2: Regression model using a sample of 2,000 data points.

    Parameter                   Estimate          Standard Error   t Value   Pr > |t|
    Intercept                   -0.822908953 B    (missing)          -5.13    <.0001
    AMBULANCE                   (missing)         (missing)          -6.88    <.0001
    HOSP_AVG                     0.937947565      0.03509879         26.72    <.0001
    TIME2                        0.016791701      0.00373242          4.50    <.0001
    IMMEDIACY 1 - 14 min         0.523127438 B    0.11768707          4.45    <.0001
    IMMEDIACY 15 - 60 min        0.913631641 B    0.10631610          8.59    <.0001
    IMMEDIACY 1 hr - 2 hrs       1.021281228 B    0.11508944          8.87    <.0001
    IMMEDIACY 2 hrs - 24 hrs     1.019208467 B    0.12212009          8.35    <.0001
    IMMEDIACY Immediate          0.000000000 B    .                    .        .
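The sample-then-plot procedure described for Figure 4 might be sketched as follows. None of this code is from the paper: WORK.TRAIN_2000, WORK.RESID_2000, PRED, RESID and the seed are hypothetical names and values, and PROC SGPLOT is used here as one possible way to draw the plot (the paper worked in Enterprise Guide and does not say which task produced the figure).

```sas
/* Sketch only: output names and SEED are assumed. */
PROC SURVEYSELECT DATA=SASUSER.TRAIN_WITH_AVG
        OUT=WORK.TRAIN_2000
        METHOD=SRS
        SAMPSIZE=2000
        SEED=2006;
RUN;

PROC GLM DATA=WORK.TRAIN_2000;
    CLASS IMMEDIACY;
    MODEL LOG_WAITTIME = AMBULANCE HOSP_AVG TIME2 IMMEDIACY / SOLUTION;
    /* Save fitted values and residuals for plotting */
    OUTPUT OUT=WORK.RESID_2000 PREDICTED=PRED RESIDUAL=RESID;
RUN;
QUIT;

/* Residuals against fitted values, with a reference line at zero */
PROC SGPLOT DATA=WORK.RESID_2000;
    SCATTER X=PRED Y=RESID;
    REFLINE 0 / AXIS=Y;
RUN;
```

In a plot of this kind, every record with a waiting time of zero has a log waiting time of zero, so its residual equals the negative of its predicted value. Those points fall on a straight line with slope -1, which would account for a line structure like the one noted in Figure 4.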

Figure 5: Kernel density estimators for log waiting time by pain levels.

One surprising finding in this model is that pain did not correlate with waiting time. This can be seen from the kernel density estimates of the log waiting time, separated by self-reported pain level, in Figure 5. From this figure, we can see that there are no major differences in the distribution of waiting times when comparing different pain levels. The only noticeable differences are that the blue line, representing a blank pain level, does not peak at the same point as the rest of the distributions, and that the pink line, representing an unknown pain level, is highest at a log waiting time of zero. The difference in the blank pain level can be attributed to the fact that there were very few patients with a blank pain level, only 2.5% or 716 patients, which is far fewer than in any other category. This would lead to a less smooth KDE. One reason that an unknown pain level has a higher density at zero may be that patients who are unconscious when entering the emergency department are recorded with an unknown pain level. The overall distribution for an unknown pain level is otherwise very similar to the rest of the distributions of log waiting times. This means that having an unknown pain level still does not provide much information about log waiting time.

Figure 5 was produced using PROC KDE. This procedure uses kernel density estimators to approximate the probability density function for the given data. The code used to generate Figure 5 is given in Figure 6.

Figure 6: SAS code used to produce Figure 5.

    PROC KDE DATA=TMPSORT_PAIN;
    UNIVAR LOG_WAITTIME / OUT=LOGWAITKDE_PAIN;
    BY PAIN;
    RUN;
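For readers unfamiliar with kernel density estimation, the standard univariate estimator has the form below. This is the textbook definition rather than anything stated in the paper; the symbols n (number of observations in one pain group), x_i (the observed log waiting times), h (the bandwidth) and K (the kernel) are introduced here for illustration. The normal kernel shown is the one the SAS documentation describes for PROC KDE.

```latex
\[
\hat{f}_h(x) \;=\; \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right),
\qquad
K(u) \;=\; \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2}
\]
```

Each observation contributes a small bump centered at its own value, and the estimate is the average of those bumps. With few observations, as in the blank pain level group, individual bumps remain visible, which is one way to understand the less smooth curve for that group.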

Figure 7: Kernel Density Estimator of log wait time by ambulance arrival.

There were a number of variables that were found to correlate with waiting time. Of these, the first is an indicator variable called AMBULANCE, which indicates that the patient was known to have arrived at the ED by ambulance. In Figure 7, we show the distributions of log wait time for the two values of the ambulance variable. This graph was produced with PROC KDE using the same steps as Figure 5. The graph shows two very similar distributions, but the distribution for those who came by ambulance is shifted to the left and has a much higher density near the value zero. This means both that patients who arrived by ambulance have a shorter waiting time on average and that a higher percentage of patients who arrive by ambulance have a waiting time of zero. Additionally, the coefficient for the ambulance variable is negative, indicating that arriving by ambulance is associated with a shorter waiting time. One would expect that people who need to be taken to the ED by ambulance often have more serious needs and should be seen before those who arrived under their own power or with the help of friends or family.

The next variable we will discuss is TIME2. This variable was calculated by taking the hour of arrival in military time and adding the minutes divided by 60, so that the minutes become the decimal part of TIME2. The relationship between time of arrival and length of wait can be seen in the scatter plot of the two variables shown in Figure 8. This graph was created using Enterprise Guide's scatter plot functionality. On this scatter plot, we used the "Interpolations" option to add a quadratic regression line with its corresponding 95% confidence interval. From this graph, we can see that waiting time increases slightly as the day wears on. In the early morning, from about 1 AM to 8 AM, there are very few people visiting the emergency department, as few people are active during this time of day.
With far fewer visits during the early morning hours, there are fewer people in the ED competing for the same staff. However, as the day goes on, the number of visits to the ED increases, which lengthens the lines and the waits of the patients who follow. Patients can expect longer waits if they visit the ED later in the day.
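The TIME2 calculation described above can be sketched in a DATA step. The paper does not show this code; the sketch assumes the arrival time is stored in a numeric variable in HHMM form, and the names ARRTIME, HOUR, MINUTE and WORK.ARRIVAL_TIME are hypothetical.

```sas
/* Sketch only: ARRTIME is an assumed numeric HHMM variable, e.g. 1430 for 2:30 PM. */
DATA WORK.ARRIVAL_TIME;
    SET SASUSER.QED2006;
    HOUR   = FLOOR(ARRTIME / 100);   /* 1430 -> 14 */
    MINUTE = MOD(ARRTIME, 100);      /* 1430 -> 30 */
    /* Minutes become the decimal part of the hour: 14 + 30/60 = 14.5 */
    TIME2  = HOUR + MINUTE / 60;
RUN;
```

Expressing the arrival time as a single decimal number lets it enter the regression as one continuous predictor rather than as separate hour and minute terms.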

Figure 8: Scatter plot of time of arrival and log waiting time.

We now turn our attention to the variable IMMEDIACY. This variable refers to the prioritization system used in many hospitals today, known as triage. In our data, there were 5 main groups to which an incoming patient could be assigned: “Immediate”, “1 – 14 min”, “15 – 60 min”, “1 hr – 2 hrs”, and “2 hrs – 24 hrs”. These categories refer to how long a patient may be able to wait to be seen. Some patients require immediate attention due to a life-threatening condition, while others may visit the ED to take advantage of medical attention that does not require health insurance or to obtain service during odd hours. For example, there were 361 patients of the 35,849 in the original data sample who were not charged for the medical service provided. Additionally, there was a slight increase in patients during the hours following normal doctor's office hours.

In Figure 9, we can see the distribution of waiting times according to the priorities listed above. The figure shows that the distribution for the “Immediate” category has more density near the values 0 and 1 (0 minutes and 2 minutes) than any other category. These patients are generally seen more quickly than any other group. The next group in priority, “1 – 14 min”, peaks around the value 2.5, or around 10 minutes. This peak comes much sooner than the peaks of the lower priority categories. The distributions generally show that higher priority results in a shorter waiting time, making this variable valuable for prediction. However, what is surprising is that there is large overlap among all of the groups. The “Immediate” group still includes patients who have long waits. For example, the longest known waiting time for patients put in the “Immediate” category was 15 hrs. 25 min. There were 165 cases where patients who were put in the “Immediate” category were seen by a physician after an hour or more.
On closer look, none of the patients who were put in the “Immediate” category and seen by a physician an hour or more later died. There were 62 cases in this data set of patients either dead on arrival (DOA) or dying in the ED. Of these 62 cases, 48 had known waiting times, and of these 48 patients who died, only 2 had waiting times over 16 minutes. Those waiting times were 2 hrs. 35 min. and 1 hr. 13 min. It appears that a patient with a severe threat of death is very rarely made to wait for an extended period of time in the ED.

Before IMMEDIACY could be used in the regression model, it was necessary to combine the categories “Unknown” and “No triage” with the most similar category in the data as an imputation technique. The group “15 – 60 min” was chosen, being the middle group and the one whose distribution was visually most similar to each of these groups.
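The recoding of “Unknown” and “No triage” into the “15 – 60 min” group could be done in a DATA step along these lines. This is a sketch under stated assumptions, not the author's code: it assumes IMMEDIACY is a character variable holding the category labels exactly as written in the tables, and IMMEDIACY_GRP and WORK.QED_TRIAGE are hypothetical names.

```sas
/* Sketch only: assumes IMMEDIACY is character with these exact labels. */
DATA WORK.QED_TRIAGE;
    SET SASUSER.QED2006;
    LENGTH IMMEDIACY_GRP $ 16;
    /* Fold the two uninformative categories into the middle triage group */
    IF IMMEDIACY IN ('Unknown', 'No triage') THEN IMMEDIACY_GRP = '15 - 60 min';
    ELSE IMMEDIACY_GRP = IMMEDIACY;
RUN;
```

Writing the result to a new variable rather than overwriting IMMEDIACY keeps the original categories available for checks such as comparing the distributions before and after recoding.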

Figure 9: Log waiting time according to priority.
Figure 10: Kernel Density Estimator for log wait time after subtracting out the part of the model that uses all other variables.

In Figure 10, we see a very different kernel density estimator graph from that pictured in Figure 9. For Figure 10, we took the log wait time for each patient and subtracted out our regression's prediction, except for the contribution to the model from triage. We did this using PROC SQL as shown in Figure 11. We then plotted the distribution of the resulting log wait time. The new distribution plot shows us that the model accounts for much of the variability shown in Figure 9. However, there is still a large difference in the log wait time for the “Immediate” group compared to the other groups. The averages for the other groups are all slightly different, with the exception of “1 hr – 2 hrs” and “2 hrs – 24 hrs”, which are even more similar than before. There appears to be very little difference between these last two groups. The model's coefficients even put a slightly lower log wait time on the “2 hrs – 24 hrs” group than on the “1 hr – 2 hrs” group: the lowest priority group has a coefficient of 1.121 versus 1.139. The difference is slight, but surprising, since it reverses the order of these two priorities in terms of actual wait time.
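One way to read the two coefficients quoted above is to convert them back from the log scale. The derivation below is an illustration added here, not part of the original analysis; W denotes waiting time in minutes, the subscript g denotes a triage group, and "Immediate" is taken as the reference group with coefficient zero, as in Table 2.

```latex
\[
\log(W_g + 1) - \log(W_{\text{Immediate}} + 1) = \beta_g
\quad\Longrightarrow\quad
\frac{W_g + 1}{W_{\text{Immediate}} + 1} = e^{\beta_g}
\]
\[
e^{1.139} \approx 3.12 \quad (\text{1 hr -- 2 hrs}),
\qquad
e^{1.121} \approx 3.07 \quad (\text{2 hrs -- 24 hrs}),
\qquad
e^{1.139 - 1.121} = e^{0.018} \approx 1.018
\]
```

Under this reading, with all other predictors held equal, both groups have a predicted waiting time plus one minute that is roughly three times that of the "Immediate" group, and the two groups differ from each other by only about 2%.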

Figure 11: PROC SQL code used to produce data behind Figure 10.

    PROC SQL;
    CREATE TABLE SASUSER.TRAIN_PLUS_MODEL AS
    SELECT LOG_WAITTIME - (-0.585317570 - 0.447478461*AMBULANCE + 0.873940524*HOSP_AVG + 0.011577446*TIME2) AS WAIT_FOR_IMMEDIACY
    FROM SASUSER.TRAIN_WITH_AVG;
    QUIT;

Figure 12: Average sampled wait time per hospital versus wait time.
Figure 13: Box plot demonstrating variation between hospital wait times with whiskers extending to minimum and maximum values.

The most useful of the variables for predicting waiting time is the previous waiting times for a given ED. There were 360 different hospital EDs in the data. Each patient record had a hospital ID associated with it. The patients' log waiting times that are predicted were not used to calculate the sampled average for that hospital. One can see a scatter plot of log waiting times versus sampled average hospital log waiting time in Figure 12. Additionally, we have plotted a linear regression line using only the hospital average waiting time, with its corresponding 95% confidence interval. It is apparent from this graph that a linear relationship exists between hospitals' previous performance and current performance, as would be expected.

Figure 12 also indicates that there is a great deal of variation between hospitals. This is more clearly shown in the box plot in Figure 13, drawn using Enterprise Guide. Here, we have the top 5 and bottom 5 hospitals based on their average waiting time. We can see that the top 5 hospitals in the survey had very little variance compared to the bottom 5. For example, the top hospital had 111 patients recorded in the data. The average wait time for the top hospital was 6.6 minutes, with a maximum recorded waiting time of 93 minutes. Now compare this to the bottom-ranked hospital, which had 87 patients recorded with known waiting times. Of these patients, the average waiting time was 9 hrs. 21 min., the minimum was 13 min. and the maximum was 23 hrs. 50 min. The shortest recorded waiting time at the bottom-ranked hospital was nearly twice the average waiting time at the best hospital.

DISCUSSION
In building a regression model in SAS, we learned that arriving by ambulance, time of arrival, previous waiting times in the ED and the immediacy assigned to a patient are all correlated with a patient's actual waiting time. The variable most strongly correlated is the previous waiting times in a given ED. Clearly, not all EDs are able to achieve short waiting times. Those EDs with the worst waiting times need a much closer analysis to determine the root causes of such long waits (more than 23 hrs. in the worst cases). It may help to revisit the reasons why some 381 EDs were closed from 1995 to 2005. Additional analysis may also provide hints as to which geographical regions are most in need of additional emergency departments.

Another improvement that could be made would be to adjust when doctors are scheduled to be in EDs so that the number of doctors increases throughout the day.
This study does not provide clues on how to decrease the number of visits to EDs across the country, but we can see that waiting time increases gradually through each day. With improved scheduling, the curve in Figure 8 could potentially be flattened to remove the correlation between time of day and waiting time.

This survey suggests that current prioritization methods are effective for those patients closest to death. Patients who died before arriving at an ED, in an ED or in the hospital after visiting an ED rarely had to wait very long. In terms of treating patients most at risk of dying quickly, the current emergency department protocols are meeting needs. However, given that there are individuals waiting nearly 24 hours, major improvements still need to be made in emergency care.

REFERENCES
[1] Centers for Disease Control and Prevention. "Ambulatory Health Care Data," /ahcd1.htm.
[2] E. W. Nawar, R. W. Niska, and J. Xu, "National Hospital Ambulatory Medical Care Survey: 2005 emergency department summary," Adv Data, no. 386, pp. 1-32, Jun 29, 2007.

CONTACT INFORMATION
Contact the author at:
Brent Wenerstrom
University of Louisville
J.B. Speed Building – Room LL14
Louisville, KY
[email protected]
