WFM Q&A: Data Cleaning

Illustration by David Grey for Pipeline

My friend Bruce, whom I met at the SWPP conference earlier this year, recently contacted me with a list of questions about cleaning the raw data used in forecasting. This is a transcript of that interview, which we hope will be useful to others who are just starting to set their own rules for cleaning ACD data.

BRUCE: Do you clean the data if there is a known cause for the aberration?
TIFF: Yes, especially when there is a known cause, and even more so if that cause is a one-time event that is not going to repeat. I always keep my original data intact and store the cleaned data separately (in my case, in a new Excel column).
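As a rough illustration of keeping the original data intact, the Python sketch below writes cleaned values into a separate field and never overwrites the raw one. The field names (raw_volume, cleaned_volume), the sample intervals and the replacement value are all hypothetical; the interview does not say how a replacement value is chosen.

```python
def add_cleaned_column(rows, flagged_indexes, replacement):
    """Return copies of the rows with a separate cleaned value.

    The raw value is copied through untouched; only the new
    'cleaned_volume' field differs for flagged rows.
    """
    cleaned_rows = []
    for i, row in enumerate(rows):
        new_row = dict(row)  # copy, so the input rows are never modified
        if i in flagged_indexes:
            new_row["cleaned_volume"] = replacement
        else:
            new_row["cleaned_volume"] = row["raw_volume"]
        cleaned_rows.append(new_row)
    return cleaned_rows


# Hypothetical half-hour intervals; the 09:30 spike has a known one-time cause.
rows = [
    {"interval": "09:00", "raw_volume": 120},
    {"interval": "09:30", "raw_volume": 480},
    {"interval": "10:00", "raw_volume": 125},
]
# The replacement value (122) is an assumption made only for this example.
cleaned = add_cleaned_column(rows, flagged_indexes={1}, replacement=122)
```

The raw figure for 09:30 is still 480 in both the input and the output; only the cleaned field carries the adjusted number.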

BRUCE: …or just whenever things vary greatly?
TIFF: Sometimes, but not always. I want a certain amount of normal volatility to remain in my controlling patterns. Because I use multiple weeks of history in my time-of-day and day-of-week patterns, a single spike is diluted by the surrounding weeks, so the patterns are fairly forgiving of it, even with weighted moving averages.
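To see why several weeks of history soften a single spike, here is a minimal Python sketch of a weighted moving average for one interval. The four-week window, the weights and the volumes are assumptions chosen for illustration, not figures from the interview.

```python
def weighted_moving_average(values, weights):
    """Weighted average of same-interval volumes, oldest week first."""
    if len(values) != len(weights):
        raise ValueError("values and weights must be the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical Monday 10:00 volumes for the last four weeks; week 3 spiked.
history = [100, 104, 160, 98]
# Assumed weights: the most recent week counts the most.
weights = [1, 2, 3, 4]

blended = weighted_moving_average(history, weights)  # 118.0
```

The spike of 160 moves the blended value to 118, well short of the spike itself, which is the forgiveness that multiple weeks of history provide.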

BRUCE: Is there a certain percentage from normal you should use as a gauge?
TIFF: When cleaning answered and overflow/dequeued volumes, I use my forecast potential metric to tell me whether something should raise an alert or be left alone. Forecast potential is the output metric from calculating MAPE (mean absolute percentage error).
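For readers new to MAPE, the following Python sketch shows the standard calculation. The interview does not spell out how forecast potential is derived from MAPE, so the sketch stops at MAPE itself; the function name and the sample numbers are hypothetical, and skipping intervals with zero actual volume is an assumption.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent.

    Intervals with zero actual volume are skipped (an assumption),
    because the percentage error is undefined for them.
    """
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must be the same length")
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("no intervals with non-zero actual volume")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Hypothetical interval volumes.
actual_volumes = [100, 200, 50]
forecast_volumes = [110, 180, 50]
example_mape = mape(actual_volumes, forecast_volumes)  # about 6.67 percent
```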

When I’m cleaning abandons, I use the abandonment goal as the threshold for what I want to clean. So if the abandonment goal is 5% or less, I clean abandons over 5%.
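The abandonment rule can be sketched in Python as follows. The sketch assumes the abandon rate is abandoned calls divided by offered calls, which the interview does not define, and the day labels and counts are made up.

```python
def days_to_clean(daily_counts, goal_pct=5.0):
    """Return the days whose abandon rate is over the goal.

    daily_counts maps a day label to (offered, abandoned).
    Abandon rate is assumed to be abandoned / offered.
    """
    flagged = []
    for day, (offered, abandoned) in daily_counts.items():
        if offered == 0:
            continue  # no calls offered, so no rate to evaluate
        rate_pct = 100.0 * abandoned / offered
        if rate_pct > goal_pct:
            flagged.append(day)
    return flagged


# Hypothetical daily totals: (offered, abandoned).
daily_counts = {
    "Mon": (1000, 40),  # 4% abandoned, within goal
    "Tue": (1000, 80),  # 8% abandoned, over goal
    "Wed": (900, 45),   # exactly 5%, not over goal
}
flagged_days = days_to_clean(daily_counts)  # ["Tue"]
```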

When cleaning handle times, I look at the four handle time elements separately: ring time, talk time, hold time and wrap-up/ACW time. With ring time, I flag anything over 12 seconds (at 3 seconds per ring, anything over 12 seconds means the phone rang more than four times). Talk time is monitored against the standard deviation around the talk time average (I use 2 standard deviations). Hold time is less scientific, but I start with 2.5 standard deviations. And for wrap-up time I use a more manual approach: 2x the wrap-up expectation set by management. So if they have decided wrap-up time should average 45 seconds, I set a control that red-flags it for cleaning when it is higher than 90 seconds.

All of these cleaning approaches are my starting points; I usually still end up fine-tuning them even more once I get going with a new forecast group.

Also, a note about using 2 standard deviations: For some reason, I always seem to land on a sweet spot of 1.75 standard deviations instead of 2. I’m not sure why that is exactly, but my forecasting methods just really like that position and it works out nicely more often than not.
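The handle time starting points described above can be sketched in Python like this. The function names and sample talk times are hypothetical. Using the sample standard deviation and flagging values on either side of the average are assumptions, since the interview does not specify either; the multiplier k is a parameter so that 2, 2.5 or 1.75 can be tried.

```python
from statistics import mean, stdev

RING_LIMIT_SECONDS = 12  # 4 rings at 3 seconds per ring


def flag_ring(ring_seconds):
    """Flag ring time over 12 seconds."""
    return ring_seconds > RING_LIMIT_SECONDS


def flag_wrap(wrap_seconds, expectation_seconds):
    """Flag wrap-up time higher than 2x the management expectation."""
    return wrap_seconds > 2 * expectation_seconds


def flag_by_sigma(values, k):
    """Flag values more than k standard deviations from the average.

    Assumptions: sample standard deviation, and both high and low
    values are flagged.
    """
    if len(values) < 2:
        return [False] * len(values)
    avg = mean(values)
    sd = stdev(values)
    if sd == 0:
        return [False] * len(values)
    return [abs(v - avg) > k * sd for v in values]


# Hypothetical talk times in seconds for ten intervals; the last is an outlier.
talk_times = [300, 310, 295, 305, 290, 300, 315, 298, 302, 900]
talk_flags = flag_by_sigma(talk_times, k=2)     # talk time starting point
hold_style_flags = flag_by_sigma(talk_times, k=2.5)  # hold time starting point
tuned_flags = flag_by_sigma(talk_times, k=1.75)  # the 1.75 sweet spot

wrap_flag = flag_wrap(95, expectation_seconds=45)  # True, since 95 > 90
```

Keeping k as a parameter makes it easy to fine-tune the threshold for each new forecast group without rewriting the check.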

BRUCE: How often should you do this?
TIFF: I do it every time I pull in new history. Some clients send me their data once a week; others send it once a day. Some give us dial-in access and we do it daily. I’d recommend daily if you have the bandwidth, because you’ll react faster to things that have gone wrong that the rest of the call center may not be aware of. If you wait to do it weekly and then try to investigate what was happening last Wednesday at 2 p.m., it may be harder to find people who remember that far back.

BRUCE: Do you adjust data at the daily level or should you be adjusting it at the interval level, as well?
TIFF: Good question. I adjust abandons only at the daily level. And remember, this means I’m adjusting the data I’m using in my patterns; I’m not adjusting the actual raw data, all of which stays intact forever. The answered/overflow/dequeued volumes and all of the handle time metrics are adjusted at the interval level, because those are the metrics that I use when drilling down to time-of-day patterns. (I don’t use abandons in my time-of-day patterns.)

BRUCE: I’m sure there are techniques and procedures that I haven’t even thought of. Can you point me in the direction of any articles, websites or books that might help?
TIFF: Yes, check out my book, Diary of a Workforce Manager. There is a full list of resources in the back, too.