Only weenie analysts trust ticket meta data

 3 years ago
source link: http://rachelbythebay.com/w/2011/12/01/timeworked/

I was a support ticket and phone monkey for a couple of years, and I have quite a few things to say about ticketing systems and extracting data from them. I ran into a good number of pointy-haired analyst types who thought that just because a field existed in the system, it gave meaningful data. They were completely disconnected from reality.

In one system I used, there was a part in the "add comment to a ticket" workflow which let you also "log work". You had a number of choices: you could skip logging work, you could call it "free troubleshooting", or any number of fine-grained things which someone had added along the way. Some of them had dollar signs attached which meant they actually charged the customer for the work.

The important thing here is that almost every customer had an unlimited amount of ticket and phone support. That meant "free troubleshooting" was almost always the correct answer. If you were just throwing a ticket over the wall to another queue with a comment, "skip logging work" seemed fine since you didn't really do any work!

The only thing worse than no data is bad data, and this particular system accumulated tons of bad data. People would log work of 5 minutes just as a matter of habit. Even if they were just pushing a ticket somewhere else and adding a meaningless comment, they'd log "5 minutes". You could look at the audit log and see they only had the ticket assigned to them for a minute at the most, but still it somehow added up to 5 minutes in their mind.
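That audit log check is easy to automate. Here is a minimal sketch, assuming each work log entry can be joined to the time the ticket was assigned to and released by that person. The field names (`assigned_at`, `released_at`, `logged_minutes`) and the one-minute slack are made up for illustration and do not come from any particular ticketing system.

```python
from datetime import datetime, timedelta

def suspicious_work_logs(entries, slack=timedelta(minutes=1)):
    """Return entries where the logged time exceeds the time the ticket was held."""
    flagged = []
    for e in entries:
        held = e["released_at"] - e["assigned_at"]
        logged = timedelta(minutes=e["logged_minutes"])
        # Allow a little slack so rounding up a few seconds is not flagged.
        if logged > held + slack:
            flagged.append(e)
    return flagged

# Hypothetical sample data: one reflexive "5 minutes", one plausible entry.
entries = [
    {"tech": "alice",
     "assigned_at": datetime(2011, 12, 1, 9, 0, 0),
     "released_at": datetime(2011, 12, 1, 9, 0, 40),
     "logged_minutes": 5},
    {"tech": "bob",
     "assigned_at": datetime(2011, 12, 1, 9, 5, 0),
     "released_at": datetime(2011, 12, 1, 9, 25, 0),
     "logged_minutes": 15},
]

for e in suspicious_work_logs(entries):
    print(e["tech"], e["logged_minutes"])
```

If a large share of entries get flagged this way, the "work logged" field is measuring habit, not effort.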

Here's what happened. The bean counters got it in their heads that you could take the "work logged" data and cross it with the "ticket category" data and figure out what takes the most effort. In theory, that would work. In theory, we'd also have magical unicorns in every home as hat racks. In practice, it fell flat.

First, ticket categories are garbage. Tickets tend to find a way to not fit into any pre-existing category while also refusing to limit themselves to one category at a time. Consider this situation:

  • Service goes down -- monitoring alert, support team
  • Investigate at console -- troubleshoot, data center
  • Machine is placed on KVM -- troubleshoot, support team
  • Hard drive has failed, replace -- hardware, data center
  • Reinstall base system on new disk -- software, data center
  • Restore from last backup -- restoration, backup team
  • Verify site and make adjustments -- troubleshoot, support team

All of that could, did, and probably still does happen in one ticket. Which category is it? None of them and all of them.

If you really wanted to capture this data, you'd turn it around and make it happen automatically as the various people did their jobs. When the inventory team checks out a new drive for that box (and checks in the old one), have the inventory system add it to the ticket. When the reinstall/kickstart runs, have it add that to the ticket. The restore system should also add something to the ticket.

Right there, you could tell that all of these things happened. By looking at the time stamps, you could figure out how long it took. If you see multiple events, you can also use that as a signal that the system is broken somehow, since the humans involved had to kick it off more than once. If you have a wonky restore system, this might catch it.
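As a rough sketch of that idea, assume each automated system appends a `(timestamp, kind)` event to the ticket. The event names below are hypothetical. The function reports how long the whole thing took and which kinds of event showed up more than once.

```python
from collections import Counter
from datetime import datetime

def summarize_events(events):
    """events: list of (timestamp, kind) tuples recorded on one ticket."""
    ordered = sorted(events)
    # Elapsed time from the first automated event to the last one.
    elapsed = ordered[-1][0] - ordered[0][0] if ordered else None
    counts = Counter(kind for _, kind in ordered)
    # Anything that had to be kicked off more than once is a warning sign.
    repeated = sorted(kind for kind, n in counts.items() if n > 1)
    return elapsed, repeated

# Hypothetical ticket: the restore had to be started twice.
events = [
    (datetime(2011, 12, 1, 10, 0), "drive_checked_out"),
    (datetime(2011, 12, 1, 10, 30), "kickstart_started"),
    (datetime(2011, 12, 1, 11, 15), "restore_started"),
    (datetime(2011, 12, 1, 12, 0), "restore_started"),
    (datetime(2011, 12, 1, 13, 0), "restore_finished"),
]

elapsed, repeated = summarize_events(events)
print(elapsed, repeated)
```

Nobody has to pick a category or guess at minutes here. The data falls out of the work itself.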

Forget about categories. If you want to know what's going on in your tickets, analyze the text. People are going to write in and say that something is wrong with X, or Y, or Z. Take the contents of those tickets, split them into words, squash them to lower case, throw away the filler words (the, a, is, and so on), and count how often each remaining word appears. Sort by frequency and you should wind up with a list of terms which keep coming up. Those are the things which are keeping your techs busy.
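A bare-bones version of that word count might look like this. It is only a sketch: the filler-word list is a tiny made-up sample, the tickets are invented, and there is no stemming, so "restore" and "restored" would count as different terms.

```python
import re
from collections import Counter

# A deliberately tiny filler-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "is", "my", "to", "and", "of", "on", "it", "i", "not"}

def top_terms(ticket_texts, n=5):
    """Return the n most frequent non-filler words across all ticket texts."""
    counts = Counter()
    for text in ticket_texts:
        # Lower-case first, then pull out runs of letters, digits and apostrophes.
        words = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Invented ticket text for illustration.
tickets = [
    "My backup failed again",
    "The backup is not running on the mail server",
    "Mail is down",
]

for term, count in top_terms(tickets, 2):
    print(term, count)
```

Even on three fake tickets, "backup" and "mail" float to the top, which is more than a category dropdown would have told you.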

Of course, that kind of analysis takes real work. You can't just write some ridiculous SQL join and call it done. You have to actually deal with text, words, letters, stemming, and all of that business. No wonder mere weenie analysts can't figure it out.

Here's my final complaint about this kind of analysis. Most of the time, it seems these people want numbers to make graphs so they look useful. They show these graphs to various middle manager types, and everyone just looks and murmurs. Someone says "wow", someone else says "very nice", and then they move on.

Nothing changes. It's just a big show to make it look like they know something about the actual business.

