You are measuring the wrong things.
It’s time to rethink the approach to speech analytics
Five years ago, one of the UK’s largest retailers rolled out a new way of measuring performance for its 1,400 contact centre agents. A speech analytics system, like a giant corporate version of Alexa, would listen to every call centre conversation with customers. It measured what was said and presented the results in a shiny dashboard for all managers and staff to see.
“Your calls may be monitored for training and monitoring purposes…”
Dazzled by the dashboard
Pretty impressive, right? It looks damn good and there was buy-in right from the top. Perhaps understandably, as the investment in licences stood at well over £1.5m. The speech analytics vendor also made very compelling claims to justify the expenditure. Customer experience (CX) scores were projected to “vastly increase”; all quality assurance (QA) call-listening functions could be automated (a £2.5m OpEx saving over five years); and ‘morale’ would improve too, as performance would be clearly benchmarked and understood.
The outcome? Three years after implementation, the system was deactivated.
So what happened? Well, performance in the call centres actually got worse. More customers kept calling back, with 14% of all customers calling back within the hour. As the retailer now offered a whole host of contact options, one director suggested the figure could be as high as 20% repeat contact across all channels (Twitter, Facebook Messenger, in-app messages, etc.). When you handle 8 million phone calls per year and 20% of your customers contact you again within the hour, you know you have a problem to fix.
So what went wrong?
As a conversation scientist, I’m unapologetically going to go deep here and dive into the minutiae. There are some “geek-out” sections that you can skip over if needed, but there’s no exec summary. The intent here, and for the article itself, is to give some hints and tips on getting the source coding right, and to explain why out-of-the-box speech analytics isn’t sufficient to deliver results.
Let’s start by looking at what was being measured. There were three ‘human’ speech categories set up as defaults.
Speech category 1. “I use positive language”
In itself, a speech category of ‘positive language’ seems like common sense. Perhaps it’s a little too reminiscent of the 1990s “smile when you dial” mantra. But what actually is being measured?
The system monitors the frequency of terms such as “I can do that”, “I’ll do that”, “Not a problem”, or the customer service classic, “Absolutely!”
These aren’t quite the archaic “delight your customer” concepts you might expect. But having personally analysed thousands of hours of contact centre conversations for companies like Thames Water, Vodafone, BT, William Hill and Anglian Water, I can tell you these are not phrases that correlate with customer satisfaction (C-Sat or NPS, depending on what’s being used).
Acknowledging the customer’s issue, demonstrating ownership and accountability, taking responsibility and offering reassurance are the behaviours that drive high CX scores and, so far, these are not being measured by the analytics tool. To improve the coding above, I would strongly recommend measuring and adding phrases like the following (a rough sketch of how such a category could be coded appears after the list):
• I totally understand your frustration
• I acknowledge your concern
• I will ensure…
• I will make sure…
• I will own this (for you)
• What I’m going to do is…
• What I will do is…
• Don’t worry
• I’m on this
• I’ll sort this
• What is going to happen is…
• Somebody is going to call you (within 10 minutes)
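To make that concrete, here’s a minimal sketch in Python of scoring a transcript against an “ownership and reassurance” phrase category. The category name, phrase list and function are my own illustrative assumptions, not the retailer’s actual queries or any vendor’s syntax.

```python
# Minimal sketch: scoring a transcript against an "ownership and reassurance"
# phrase category. The category name and phrase list are illustrative only,
# not any vendor's query syntax.

OWNERSHIP_PHRASES = [
    "i totally understand your frustration",
    "i acknowledge your concern",
    "i will ensure",
    "i will make sure",
    "i will own this",
    "what i'm going to do is",
    "what i will do is",
    "don't worry",
    "i'm on this",
    "i'll sort this",
    "what is going to happen is",
    "somebody is going to call you",
]

def ownership_hits(transcript: str) -> list[str]:
    """Return the ownership/reassurance phrases present in a call transcript."""
    text = transcript.lower()
    return [phrase for phrase in OWNERSHIP_PHRASES if phrase in text]

# Example usage
call = "I totally understand your frustration. What I'm going to do is raise this now."
print(ownership_hits(call))
# -> ['i totally understand your frustration', "what i'm going to do is"]
```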
Speech category 2. “I’m confident in what I say”
To use an Americanism, being confident in what you say appears to be a ‘no-brainer’ for delivering great customer service. However, this isn’t what the tool measures at all. You see, it’s very hard to measure ‘confidence’ per se, so the tool uses an inverse measure and lists a whole bunch of words that you must not use; otherwise you will be penalised and your shiny dashboard will show up red.
To be clear, if you use these phrases, you will be penalised… “Nothing I can do” / “Nothing we can do.”
These are cries of a disengaged, call-centre generation. How many times have you heard, “Sorry, but there’s nothing we can do.”?
It’s infuriating, right? It’s something we term ‘agent intransigence’. And from our work with big corporates, it is highly correlated with low C-Sat scores from the customer. In fact, we did some deep-dive conversation intelligence work with one financial services firm and found that the presence of agent intransigence was the single biggest predictor of escalated complaint calls. And this particular firm suffered 30,000 FCA-listed complaints per year.
However, other words being assessed to measure a lack of ‘agent confidence’ are merely colloquial or idiosyncratic verbal tics: something learnt from the environment you grew up in. “To be fair” and “To be clear” are found frequently in conversations from Glasgow and Liverpool contact centres. “To be honest” is found all over the UK, but disproportionately often in the North East, especially Sunderland and Darlington.
Now, remember what I mentioned earlier about this being an inverse measure: if you use these words, words you grew up using, you will be penalised. Given I hail from the North East, I’d like to be fair, clear and honest and state that it’s a terrible measure to give agents a negative score simply because these words are present. If these are currently built into your speech analytics installation, you are doing your employees a terrible disservice. Worse still, there’s an inherent provincial xenophobia built in that penalises Glaswegian and Liverpudlian agents over those from other parts of the UK.
Amusingly (though not used at this retailer), more sophisticated systems can measure the tone, pitch and emotion of the agent’s side of the conversation, and inadvertently penalise any agent from Belfast as “ALWAYS SHOUTING”!
What needs tweaking and changing? The one behaviour observed across all of our conversation intelligence work that correlates with extreme customer dissatisfaction is ‘agent vagueness’. A lack of specificity on timescales from agents, in both the UK’s biggest utility company and the UK’s largest telco, is statistically the single biggest predictor of customer dissatisfaction. When agents are vague with their responses, customers do not feel confident. So to improve the coding above, I would strongly recommend the (inverse) measurement of phrases like the following (a sketch of the inverse scoring appears after the list):
• It should…(arrive)
• It might …(arrive)
• Somebody will contact you soon
• Somebody should be in touch
• this should
• well maybe
• hopefully
• sometime
• sometime later
• sometime today
• probably
• possibly
• any minute now
Again, this is an inverse measure, so we are not advocating the use of these phrases. We are saying that if you truly want to measure a lack of agent confidence, the absence of these phrases will help you correctly codify confident customer conversations.
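To illustrate the inverse logic, here’s a rough sketch of how vagueness could be penalised rather than rewarded: the score starts perfect and each vague phrase found drags it down. The phrase list, penalty weight and scoring scale are assumptions for illustration only, not vendor defaults.

```python
# Minimal sketch of an inverse measure: the score starts perfect and each
# vague phrase found drags it down. Phrases, penalty and scale are illustrative.

VAGUE_PHRASES = [
    "it should", "it might", "somebody will contact you soon",
    "somebody should be in touch", "this should", "well maybe",
    "hopefully", "sometime", "probably", "possibly", "any minute now",
]

def confidence_score(transcript: str, penalty: float = 0.5, max_score: float = 3.0) -> float:
    """Deduct a penalty from a perfect score for every vague phrase found."""
    text = transcript.lower()
    hits = sum(1 for phrase in VAGUE_PHRASES if phrase in text)
    return max(0.0, max_score - penalty * hits)

# Example usage: three vague phrases pull the score from 3.0 down to 1.5.
print(confidence_score("It should arrive sometime, hopefully this week."))  # 1.5
```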
Bonus geek-out section…
Now, for those of you who are true data scientists and are thinking, “Coding? Coding? We don’t even code queries at all…”, don’t worry: I will come on to the use of machine learning and predictive speech analytics in another article. But right now, I’m focused on coding the right queries, to ensure that your speech intelligence is built on the right conversational foundations and not on spurious 1990s L&D jargon or xenophobic content that penalises certain agents.
Even though 80% of large organisations don’t have a speech analytics system in place, the big “on-premise” installations from companies like Verint rely on hand-written queries for behaviour tracking, and from what I’ve seen, none of them are measuring the right behaviours.
Speech category 3. “I close my calls appropriately”
Oh boy. This is the category where the coding is utterly broken and has zero correlation with customer satisfaction. For years and years, agents have been trained to ask, “Is there anything else I can help with?” at the end of their calls.
Do you know the correlation between asking this and the customer satisfaction score? Absolutely zero.
Trite in — trite out
I’m steadfastly unapologetic here. Just because you can measure something, it doesn’t mean you should. The speech analytics coding is designed to look for the presence of these words and give agents a positive score if they ask, “Anything else I can help with?” It’s a trite behaviour that has no bearing on the conversation.
I discussed this with a professor of conversation science who explained,
“It’s a token question designed to get a no… and questions with an “any” in them always get a no.” She then explained, “‘Is there something else’ is a better one to ask.”
Given that a failure to provide timescales correlates significantly with extreme customer dissatisfaction, what’s needed here is a positive score for agents who clarify and give time-bound specifics on when things will happen. To improve the source coding above, I would strongly recommend measuring and adding phrases like the following (a sketch of how time-bound specifics could be detected appears after the list):
• The next steps are…
• What will now happen is…
• What is going to happen is…
• So to summarise…
• In summary…
• To summarise…
• The next thing will be…
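Summary phrases are easy to match, but time-bound specifics can also be caught with simple patterns. Here’s a sketch using a few illustrative regular expressions to reward closings that commit to a concrete timescale; the patterns are assumptions and far from exhaustive.

```python
import re

# Minimal sketch: reward call closings that commit to a concrete timescale.
# The patterns are illustrative assumptions, not an exhaustive set.

TIMEBOUND_PATTERNS = [
    r"\bwithin (\d+|a|an|one|two|three) (minute|hour|day|working day|week)s?\b",
    r"\bby (the end of (today|the day|the week)|tomorrow|monday|tuesday|wednesday|thursday|friday)\b",
    r"\bin the next (\d+) (minutes|hours|days)\b",
]

def gives_timescale(transcript: str) -> bool:
    """True if the agent commits to a specific, time-bound next step."""
    text = transcript.lower()
    return any(re.search(pattern, text) for pattern in TIMEBOUND_PATTERNS)

# Example usage
print(gives_timescale("The next steps are: an engineer will call you within 2 hours."))  # True
print(gives_timescale("Somebody should be in touch soon."))                              # False
```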
A civil war
We took 33,000 agent conversation scores (0 to 3, from the speech analytics system) and correlated them with the relevant customer satisfaction scores (C-Sat, 1 to 5). You don’t need to be a statistician to glean that there is zero correlation between what’s being measured and the customer satisfaction score.
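If you want to run the same sanity check on your own data, the calculation is straightforward. Here’s a sketch using pandas and SciPy, assuming a hypothetical CSV with one row per call; rank correlation suits these ordinal scales.

```python
# Minimal sketch: correlate the speech analytics score (0-3) with C-Sat (1-5).
# The file and column names are hypothetical; rank correlation is used
# because both scores are ordinal.

import pandas as pd
from scipy.stats import spearmanr

calls = pd.read_csv("agent_call_scores.csv")   # hypothetical: one row per call

rho, p_value = spearmanr(calls["speech_score"], calls["csat"])
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
# A rho near zero means the scored behaviours tell you nothing about satisfaction.
```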
If you are on board with the thinking so far, what we have here is a failure to communicate. Actually, I’ve slipped back into quoting Guns N’ Roses. What we have here is a failure to correlate what is being measured with customer satisfaction, let alone with the conversations that prevent repeat contact (remember that 20% of all customers get back in touch within the hour).
What’s even worse is that the leading FTSE 100 retailer then started creating league tables amongst the 34 line managers, to determine which agents were demonstrating the badly coded behaviours. In perhaps the most ridiculously redacted slide ever, you can just about see that a league table was created for the team that most consistently closed their calls with the trite statement, “Anything else I can help you with?”
Making AI human
Today, tech boffins, and not conversation scientists, appear to be setting up most speech analytics installations. If you are in the market for a new system, we can probably help guide you past the many pitfalls we have seen. You don’t need to invest in a multi-million-pound licence. We can give you absolute clarity on the behaviours driving CX and sales from a diagnostic involving c.20,000 calls. It helps if you have them tagged as NPS detractors/promoters.
We advised O2 Telefónica on how to maximise their Verint system investment and are on board with all tech providers. We are solution-agnostic too, so if you’d like to deliver a return from your on-premise speech analytics system, we can help you realise that. To be clear, we are not trying to usurp your existing provider or undermine your investments. Pairing your existing technology investment in Verint, Nexidia or CallMiner with the best brains in conversation science is going to lead to a better outcome and a better return on your investment.
Our goal is to spread the science of conversational intelligence and ensure leading organisations aren’t wasting money, time and effort on something that isn’t set up correctly.
Thank you for reading this impassioned opinion piece! If you’d like to make contact, you can do so here (I believe) or through our LinkedIn page.