Saturday, May 10, 2014

“Top Box or Top Two Box … Or Other Metrics – That Is the Question.”

Or is it really? In other words, does it really matter which metric we use to measure and report dealer satisfaction with the support they receive from their manufacturer? A debate – if not as old as Shakespeare’s “Hamlet” – at least as old as our surveys themselves. In this blog, we’re going to try to settle that debate.

While we conduct our surveys all around the globe, let’s start closer to home and use the 2013 North American Automotive Parts and Service Manager Surveys for our analysis. For the Parts Manager Survey, 11,700 responses were submitted. The overall average response rate for 19 U.S., 8 Canadian, and 10 Mexican OEMs was 66%. Nearly 9,000 Service Managers from 19 U.S. and 10 Canadian brands responded to the Service Manager Survey for an even higher overall response rate of 76%. Clearly, a very robust foundation. This analysis will look only at 19 U.S. brands and at the scores to the “all-in” question, “How satisfied are you with [OEM]’s total support for your parts/service business?”

In our surveys, we use a five-point satisfaction scale, ranging from “Very Satisfied” and “Somewhat Satisfied” through “Neutral” to “Somewhat Dissatisfied” and “Very Dissatisfied”. Now, we could also write a blog (or several blogs) about the nature of satisfaction scales. Suffice it to say that the academic discussion about which satisfaction scale to use probably predates “Hamlet”, and is equally undecided. We prefer the simple and intuitive five-point scale: our respondents have businesses to run, and we need to make it as easy as possible for them to respond. Needless to say, once a survey is running, it is difficult to change scales, as we would lose historical comparability – in our case, more than ten years.

So, based on the five-point scale, what metrics are available? We will take a closer look at four candidates:
  • Top Box Score: This is calculated by dividing the number of “Very Satisfieds” by the total number of respondents for a particular question: (Very Satisfieds)/(Total number of respondents). This is the “official” metric for our North American Surveys.
  • Top Two Box Score: Similar to “Top Box”, but adds the Second Box (“Somewhat Satisfieds”) to the numerator, so: (Very Satisfieds+Somewhat Satisfieds)/(Total number of respondents). We used this metric until 2012. We will explain why we switched to Top Box in a bit.
  • Average Score: For this calculation, we convert the score labels into numeric values: “Very Satisfieds” = “5”, “Somewhat Satisfieds” = “4” … down to “Very Dissatisfieds” = “1”, and then simply calculate the average. Traditionally, we have not used this metric for reporting scores.
  • Net Promoter Score (NPS): To be clear: this methodology does not really measure satisfaction, but rather the strength of customers’ recommendations and advocacy. NPS typically uses a 0–10 scale, where “9” and “10” are the “Promoters”, “7” and “8” the “Passives”, and the rest are “Detractors”. The score is the percentage of “Promoters” minus the percentage of “Detractors”. Obviously, our five-point scale does not perfectly fit this methodology, but let’s give it a try. To mimic NPS the best we can, we are going to use: (Very Satisfieds - (Somewhat Dissatisfieds + Very Dissatisfieds))/(Total number of respondents).
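As a concrete sketch, the four calculations look like this in Python (the response counts below are made up for illustration, not actual survey data):

```python
# Hypothetical tally of 1,000 responses on the five-point scale.
counts = {
    "Very Satisfied": 500,
    "Somewhat Satisfied": 300,
    "Neutral": 100,
    "Somewhat Dissatisfied": 60,
    "Very Dissatisfied": 40,
}
total = sum(counts.values())  # 1,000 respondents

# Top Box: share of "Very Satisfieds" only.
top_box = counts["Very Satisfied"] / total

# Top Two Box: add the "Somewhat Satisfieds" to the numerator.
top_two_box = (counts["Very Satisfied"] + counts["Somewhat Satisfied"]) / total

# Average: map the labels to 5..1 and take the weighted mean.
values = {
    "Very Satisfied": 5,
    "Somewhat Satisfied": 4,
    "Neutral": 3,
    "Somewhat Dissatisfied": 2,
    "Very Dissatisfied": 1,
}
average = sum(values[label] * n for label, n in counts.items()) / total

# NPS-style: "Very Satisfieds" minus both dissatisfied groups.
nps_like = (counts["Very Satisfied"]
            - (counts["Somewhat Dissatisfied"] + counts["Very Dissatisfied"])) / total

print(top_box, top_two_box, average, nps_like)
# top_box = 0.5, top_two_box = 0.8, average = 4.16, nps_like = 0.4
```

Note how one and the same distribution yields four quite different-looking numbers; which one you report is a choice, not a given.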
After calculating the scores for the “Overall Satisfaction” question, we have a set of four scores for each OEM. What is next?

Obviously, each individual OEM’s absolute scores matter, and it is great if an OEM shows year-over-year improvement. But, clearly, it is NOT so great if others have improved MORE than you have. So, we not only care about absolute performance, but also relative performance – or an OEM’s RANK within the industry. Going back to our original question: Does the scoring methodology we use significantly impact the relative rank of an OEM? Let’s determine the rank of each OEM for each of the four scoring methodologies and look for significant differences in rank across methodologies.
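A sketch of that rank comparison (the OEM labels and scores below are hypothetical, not the survey’s actual results):

```python
# Hypothetical per-OEM scores under each methodology (not actual survey data).
scores = {
    "OEM A": {"top_box": 0.55, "top_two": 0.85, "avg": 4.3, "nps": 0.45},
    "OEM B": {"top_box": 0.40, "top_two": 0.82, "avg": 4.1, "nps": 0.30},
    "OEM C": {"top_box": 0.30, "top_two": 0.70, "avg": 3.8, "nps": 0.10},
}

def ranks(metric):
    # Rank OEMs from best (1) to worst under one methodology.
    ordered = sorted(scores, key=lambda oem: scores[oem][metric], reverse=True)
    return {oem: i + 1 for i, oem in enumerate(ordered)}

all_ranks = {m: ranks(m) for m in ("top_box", "top_two", "avg", "nps")}

# Largest rank difference across methodologies for each OEM.
spread = {
    oem: max(r[oem] for r in all_ranks.values())
         - min(r[oem] for r in all_ranks.values())
    for oem in scores
}
print(spread)
# All zeros here, because these made-up OEMs keep the same order under
# every metric; with real survey data the spreads are mostly 1-2.
```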

For both surveys, the differences in ranks are quite minor across scoring methodologies: “green” OEMs for Top Box are almost always “green” for the other methods. The same is true for “yellow” and “red”.

In most cases, ranks differ by one or two positions; sometimes not at all. Bigger rank differences (more than three positions) tend to occur for OEMs with smaller dealer networks, where scores naturally vary more.

The biggest difference in rank is 6, for OEM 9 in the Service Manager Survey. Relative to the OEMs ranked just below it, this OEM earns a larger share of “Very Satisfieds” but a smaller share of “Very Satisfieds” plus “Somewhat Satisfieds” combined, so it ranks higher under Top Box than under Top Two Box (which is a good thing). It is also the only case where the choice of scoring makes the difference between “mid-pack” and “bottom of the barrel”. In general, the rank differences are greater in the Service Manager Survey, but at an average of 2.1 vs. 1.4 for the Parts Manager Survey, neither survey shows significant shifts in the rankings.

So, is this it: scoring methodologies do not really matter? Not so fast. Consider this picture:

Obviously, there is a big difference here: to the left is “life”, with its constant change and variability. To the right is … well, not sure what there is and can’t really ask anyone. But at a minimum, there is not much to see, unless you like flat lines. So, what’s the picture created by the different scoring methods?

As we would expect in an industry that is mature, but still very much alive, we don’t quite see flat lines. In fact, the picture created by each scoring method is radically different: both Top Box and NPS show a significant difference between the High and the Low; Top Two Box and Average significantly less so. Look closely at the Average score chart: there is little difference among the eight highest-scoring participants – almost half the field – and the result is about as close to a flat line as you can get.
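A small made-up example illustrates the compression: two response distributions with very different “Very Satisfied” shares can produce nearly identical averages (the distributions below are hypothetical):

```python
# Two hypothetical response distributions (shares summing to 1.0),
# ordered Very Satisfied ... Very Dissatisfied.
high = [0.50, 0.30, 0.10, 0.05, 0.05]
low  = [0.30, 0.55, 0.10, 0.03, 0.02]

def top_box(dist):
    # Share of "Very Satisfieds" only.
    return dist[0]

def average(dist):
    # Map the five boxes to the values 5..1 and take the weighted mean.
    return sum(share * value for share, value in zip(dist, (5, 4, 3, 2, 1)))

print(round(top_box(high) - top_box(low), 2))  # 0.2: a 20-point Top Box gap
print(round(average(high) - average(low), 2))  # 0.07: barely visible on a 5-point scale
```

The same pair of OEMs looks clearly separated under Top Box and nearly tied under Average.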

Supply chain folks hate variability (they prefer the flat, steady, predictable lines); research folks love it. Simply put, where there is variability, there is “life”, and the opportunity to learn by figuring out where you stand and where you need to go. Thus, scoring methodologies tend not to significantly affect the ranking of survey participants, but they do affect the score differences between participants: some methodologies make those differences more visible, while others all but obliterate them.

Of course, applications do exist for methods with less variability. For instance, you may want to consider the Average or Top Two Box scoring methods if you tie performance targets and compensation elements to the survey scores. This is why we supply all box-score and raw-data reports; participants can pursue the scoring methodology that best fits the purpose.

However, we prefer the Top Box method and use it as our official survey metric in North America. It acts as a “magnifying” glass to more clearly differentiate satisfaction performance. (As discussed, we don’t use NPS, as it is not really a satisfaction measurement.)

There are good reasons for our choice. When someone asks you “How are you doing?” you will most likely say “Good”, without even thinking about it. Most of us will not say “Very good” unless it has been a REALLY good day for us. Of course, this is what we REALLY want … and what we prefer over yet another boring regular, “good”, “flat line” day.

Take a look at the chart below from our 2013 Consumer Sentiment Survey, directed at vehicle owners. It shows the likelihood that a vehicle owner will return to the same dealer for service. The data is segmented by customers’ satisfaction with their most recent service event: 90% of the “Very Satisfieds” are “Very Likely” to go back to the same dealer vs. 63% of the merely “Satisfieds”. 27 points – quite a difference!

Yet, that difference disappears when you combine, within both the “Very Satisfied” and “Satisfied” bars, those “Very Likely” to return with those “Somewhat Likely”: both combined totals come out at about 90%. You see, Top Two Box masks a large and important 27-point difference – and with it the fact that going from “Good” to “Great” matters!
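The arithmetic behind that masking, using the chart’s figures (the “Somewhat Likely” shares are backed out of the roughly 90% combined totals, so treat them as approximate):

```python
# Repurchase-likelihood shares from the 2013 Consumer Sentiment chart,
# segmented by satisfaction with the most recent service event.
# The "Somewhat Likely" shares are approximations implied by the
# ~90% combined totals reported for both bars.
very_likely = {"Very Satisfied": 0.90, "Satisfied": 0.63}
somewhat_likely = {"Very Satisfied": 0.00, "Satisfied": 0.27}

# Top Box view: a 27-point gap between the two satisfaction levels.
top_box_gap = very_likely["Very Satisfied"] - very_likely["Satisfied"]

# Top Two Box view: combining the bars makes the gap vanish.
combined = {k: very_likely[k] + somewhat_likely[k] for k in very_likely}
top_two_gap = combined["Very Satisfied"] - combined["Satisfied"]

print(round(top_box_gap, 2), round(top_two_gap, 2))  # 0.27 vs 0.0
```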

By the way, dealer personnel behave the same way as their customers do; they want to be “delighted” and not merely be “Satisfied”. Let’s take a look at parts manager purchase loyalty, which can be seen as equivalent to consumers’ “Service Repurchase Likelihood” in the prior chart.

The chart below is based on our 2013 Automotive Parts Manager Survey and cuts parts purchase loyalty by overall satisfaction level. There is a substantial difference between the purchase loyalty of “Very Satisfied” vs. “Satisfied” parts managers – almost 1.5 points. Multiply that by tens or hundreds of millions, or even billions, of dollars in parts sales and you’ll see that this seemingly small difference matters … big time! And, because it does matter, our survey participants have decided to report satisfaction scores as “Top Box” (% “Very Satisfied”) instead of “Top Two Box” (% “Very Satisfied” plus % “Satisfied”).

To summarize: in a mature industry where big mistakes and huge performance gaps tend to be rare, it has become all too easy to say you are “satisfied”. This becomes even more important as OEM participants take action based on the survey results and drive performance levels and satisfaction scores even higher. “Satisfied” becomes the new “Neutral”.

In other words, Top Box appropriately applies a stricter standard, by only looking at the “Very Satisfieds” who, conveniently, are also most likely the strongest promoters. (Note the similarity between the Top Box and NPS ranks in the tables above.) So, here is our answer to the question we posed above:
“Top Box or Top Two Box … or other metrics – that is the question.

We think 'tis nobler in the survey to score

By looking at Very Satisfieds only.”
