How to reduce churn with data science

The answer will surprise you. Over the last few years I have analyzed churn at more than 50 companies in my role as Chief Data Scientist at Zuora*. What I have found is that in order to reduce churn with data science you don’t use the latest AI technology… What works best for reducing churn is a return to a traditional notion of “science”: test some hypotheses about subscribers and churn, then communicate the results to the people who actually prevent churn. Those people are product and content creators, marketers and customer representatives.

To many data scientists this may come as a surprise. That’s because we are in a time when the hype cycle is completely infatuated with neural networks and other black box products and technologies. Being a data scientist is practically synonymous with deploying black box predictive systems. But AI is not advanced to the point where it can perform high value / high risk tasks like calling customers and designing email campaigns. By and large, those jobs still belong to account managers, marketers, and customer support and success representatives.

Read on to learn more about what I have found that really works, or follow me…

*Zuora is a comprehensive subscription management platform and a newly public (NYSE: ZUO) Silicon Valley “unicorn” with more than 1,000 customers selling subscription products worldwide.

1. Predicting versus reducing churn

In this post I’m going to share some bad news for all of my fellow data scientists and analysts out there:

1.1 Predicting churn is hard.

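Usually (hopefully) churn is rare in comparison to continuation of a subscription, so churn is what you call a rare outcome in data science lingo. As a result, false positives will be common even with the best predictive algorithms: when most subscribers are not about to churn, even a small error rate among them produces more false alarms than true ones.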

It’s easy to see why predicting churn is hard and prone to false positives. Consider your own behavior the last time you unsubscribed from something… You probably were not taking full advantage of the subscription for months. But it took you that long to cancel because you were too busy or uncertain. If a churn warning system had been observing your behavior during that time, it would have flagged you as a risk every month… and been wrong every time! Until the right moment to cancel came, that is. But that moment was determined by too many extraneous factors to be predicted with high precision.
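The rare-outcome effect can be worked out with a little arithmetic. The sketch below is only an illustration: the function name and all three input numbers (a 2% monthly churn rate, a model that catches 80% of churners, and a 10% false alarm rate among non-churners) are assumptions made up for the example, not figures from any real company.

```python
def churn_alert_precision(churn_rate, recall, false_positive_rate):
    """Fraction of churn alerts that turn out to be real churners.

    churn_rate: share of subscribers who actually churn in the period
    recall: share of real churners the model flags
    false_positive_rate: share of non-churners the model flags anyway
    """
    true_alerts = churn_rate * recall
    false_alerts = (1 - churn_rate) * false_positive_rate
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical inputs, chosen only to illustrate the base-rate effect.
precision = churn_alert_precision(
    churn_rate=0.02, recall=0.80, false_positive_rate=0.10
)
print(f"Share of alerts that are real churners: {precision:.1%}")
```

Under these assumed numbers only about one alert in seven points at a real churner, even though the model looks respectable on paper. Lowering the churn rate input makes the picture worse, not better.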

1.2 Reducing churn is even harder than predicting churn.

Preventing churn is what’s really hard. If you think about it, every subscription has a cost that must be outweighed by a benefit delivered. If the cost outweighs the benefit, churn is just a matter of time. The cost may be a concrete amount you pay. Or the cost may be the attention cost of a subscription to a free service such as a game or YouTube channel (is it worth the space it takes in your subscription feed?). This means that in order to prevent churn in a long term and reliable way, a company must actually move the needle on benefit delivered, or reduce the cost incurred from using the service. This can be harder than getting people to sign up in the first place, because now they know what the product is actually like…

No Silver Bullets to Reduce Churn

People have often asked me for “silver bullets” to reduce churn, and here’s the bad news: There are no silver bullets to reduce churn. That is, if by a silver bullet you mean a low cost and reliable way that always works. In the words of the great startup CEO and venture capitalist Ben Horowitz, “There are no silver bullets for this, only lead bullets.” Meaning you have to do the hard work of increasing the value you provide to subscribers. Either that or reduce the cost, which is the nuclear option for a paid service. Revenue churn or downsells may be better than complete and total churn, but it’s still churn. (The downsell is a “diamond bullet” against churn: it always works, but you can’t afford it…)

2. Reducing Churn is a human job

There have been remarkable advances in AI and data science in the past years, but AI still can’t do a lot of things (see my post on Lack of trust in AI for more discussion on that thread…). For the most part, actually reducing churn is something that has to be done by people: people who either a) make the product, service or content; or b) interact with customers. The picture below summarizes the various functions that typically work to reduce churn for a product.

4 Ways to Reduce Churn With Data: four different functions all contribute to reducing churn

Roles and functions that reduce churn

It varies by the type of subscription offering and organization, but generally speaking these are the people who prevent churn:

Product managers, content creators and producers reduce churn by making changes to product features or content offerings to maximize stickiness
Marketers reduce churn by crafting effective mass communication that directs subscribers to the stickiest content and features
Customer success and support representatives prevent churn by making sure customers adopt a product and helping them if they can’t
Account managers are generally the last resort in stopping churn, assuming the product or service costs something: they are the ones who can actually reduce the price or change the subscription terms

From the point of view of the data scientist or analyst these are the “customers” or “users” of the data analysis. At small organizations these may all be the same people or just one person. But that doesn’t change the question: what can data science do to really help people do these tasks?

3. The real role of data science in reducing churn

For all these reasons, the data science needed to reduce churn is not the kind of black box AI algorithm that gets most of the attention in the media nowadays. Instead, the real deal is more of a traditional scientific and statistical analysis.

Predictions of churn can be useful, but only when the prediction is an extension of a knowledge transfer from the data scientist or analyst to the product and customer teams.

So a data scientist working to help reduce churn needs to act more like a social scientist or economist than a computer scientist. The data scientist needs to test specific, understandable hypotheses about the causes of churn: for example, which content is stickiest, or which behavioral metrics are most closely aligned with value attainment. Many of these hypotheses should come from the product and customer teams. But a good data scientist should be able to guide the process, challenge assumptions, and uncover some surprises. Then all of this has to be translated into knowledge that actually helps the real churn fighters do their job.
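To make “test a specific hypothesis” concrete, here is a minimal sketch of one possible test, assuming the hypothesis is “subscribers who use a certain feature churn less”. It uses a standard two-proportion z-test written with only the Python standard library. The function name and the counts are made up for illustration, and this is one simple option rather than a prescribed method.

```python
import math


def two_proportion_z_test(churned_a, total_a, churned_b, total_b):
    """Compare the churn rates of two groups with a pooled two-proportion z-test.

    Returns (rate_a, rate_b, z, two_sided_p_value).
    """
    rate_a = churned_a / total_a
    rate_b = churned_b / total_b
    # Pooled churn rate under the null hypothesis that both groups churn alike.
    pooled = (churned_a + churned_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical counts: feature users (group A) versus non-users (group B).
rate_a, rate_b, z, p_value = two_proportion_z_test(
    churned_a=30, total_a=1000, churned_b=60, total_b=1000
)
print(f"Feature users churn: {rate_a:.1%}, non-users churn: {rate_b:.1%}")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

A small p-value here says the difference is unlikely to be noise; it does not by itself show that the feature causes retention. That question is exactly the kind of thing to take back to the product and customer teams.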

The Wall Street Comparison

This point of view is actually well known to people who invest large portfolios on Wall Street. No one trusts long term, high value investments to black box predictive systems.* If you have to move money in and out of large positions, it means a long investment horizon and high transaction costs. Statistical methods are used to verify and quantify hypotheses that the decision makers already have about the markets, not to make predictions whose reasoning the decision makers cannot see.

Likewise for churn prevention! The value at stake is high: your company’s survival depends on it. And the cost of interventions can be high too. Poorly planned or executed interventions to prevent churn can be disastrous.

* It is common to use black box AI for high frequency systems making small trades that complete in seconds or less. In that scenario it is easier to halt a failing algorithm and course correct before much damage has been done.

4. The Mindset needed to Reduce Churn

I point all of this out only because there is so much hype around machine learning and black box AI technologies these days. Some people may not realize how inappropriate these approaches are for churn prevention. What is a data scientist to do?

4.1 Leave the Kaggle Mindset Behind

Data scientists and analysts have to stop thinking that accuracy on a predictive problem is the only metric that matters. This is a common attitude for academics developing algorithms on fixed benchmark databases. And this attitude is amplified by competitions like Kaggle. However, this approach falls flat when the problem is a business decision with high stakes. Old-fashioned hypothesis testing is the way to start.

4.2 Listening to the Business

Data scientists need to ask the business stakeholders what they are really interested in achieving. And just as importantly, find out what hypotheses the business already has about the data they work with. The prior beliefs of people with deep domain knowledge are worth much more than any algorithm!

4.3 Talking to the Business

Data scientists need to answer the questions the business asks, not just apply algorithms. And the answers need to be in terms the business can understand. Black box models are usually disqualified, and a lot of statistical jargon also has to go. Teach the business the most important findings with simple charts or cohort analyses that they can reproduce in Excel. Once the data scientist has gained the confidence of the business there is room for more advanced approaches, but it has to start with a solid foundation.
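As an illustration of how simple such a cohort analysis can be, here is a minimal sketch that groups subscribers into usage cohorts and reports the churn rate of each. Everything in it is an assumption made up for the example: the records, the logins-per-month metric, and the cohort cut-offs. The same table could be built with a pivot table in Excel.

```python
from collections import defaultdict

# Made-up records for illustration: (logins_per_month, churned_in_period).
subscribers = [
    (1, True), (2, True), (3, False), (4, True),
    (6, False), (8, True), (12, False), (15, False),
    (22, False), (30, False), (45, False), (60, False),
]


def usage_cohort(logins_per_month):
    """Assign a subscriber to a usage cohort; the cut-offs are arbitrary."""
    if logins_per_month < 5:
        return "low"
    if logins_per_month < 20:
        return "medium"
    return "high"


def churn_rate_by_cohort(records):
    """Return {cohort: churn rate} for the given (logins, churned) records."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for logins, did_churn in records:
        cohort = usage_cohort(logins)
        totals[cohort] += 1
        churned[cohort] += int(did_churn)
    return {cohort: churned[cohort] / totals[cohort] for cohort in totals}


cohort_churn = churn_rate_by_cohort(subscribers)
for cohort in ("low", "medium", "high"):
    print(f"{cohort:>6}: {cohort_churn[cohort]:.0%} churn")
```

A three-row table like this is something a product manager or marketer can read at a glance and check against their own intuition, which is the point: the finding travels further than a model score would.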

For more details check out the early access edition of my book…