It’s time to do what you came here for: understand why your customers churn, and what keeps them engaged. You may think that in this post I'm going to dive into some serious statistics or machine learning, or brag about the latest deep learning algorithm I'm using. Instead, I will demonstrate a simple but robust and powerful analytic technique that I call behavioral cohorts. It lets you investigate the real impact of behaviors that may be related to churn. It is a versatile technique that you can use to derive valuable insights from your churn-related data, and it is easily understood by anyone in your organization. I will demonstrate the technique with examples from case studies to show what some typical results look like.
Why Behavioral Cohorts Help Understand Churn
The Concept Behind Behavioral Cohorts
The most basic hypothesis of any churn investigation is that people who use the product a lot are less likely to churn than people who use it a little bit or not at all. Cohort analysis is great for testing metrics for their relationship to churn, as illustrated in the sketch below. The idea is this: if customers who are more active on a product churn less, then a group of active customers should have a lower churn rate than a group of inactive customers. So you can check this hypothesis by dividing the customers into groups based on their level of activity, and then checking the churn rate in each of the groups. If an activity is related to lower churn, then you will find that the churn rate in the most active group is the lowest, a less active group has a higher churn rate, and the least active group has the highest churn rate. That’s the ideal scenario shown below for just three groups.
Concept in Action
Next consider how a cohort analysis is going to work in practice. Start by assuming you have a data set created like the one described in my previous post, How To Observe Churn. The process of making a behavioral cohort analysis on a single metric, illustrated in the next sketch, is as follows:
- Start from a complete data set that has observations of customers including the metric of interest, and whether or not the customers churn. The data starts out sorted by date and by account id.
- Take only the metric and the variable representing churn or non-churn, then sort those observations by the metric. The identity of the accounts and the date each observation was made are ignored for the rest of the cohort analysis.
- Group the observations into the cohorts by dividing the observations into a pre-selected number of equal size groups. In real cohort analysis you typically use ten cohorts, so each cohort contains 10% of the data. (In the simple example above and below only three cohorts are used.) Note that you do not decide in advance where the boundaries of the cohorts ought to be. The boundaries between the cohorts are a result of the analysis.
- For each cohort, make two calculations:
  - The average value of the metric for all observations in the cohort
  - The percentage of churns in the cohort observations
- Plot the average metric values and churn rates, with the average metric on the x-axis and the churn rate on the y-axis.
By making plots like these, it's easy to understand how churn relates to different levels of the behavior.
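The manual procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name manual_cohorts, the column names metric and is_churn, and the nine toy observations are all made up for the example and do not come from any case study.

```python
import numpy as np
import pandas as pd


def manual_cohorts(observations, metric, churn, ncohort):
    """Follow the manual steps: sort by the metric, split into equal-size cohorts, average."""
    # Keep only the metric and the churn indicator, sorted by the metric.
    # Account identity and observation date are dropped at this point.
    ordered = observations[[metric, churn]].sort_values(metric).reset_index(drop=True)
    # Assign cohorts by position in the sorted order, so each cohort gets
    # (as close as possible to) the same number of observations.
    cohort = np.arange(len(ordered)) * ncohort // len(ordered)
    # For each cohort: the average metric value and the fraction of churns.
    return ordered.groupby(cohort).agg(
        avg_metric=(metric, "mean"),
        churn_rate=(churn, "mean"),
    )


# Hypothetical toy data (made up): 1 marks an observation that ended in churn.
observations = pd.DataFrame({
    "metric":   [0, 1, 2, 5, 6, 7, 12, 15, 20],
    "is_churn": [1, 1, 0, 0, 1, 0, 0, 0, 0],
})
summary = manual_cohorts(observations, "metric", "is_churn", ncohort=3)
print(summary)
```

Plotting avg_metric against churn_rate from the returned frame gives the cohort plot described in the last step. Note that the cohort boundaries are never chosen by hand; they fall out of the sorted order.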
Churn Cohort Analysis in Python
This section shows how to perform a cohort analysis using Python with Pandas data frames. The source code is available in my GitHub repository for the book. The function has the following inputs:
- data_set_path : A path to a data set saved in a file, given by a string variable
- var_to_plot : The name of a metric to make the cohort plot for, given by a string variable
- ncohort : The number of cohorts to use, given by an integer variable
Given these inputs, the following are main steps taken to create a cohort plot:
- Load the data set into a Pandas DataFrame object, including setting the DataFrame index.
- Use the Pandas function qcut to divide the observations into cohorts. qcut is short for quantile-based discretization. When called with labels=False, it returns a series of integers giving the group assignments.
- Calculate the average metric and the average churn rate using the DataFrame function groupby. The qcut result, the series of group identifiers, is passed as the grouping parameter.
- Make a new DataFrame from the averages and churn rates
- Plot the result using matplotlib.pyplot and add appropriate labeling before saving
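The steps above can be sketched roughly as follows. This is a minimal sketch and not the exact listing from the book's repository. It assumes the data set is a CSV file whose first two columns are the account id and the observation date, and that the churn indicator column is named is_churn; both are assumptions, so adjust them to your data. The helper cohort_averages, the output file name, and the duplicates="drop" option are illustrative additions.

```python
import os

import matplotlib.pyplot as plt
import pandas as pd


def cohort_averages(data, var_to_plot, ncohort=10, churn_col="is_churn"):
    """Return one row per cohort with the average metric and the churn rate."""
    # labels=False makes qcut return integer cohort ids instead of intervals.
    # duplicates="drop" merges cohorts when many observations share one value
    # (for example, lots of zeros), which would otherwise raise an error.
    groups = pd.qcut(data[var_to_plot], ncohort, labels=False, duplicates="drop")
    cohort_means = data.groupby(groups)[var_to_plot].mean()
    cohort_churns = data.groupby(groups)[churn_col].mean()
    return pd.DataFrame({
        var_to_plot: cohort_means.values,
        "churn_rate": cohort_churns.values,
    })


def cohort_plot(data_set_path, var_to_plot, ncohort=10):
    """Load the data set, compute the cohort averages, and save the plot."""
    # Assumption: the first two CSV columns (account id, date) form the index.
    data = pd.read_csv(data_set_path, index_col=[0, 1])
    plot_frame = cohort_averages(data, var_to_plot, ncohort)

    plt.figure(figsize=(6, 4))
    plt.plot(plot_frame[var_to_plot], plot_frame["churn_rate"],
             marker="o", linewidth=2)
    plt.xlabel("Cohort average of " + var_to_plot)
    plt.ylabel("Cohort churn rate")
    plt.ylim(bottom=0)
    plt.grid(True)

    # Hypothetical naming scheme: save the figure next to the data set.
    save_path = os.path.splitext(data_set_path)[0] + "_" + var_to_plot + "_churn_cohort.png"
    plt.savefig(save_path)
    plt.close()
    return save_path
```

Keeping the averaging separate from the plotting makes it easy to inspect the cohort table on its own, or to paste it into a spreadsheet.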
This procedure has an important difference from how we solved the example problem in the last section: the code never explicitly sorts the data or forms the cohorts by hand. Instead, it relies on the Pandas qcut function and the DataFrame.groupby function to form the cohorts and calculate the averages. “Quantile-based discretization” is a technical term for the kind of cohort groups that we are making, drawing explicitly on the notion of a quantile. A quantile is a value that results as a dividing point when data is divided into equal groups. A decile is a quantile when the data is divided into ten groups, each of the groups containing 10% of the data. (A percentile is the analogous quantile when the data is divided into one hundred groups.)
So the first decile is the value of the metric that divides between the first ten percent of the data and the second ten percent of the data. The second decile is the value of the metric that divides between the second ten percent of the data and the third ten percent of the data, etc. In the mathematical context, discrete means separate, or not continuous. The groups are discrete in the sense that membership in them is all or nothing (not to be confused with “discreet”, meaning secretive or unobtrusive). So this function is named "quantile-based discretization" because the data is divided into discrete groups by the values of the quantiles.
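To make the idea of quantile boundaries concrete, here is a small sketch using the retbins option of qcut, which returns the dividing values along with the group assignments. The metric values (the integers 1 through 20) are made up purely for illustration.

```python
import pandas as pd

# Hypothetical metric values, made up for illustration: 1, 2, ..., 20.
metric = pd.Series(range(1, 21))

# Ask for ten equal-size groups and also get back the boundary values.
# labels=False returns integer group ids; retbins=True returns the bin edges.
groups, boundaries = pd.qcut(metric, 10, labels=False, retbins=True)

# Ten groups need eleven edges: the minimum, nine interior dividing
# points (the deciles), and the maximum.
print(boundaries)

# Every group ends up with the same number of observations.
print(groups.value_counts().sort_index())
```

The boundaries are a result of the data, not something chosen in advance, which matches the point made earlier about not deciding where the cohort boundaries ought to be.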
Case Studies in Understanding Churn
Below are some real case study results using this technique. First, here's an important point that holds for all of the churn cohort case studies: the plot does not show the actual churn rates as percentages on the Y-axis. Instead the Y-axis is unlabeled, and the churn rate is described as “Relative”. The reason for the omission of the actual churn rates is to protect the privacy and business interests of the companies in the case studies. The actual churn rate is a very strategic piece of information that most companies guard closely! But you can still see the significance of the difference in churn between cohorts because the bottom of the cohort plots is always fixed at zero percent churn. That way, the distance of points from the bottom of the chart still shows the true relative churn rates of the cohorts.
The first example of a cohort churn analysis from a real case study is below. Broadly is an online service which helps small and medium businesses (SMBs) manage their online presence, including reviews. The case study cohort analysis shows churn in behavioral cohorts for Broadly’s customers based on the number of reviews updated per month, an important event for Broadly's customers. Using the cohort plot, it's easy to understand that this behavior is strongly related to churn!
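If you want to share your own cohort plots in the same way, one possible approach with matplotlib is sketched below. It is an assumption about how such a plot could be drawn, not the code used for the case studies; the function name relative_churn_plot and its parameters are hypothetical.

```python
import matplotlib.pyplot as plt


def relative_churn_plot(avg_metric, churn_rate, metric_name, save_path=None):
    """Plot cohort churn with the axis floor at zero and the actual rates hidden."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(avg_metric, churn_rate, marker="o", linewidth=2)
    # Fix the bottom of the plot at zero churn so that the height of each
    # point above the axis is proportional to the cohort's churn rate.
    ax.set_ylim(0, max(churn_rate) * 1.1)
    # Hide the tick labels so the actual churn rates are not disclosed.
    ax.tick_params(axis="y", labelleft=False)
    ax.set_xlabel("Cohort average of " + metric_name)
    ax.set_ylabel("Relative churn rate")
    ax.grid(True)
    if save_path is not None:
        fig.savefig(save_path)
    return ax
```

Because the axis starts at zero, a point half as high as another still represents half the churn rate, even though no percentages are shown.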
In the figure, the churn rate is highest in the first cohort and then falls over the first five cohorts. The churn rate in the top three cohorts (on the right of the plot, with the highest metric values) is around one half of the churn rate in the bottom cohort. You can tell that the churn in the top cohort is around one half the churn in the bottom cohort because it is approximately half the distance to the bottom of the graph. Another point worth noting is that the reduction in churn rate occurs between the cohorts that have around zero review updates per month and four review updates per month.
Another example of a behavioral cohort churn case study comes from Klipfolio, a Software as a Service (SaaS) company that allows businesses to create online dashboards. The figure below shows churn in behavioral cohorts based on the metric Dashboard Edits Per Month. As in the example above, churn rates are shown on a relative scale with the bottom of the plot fixed at zero churn. In this case, the churn rate of the top cohorts is a small fraction of the churn rate of the bottom cohorts: around one tenth!
Versature provides telecommunication services for businesses. The figure below shows another behavioral cohort churn example using Versature’s metric Total Local Call Time Per Month. This is another fairly typical relationship between an important behavioral metric and churn. The cohorts with a total local call time above around 2500 per month churn at around one third the rate of the bottom cohort, which makes (practically) zero calls. The reduction in churn rate happens between around zero and 2500. Beyond that, the churn rate seems to increase slightly, although the increase is not very significant.
Understand Churn: It's not about the prediction.
As you can see, cohort analysis can give really great insights into what keeps your customers engaged. The great thing about it is that it is easily understood by anyone - the analysis can easily be reproduced in a spreadsheet. That's important because as I explained in my previous post, Fighting Churn with Data: How it Works, in order to really reduce churn you need everyone in your company to understand what causes churn. If you use more complex data science methods like a regression or neural network you might be able to predict churn with your algorithm, but you will have a hard time making it useful for your organization.
For more details, check out the book! It's now available in the electronic early access edition: https://www.manning.com/books/fighting-churn-with-data