User Metrics 101

If you are operating any kind of product or service with repeated interactions with users or customers, then you should be collecting data about those interactions in some kind of data warehouse. Interactions in this context mean interactions between the user and the product, service, or platform (it may also include interactions with other users mediated by the platform, but that is a subset only relevant to certain types of services). It is common to refer to such interactions as events for short, because interactions tracked in a data warehouse invariably have a time stamp telling you when they happened. But this post is not about how to collect that data; rather, it's about how to put that data to good use. Because if you aren't using the data you are collecting, why are you collecting it? If you haven't even started collecting data, this post will give you a good understanding of what kinds of data to collect, but exactly what type of data warehouse and collection method you should use is very application specific and beyond the scope of this post. Here's how it fits into the overall picture for fighting churn with data:

[Figure: How behavioral metrics fit into the overall picture of fighting churn with data]

The first step in putting the raw data to use in fighting churn is to turn the event data into a set of measurements that summarize the events and collectively produce a profile of each user's behavior. These measurements are often called behavioral metrics, or just metrics for short. Turning related events into measurements is necessary because each event is like one tiny dot in a big picture, and by itself one event usually doesn't mean much. But while we need to zoom out from each individual interaction, we are not going to zoom out very far, not yet anyway. Each measurement is made individually for each customer, and repeatedly over their lifetime as a customer. That's because user engagement and churn are dynamic processes for each individual, and you will need to watch how those metrics change over subscribers' lifetimes in order to fight churn with data.

People trained in data science, machine learning or statistics call this area "feature engineering".

The process is illustrated in the drawing below. For each user you have a series of events; only the series for a single event type is shown: logins. In typical scenarios there will be many different types of events, and the events can occur at any time; for some types of events there can even be multiple events at the same time. In order to make subscribers comparable, a metric defines fixed time windows and counts (or otherwise measures) the events in each window. In the example the windows are defined as consecutive four week periods. The metrics are calculated on the day after each period ends, so that the observation of events is complete. For example, on the 29th of January you can calculate the number of logins per subscriber in the four weeks covering January 1st-28th. Then on February 26th you can measure the number of logins for the four weeks from January 29th-February 25th, and so on.

[Figure: Metric concept, counting each subscriber's events in consecutive four week windows]

You are probably wondering why I am showing you measurements calculated over four week periods and not over whole months. Weeks are important because nearly all human activities follow a weekly cycle, and it is very likely that the events in your data warehouse also follow a weekly cycle: If your product is something that people use for work, then most events will occur Monday through Friday, and Saturday and Sunday will have few events. On the other hand, if your product is something people use for leisure, like watching videos or a game, then most events will be on Friday through Sunday, and Monday through Thursday will be slow. For short observation windows (like a month) it's better to measure metrics over time periods that are a whole number of weeks, so every window contains the same number of each day of the week and every calculation is exactly comparable.
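If you aren't sure whether your own events follow a weekly cycle, a quick check is to count events by day of week. Here is a minimal sketch in PostgreSQL, using the same event table as the queries below (extract(dow ...) returns 0 for Sunday through 6 for Saturday):

-- count all events by day of week to reveal any weekly cycle
select extract(dow from event_time) as day_of_week,
    count(*) as n_events
from event
group by day_of_week
order by day_of_week;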

A simple SQL select that calculates one such metric, on an event table with columns for an account id, an event time, and an event type, is shown below.

-- count login events per account over the four weeks ending
-- the day before the calculation date
with calc_date as (
    select '2017-01-29'::timestamp as the_date
)
select account_id, count(*) as n_login
from event e inner join calc_date d
    on e.event_time < d.the_date
    and e.event_time >= d.the_date - interval '28 day'
where e.event_type_id = 1  -- assuming event type 1 is a login
group by account_id;

However, the simple approach illustrated above has a problem: if you literally follow it, you would only update the metric once every four weeks, which is not very dynamic. A lot can happen in four weeks, and if you are going to reduce churn you will probably have to move faster. The drawing below illustrates the solution: repeating the four week measurements at more frequent intervals, in this case weekly. As shown, the resulting four week windows overlap. You can also see that the overlapping measurements gradually track between the measurements made with the simple non-overlapping approach. For subscriber 1 the first few non-overlapping four week measurements were 2, 4, and 2; the overlapping measurements include intermediate points where the value was 3, representing the transition periods.

[Figure: Staggered metric calculation, four week windows repeated weekly so they overlap]

The SQL below demonstrates how to implement the staggered metric concept illustrated above:

-- calculate the login count metric every week, each time measuring
-- the four weeks ending the day before the metric date
with date_vals as (
    select i::timestamp as metric_date
    from generate_series('2017-01-29'::timestamp, '2017-04-16'::timestamp, '7 day'::interval) i
)
select account_id, metric_date, count(*) as n_login
from event e inner join date_vals d
    on e.event_time < metric_date
    and e.event_time >= metric_date - interval '28 day'
where e.event_type_id = 1  -- assuming event type 1 is a login
group by account_id, metric_date
order by account_id, metric_date;

So far we've only looked at metrics that are simple counts of events, but whenever events carry data in additional fields you will probably want to summarize that data in the metric as well. The most typical case is when an event has a numeric value associated with it, for example:

  • The event has a duration in time, such as the length of a session or playback of some media
  • The event has a monetary value, such as a retail purchase or an overage charge

In such scenarios the most common metrics are either of the following (see the sketch after the list):

  • The total value of all the events
  • The average value per event
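Here is a minimal sketch of both, assuming the event table has a numeric event_value column (a hypothetical name; it might hold a session duration in seconds or a purchase amount):

-- total and average event value per account, measured weekly
-- over overlapping four week windows
with date_vals as (
    select i::timestamp as metric_date
    from generate_series('2017-01-29'::timestamp, '2017-04-16'::timestamp, '7 day'::interval) i
)
select account_id, metric_date,
    sum(e.event_value) as total_value,  -- total value of all the events
    avg(e.event_value) as avg_value     -- average value per event
from event e inner join date_vals d
    on e.event_time < metric_date
    and e.event_time >= metric_date - interval '28 day'
where e.event_type_id = 1
group by account_id, metric_date
order by account_id, metric_date;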

Either of these, and many others, can be calculated with SQL very similar to that shown above, and I will save the details for the book. That's already plenty for this post! Now you know the basics of calculating behavioral metrics on users when you are Fighting Churn with Data.

With all this discussion of types of events, you are probably already wondering which types are the most important, and with good reason: with so much variety it's going to be important to stay focused. There are no hard rules for this, but I can give some general guidelines to frame the discussion. And just to be clear: figuring out which events are the most important is one of the main points of doing the analyses described on this site and in the forthcoming book, because it is different for every product or service. So this is just a preview of things we will look at in depth later.

The bottom line is this: the most important events are the events that are closest to the customer achieving the goal or purpose of the service. That's vague, but some examples should make it clear (a SQL sketch for comparing event types follows the list):

  • Software products usually have some goal, for example writing documents. So creating documents is more important than just logging in. In general, login events are much less important than the events that are directly involved with achieving the goal.
  • Many B2B software products are used for making money, so if there is any way to measure how much money is likely to be made from the events, then those are the most important. For example, if a product is a Customer Relationship Management (CRM) system used to track sales deals, then closed deals and their values are probably the most important events. Often a system is not that close to the money customers make, so you focus on other key metrics that show the customers are using the system as intended. For example, in cloud services the events that capture computation or data handled on behalf of business customers show how much the businesses are using the system, and the provider can only hope those are profitable endeavors.
  • For most media services, the purpose is to enjoy the media. So playing the content is generally important, and more specifically indicators of enjoying the content, like watching the whole thing, giving it a like, or sharing it. You can never directly measure enjoyment, because it's a subjective state of the user.
  • For a dating service the purpose is to go on dates, so actual meetings are probably more important than things like searching, viewing profiles, or online interactions. That presents another type of challenge: success on the service is well defined, but the actual meetups may be arranged through means outside the service, such as phone or text, so they can be hard to observe.
  • For gaming, the purpose is to have fun. Just like with media, subjective feelings are hard to measure, so the most important events may be things like achieving scores and levels, or social interactions with friends.
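When you start comparing candidate events like these, it helps to calculate the same staggered metric for every event type in one pass, so the types can be compared side by side. Here is a minimal sketch, reusing the weekly schedule from the earlier queries:

-- one count per account, metric date and event type, so candidate
-- event types can be compared side by side in later analysis
with date_vals as (
    select i::timestamp as metric_date
    from generate_series('2017-01-29'::timestamp, '2017-04-16'::timestamp, '7 day'::interval) i
)
select account_id, metric_date, e.event_type_id,
    count(*) as n_events
from event e inner join date_vals d
    on e.event_time < metric_date
    and e.event_time >= metric_date - interval '28 day'
group by account_id, metric_date, e.event_type_id
order by account_id, metric_date, e.event_type_id;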

There are many important caveats that go along with this point, and I have noted just a few, but the fact remains:

TAKEAWAY: If you want to fight churn with data, look for events that are as close as possible to the value created by the service, even when that value cannot be measured directly.

My book and the rest of the posts on this site are all about bringing rigor to this simple intuition.