When it comes to Real Money, in AI we DON’T Trust


A recent survey I ran shows serious limits to what AI can do in the near term: fewer than 50% of respondents said they would trust an AI if there were $1,000 or more at stake on its decisions. So trust in AI disappears fast when there is serious money on the line. In this post I claim this lack of trust is reasonable, and that it will limit how AI is adopted by companies that sell to other businesses (business-to-business companies, or B2B): B2B companies may build AI into their products, but they are unlikely to use it in their dealings with their own customers. With all of the news about advances in AI (Artificial Intelligence), some pundits seem certain there is no limit to what AI can do. I don’t know what will happen in the long run, but these limits will probably remain in place until the technology advances further.

Before I show you the results and discuss what they mean, please take the survey yourself (it takes 2 minutes).

The Limits of Trust in AI

If you didn’t take the survey, here is the question:

Imagine you have the choice to use an Artificial Intelligence (AI) system to make a series of decisions that would impact you or your company financially.  You don’t know how the AI works, but you know it has worked in the past. If you don’t use the AI you have to come up with another way to make the decisions, or give up the possible gain.   Suppose there is $1,000,000 at stake over the coming year. Would you trust the AI in the following situations? Pick the maximum risk you would take on a single decision made by the AI…

The question is followed by answers like the following:

  • I would trust the AI if there was 1¢ at stake per decision, and the AI will make 100,000,000 decisions over the next year
  • I would trust the AI if there was $100,000 at stake in one decision, and the AI will make  10 decisions over the next year
  • etc.
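All of the answer options follow the same arithmetic: the stake per decision times the number of decisions equals the fixed total at stake. A quick sketch in Python (the dollar amounts are the survey’s; the code itself is just an illustration):

```python
# The survey holds the total at stake fixed at $1,000,000, so each answer
# option trades value per decision against the number of decisions:
#   stake_per_decision * number_of_decisions == total_at_stake
TOTAL_AT_STAKE = 1_000_000  # dollars over the coming year

for stake in (0.01, 1, 100, 10_000, 100_000):  # dollars per decision
    decisions = round(TOTAL_AT_STAKE / stake)
    print(f"${stake:>10,.2f} per decision -> {decisions:>11,} decisions per year")
```

The 1¢ option works out to 100,000,000 decisions and the $100,000 option to just 10, matching the answers quoted above.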

The question is designed to represent the dilemma faced by potential users who don’t fully understand AI but are asked to consider using it to implement some policy or process. It might be fine to trust an AI if you are asking a personal assistant to recommend a movie or tell you a joke, but what about when you have real skin in the game? The question probes the limit of trust as a function of the money at stake on each decision, while keeping the total amount at risk in the process fixed (more about this below).

The graph below shows the main result of the survey, presented as the percentage of respondents who would be comfortable trusting an AI with a given amount at stake. Most people would trust an AI if the value at stake on each decision was $100 or less, but trust drops fast once the stake per decision rises above $100.


Low vs. High Value = High vs. Low Frequency

The fixed total amount at stake in the survey scenario creates a tradeoff between the value at stake on each decision and the number of decisions the AI has to make. This reflects a realistic and important dimension of the scenarios where AI might be deployed: a typical consumer product company (also known as business-to-consumer, or B2C) has a lot of customers who are individually low value, while a provider of products for other businesses (business-to-business, or B2B) typically has far fewer customers, but each one is worth a lot more, potentially millions of dollars per year.

The following are a few of the types of scenarios AI could be used for. In each case the B2C version puts at most a few hundred dollars at stake when the AI makes a decision, while for B2B the value at stake may be tens of thousands or more…

  • Recommending products to customers automatically: Products could be low value consumer sales on a retail web site, or a high value CPQ (Configure Price Quote) scenario in Enterprise product sales
  • Recommending the next best action for a representative to take with a customer when they call in for support
  • Flagging leads in marketing and sales data

If you are dealing with Small and Medium Business (SMB) customers, the situation is somewhere in between. This raises the question: is people’s trust in AI influenced more by the value at stake or by the frequency of decision making? That’s why there was also a second question in the survey:

  • Suppose there is $10,000,000 at stake.  Does it change your answer?

For every response, the value quoted is the same as in the first question, but the number of decisions is multiplied by ten. I asked this question to see if people perceive a difference in AI trustworthiness when there are more or fewer decisions to be made at the same value per decision.

It’s Not About the Money: Low-Frequency AI Is Less Trustworthy

The number of decisions to be made by an AI is an important aspect of an AI system, but people who are not AI professionals may not appreciate why: the frequency of decisions tells you, ballpark, how much data is probably available to train the system. If you propose a system to automate one million decisions per year, you probably have on the order of a million data samples to train it with. If you are trying to automate a process that involves only one thousand decisions per year, you probably have only a few thousand examples of the decision to train an AI system with.

That’s important because modern AI systems are hungry for training data: preferably these systems train on Big Data, in the range of millions of examples. Caveat: how much data you need depends on the problem, and many AI and data science professionals may disagree with exactly where I drew the big data line. But most would agree with the bottom line: how much data you have to train a system matters a lot, and trying to get by with too little data can give poor and misleading results. Without enough data, a system could perform brilliantly in a concocted test scenario and then be an epic failure when it meets reality running “out of sample”. So the frequency of decision making, and the resulting size of the available training data, is an important measure of any AI system, and the only reasonable view of an AI that makes rare decisions is that it is not trustworthy.
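As a back-of-the-envelope illustration (my numbers, not from the survey): suppose the system must learn a 5% event rate from n historical examples. The statistical uncertainty of that estimate shrinks like sqrt(p(1-p)/n), so a thousand examples leave roughly 14% relative error while a million leave well under 1%:

```python
# Hypothetical sketch: how the uncertainty of an estimated 5% event rate
# shrinks with the number of training examples n.
import math

p = 0.05  # assumed event rate the system needs to learn
for n in (1_000, 10_000, 1_000_000):
    se = math.sqrt(p * (1 - p) / n)  # standard error of the estimate
    print(f"n={n:>9,}: standard error ~ {se:.4f} ({se / p:.1%} of the rate itself)")
```

A model built on a thousand examples is working with an input that is itself fuzzy to the tune of double-digit percentages, which is one concrete reason to distrust AI trained on rare decisions.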

Of course, where you draw the line is still a subjective decision, so I’ll tell you my answer to the survey: for the first scenario ($1M total at risk), I would trust an AI with $100 at risk per decision across 10,000 decisions; for the second ($10M total at risk), I would trust it with $1,000 per decision, also across 10,000 decisions. I trust a system with the same number of decisions, 10,000, and I’m not too bothered by the precise value at risk on each one: if the AI system had tens of thousands of examples to train on, then I’m pretty confident it will work as expected. But if the AI was trained on fewer than 10,000 examples, my trust is a lot lower.

A lot of people saw $100 as a limit like I did, but it doesn’t look like many survey takers share my reasoning. Based on the survey, a small number of people do see the frequency of decisions as an important factor in how much to trust an AI, but not many. This showed up as a slight flattening of the response curve for the $10M at stake answers. (If everyone had answered the question as I did, the $10M curve would exactly track the $1M curve, shifted to the right by one factor of 10 on the logarithmic scale.)


There are other obvious reasons why AI makes more sense for low-value, high-frequency decisions than for high-value, low-frequency ones:

  1. If something is going wrong with a high-frequency, low-value decision, you can detect that the system is not working at low cost. If you only make ten $100,000 decisions a year, you might lose hundreds of thousands of dollars before noticing some kind of system error.
  2. If you don’t have a high volume of decisions to make, why not just have a person make them? If you have to make a small number of decisions a year and the value of getting them right is $1M, you can probably pay an expert a six-figure salary to get the job done.
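Point 1 can be made concrete with a hedged sketch (the failure rate and audit interval here are assumptions of mine, not survey results):

```python
# Hypothetical scenario: a faulty AI loses the full stake on 20% of its
# decisions, and results are only audited after every 5 decisions.
FAILURE_RATE = 0.20          # assumed fraction of decisions that go wrong
DECISIONS_BEFORE_AUDIT = 5   # assumed number of decisions between checks

for stake in (100, 100_000):  # dollars at stake per decision
    expected_loss = stake * DECISIONS_BEFORE_AUDIT * FAILURE_RATE
    print(f"${stake:>7,} per decision: ~${expected_loss:,.0f} expected loss before the first check")
```

The same audit cadence costs on the order of $100 in the low-value case and $100,000 in the high-value one: checking whether the system works is cheap exactly when decisions are small and frequent.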

AI Is Not The Only Use For Data

So am I saying that if you have high-value decisions that are made infrequently, you can’t take advantage of data to help make them? No, not at all! But didn’t I just spend nearly 2,000 words arguing that AI is not trustworthy unless there are a lot of low-value decisions to make? Yes, that’s true, but there are plenty of things you can do with data other than AI. In all the recent hype around AI, many people and companies are forgetting that the scientific use of data encompasses multiple approaches, and AI is just one of them. If your process involves a lower frequency of decision making, then you can and should use data to assist the human decision makers, rather than looking for an AI to take over the human role. The original data science is the good old subject known as statistics: it is all about testing hypotheses with scientific methods in order to make the world understandable to humans, and it includes methods explicitly designed to make the best possible decisions based on small data.

A perfect example of this is in my current role as Chief Data Scientist at Zuora. Zuora is a B2B enterprise software company where an average account spends $200,000 a year with us. We use loads of data about our customers to monitor how well they are using our products, and we proactively use data to spot risks where customers need help and opportunities to expand when they are doing well. But does that mean we take the data and put our $200,000 customers on the line with a chatbot? Heck no! We arm our human account managers with the data and metrics they need to understand their customers, and then have them pick up the phone the old-fashioned way. For $200,000 a year, our customers deserve no less.

Is there nowhere to use AI in a B2B enterprise? That’s definitely not true either: for B2B companies there are opportunities to use AI in the product offering itself, if not in how you manage your customers. For example, we are developing an AI to automate credit card payment processing on behalf of our customers as a product feature (meaning it’s part of our product that our customers use to submit their own credit card payments – our B2B customers mostly don’t pay us by credit card). My point is that AI is just one tool in the data scientist’s toolkit, and you don’t want to go hammering screws.

Okay, I think you get the idea. Next time you are thinking about whether a process or decision can be automated with AI, remember to ask yourself: how much is at stake each time the AI makes a decision? And how often would the AI have to make that decision? The frequency of decision making is directly related to the amount of data you will probably have available to train the AI, and it has a big impact on the quality of the results. If AI doesn’t seem like a fit, there are plenty of other ways you can use data! These issues are especially relevant when B2B companies deal with their own customers, where the value at stake is likely to be high while the volume of available data is often low.