AI failures and how to avoid them are a recurring theme of these posts. In a recent article called “4 things VCs get wrong about AI,” Aible CEO Arijit Sengupta provides one reason: AI is not SaaS, and we cannot directly apply lessons learned from SaaS successes to AI projects.

I do like Arijit’s contrast with SaaS metrics. His strongest point addresses what you can’t learn from AI pilots. In particular, AI pilots are different from SaaS pilots, which are more likely to remain successful as they scale out across the company.

The first reason is that an AI system’s success may be highly sensitive to the context of the pilot, driving greater risk as you roll it out. The data in an AI system is typically quite different between the pilot and a larger-scale roll-out.

Another risk comes from the fact that pilot users are different as well: they are often the most tech-receptive and willing to add knowledge, curate data, and partner with the AI system to get the right results. 

A third way in which AI pilots are especially challenged involves the resources required for AI success. SaaS systems are typically web-based database applications. In contrast, successful AI deployments center around the exact data on which they are trained. The more of this data there is, and the better curated it is, the better the results tend to be. The up-front resources needed to assemble this data are typically very high. And much of the data you need will come as feedback collected during the actual deployment of the system at scale.

For these reasons, it’s very hard for an AI pilot to demonstrate the kind of success that would come from a fully-deployed production system. 

How to address greater AI pilot risks?

This is, at heart, a chicken-and-egg problem for AI product development and sales: given that we have limited information about how the system will perform at scale, how can we decide whether to invest, when investing is the only way to know for sure?

I introduced this concept in an earlier post, “How High is the ‘Useful’ Accuracy Bar for This Application?”, where I explained that, before starting an AI project, we need to ask what quality or accuracy is required, what we have demonstrated so far, and what needs to be done to achieve the quality required to be successful.

These questions can be answered, imperfectly but with value, in an AI pilot. If, for example, our pilot achieves 70% accuracy on a small set of data with an initial set of features, model sophistication, and knowledge, this gives us useful but imperfect information as to how it will do in production. This is fundamentally different from the SaaS situation, where scale-out is simply “turning the crank” – most of the risk is eliminated during the pilot. For AI, the risk remains.
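To make the “imperfect information” point concrete, here is a minimal sketch in plain Python (the 70%-on-100-examples scenario and the 65% bar are my illustrative assumptions, not figures from the article). Even before accounting for shifts in data or users at scale, an accuracy measured on a small pilot sample carries substantial statistical uncertainty:

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a binomial proportion, e.g. pilot accuracy."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

# Hypothetical pilot: 70 correct answers out of 100 held-out examples.
low, high = wilson_interval(70, 100)
print(f"Measured accuracy 70%; plausible true accuracy: {low:.0%} to {high:.0%}")
# -> roughly 60% to 78%: wide enough to fall on either side of a 65% "useful" bar,
#    and that is before any data or user shift at production scale.
```

And this interval captures only sampling noise; the data-shift and user-shift risks described above come on top of it.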

This risk goes in two directions. First, we may be fooled into thinking our pilot is good enough if, say, 70% crosses our bar for success. Second, we may be fooled into thinking that the AI isn’t good enough when it is actually promising: performance below par is not necessarily a good reason to hold off on an investment. I’ve found that many successful AI projects require long-term investment, patience, and iteration to gather better and better data before they reach their full potential.

My advice is to be practical: a good pilot should reduce risk within a strict time box and a limited budget of effort.

It’s also important to be rigorous about how the AI system will fit into the organization’s workflow and decision processes, and how its performance translates into business outcomes. 

I’ve found that a little effort along these lines goes a long way. Sometimes we learn, for example, that even 40% performance can produce big value, so a pilot that’s at 60% can tolerate a big reduction upon deployment and still be a success. Or we might learn that the system needs a level of performance that our best analysis says is unachievable, and we can avoid wasted effort.
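As a sketch of what that analysis can look like (every dollar figure below is a hypothetical assumption, chosen only to show the shape of the arithmetic), a back-of-the-envelope expected-value model can locate the break-even accuracy before serious money is spent:

```python
# Back-of-the-envelope value model. All figures are illustrative assumptions.

def net_value_per_case(accuracy: float,
                       value_correct: float = 50.0,  # value of a correct automated decision
                       cost_error: float = 20.0,     # cost of an incorrect one
                       cost_per_run: float = 5.0) -> float:
    """Expected net value of one automated decision at a given accuracy."""
    return accuracy * value_correct - (1 - accuracy) * cost_error - cost_per_run

for acc in (0.40, 0.60, 0.70):
    print(f"accuracy {acc:.0%}: {net_value_per_case(acc):+.2f} per case")
# accuracy 40%: +3.00   <- even 40% clears break-even under these assumptions
# accuracy 60%: +17.00
# accuracy 70%: +24.00
```

Under these numbers, a pilot at 60% could lose considerable accuracy in deployment and still pay for itself; with different numbers, the same arithmetic can show that the required bar is out of reach.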