
What’s Wrong with My AI? The Importance of a Data Strategy

The answer to what might be wrong with your AI implementation could be a simple one: you may have overlooked the need for a Data Strategy first. For years, the Analytics and Business Intelligence space has had a saying: “Garbage in, garbage out!” The same holds true for your AI implementations.

The Foundation of AI: Data

First and foremost, AI is built upon data, and the accuracy of that data will determine the outcomes you generate through AI. Not too long ago, I had someone ask me, “Why is it taking so long to get AI up and running? Can’t we just ask ChatGPT everything?”

This question led to an enlightening conversation about the importance of data in AI.

A Conversation on Data and AI

Here is how the conversation unfolded:

Me: How do you think it knew that a picture of a rose was a rose and not a sunflower?
Mr. X: Because it’s AI.
Me: No, because it had to source the information from trusted sources to know what pictures of a rose look like versus what a picture of a sunflower looks like.
Mr. X: Well yeah, it just searched the web.
Me: So you think everything on the web is right and we can just blindly trust it?
Mr. X: Well no, I didn’t say that.
Me: Then how did it know what to trust and not to trust?
Mr. X: I am sure someone told it.
Me: Exactly, because if you allowed it to learn from everything it found, people could purposefully put up sites that mislabeled all flowers to change the outcome of AI. Garbage data in equals garbage data out. Would you want people to put up bad websites that disparaged your software and told your customers the wrong way to do things, and then have AI read them and give your customers wrong answers?
At this point, you could see a lightbulb go on, and the conversation went much more smoothly. I had the opportunity to walk them through data strategy and making sure the data being stored in the Data Lake is verified and correct before we feed it to AI algorithms and LLMs so they can start learning how to answer client questions.


The Pain of Playing Catch Up

What was really causing this client pain was that they didn’t previously have a data strategy. A Data Lake and Data Analytics had been deprioritized in their organization for years, and now they are playing catch-up with their competition. So, I ask you, do you already have a data strategy? If not, don’t make the mistake of implementing AI into your business on top of what is possibly bad data.


A Real-World Scenario

Just in case you are thinking about not taking my advice, let’s go over a real-world scenario that we were called in to help with.
Company A put a chatbot into their application to help with customer support, answering questions for clients when they got stuck using the application. Their strategy was to take product documentation and Zendesk ticket history and feed it into an AI model so that, when prompted with a question, it could answer the customer if the issue had been documented or addressed in a previous Zendesk ticket. This sounds simple enough, right? It certainly is simple, and it is a very common use case for AI. However, Company A forgot to cleanse and verify the data going into the model.
About 90 days after they went live, I received an email that read, “Please tell me you can help, our chatbot just told off our customer!” So of course, we jumped in to help, and what did we find out? When sending Zendesk data to the AI model, they had forgotten to filter out the internal comments. A customer asked how to navigate to a feature of their application, and the bot responded with “You have to be a F$#%$ idiot.” Someone had written exactly that in an internal comment on a ticket where the same question was previously asked.
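To make the missing step concrete, here is a minimal sketch of the kind of filter that could sit between a ticket export and a model. It is an illustration under assumptions, not what Company A ran: it assumes each comment arrives as a dictionary with a "public" flag and a "body" field (loosely modeled on how Zendesk marks internal notes), and the names BLOCKED_TERMS, is_safe_for_training, and filter_comments are hypothetical.

```python
from typing import Iterable

# Hypothetical blocklist for illustration only; a real one would be far larger.
BLOCKED_TERMS = {"idiot", "stupid"}


def is_safe_for_training(comment: dict) -> bool:
    """Keep only public comments whose text contains no blocked terms."""
    # Assumption: a missing "public" flag is treated as internal,
    # so records of unknown visibility are excluded by default.
    if comment.get("public") is not True:
        return False
    # Normalize each word: trim common punctuation and lowercase it.
    words = {w.strip(".,!?\"'").lower() for w in comment.get("body", "").split()}
    return words.isdisjoint(BLOCKED_TERMS)


def filter_comments(comments: Iterable[dict]) -> list[dict]:
    """Return only the comments that pass the safety check."""
    return [c for c in comments if is_safe_for_training(c)]


# Made-up sample records, not real ticket data.
sample = [
    {"id": 1, "public": True, "body": "Open Settings, then choose Reports."},
    {"id": 2, "public": False, "body": "You have to be an idiot to miss that."},
    {"id": 3, "body": "Visibility flag is missing on this record."},
]
kept = filter_comments(sample)
print([c["id"] for c in kept])
```

The visibility check does most of the work here. The word blocklist is only a second, fairly crude line of defense, and anything it catches probably deserves a human look rather than a silent drop.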


Conclusion

Unless you want to tell off your customers, I’d suggest making sure that you have a solid data strategy in place so that your data is properly filtered, cleaned, and verified before you go feeding it to an AI model. The fix for this issue was not an easy one, and in the end retraining the model from scratch was chosen as the right approach.
If you think this example was an isolated one, just go and Google “failed AI” or “AI gone bad” and you will quickly see many stories that are similar in nature. Ask yourself: do you feel confident in the data you are about to unleash on AI and then build into critical business processes?
If you answered no, then it’s time to take a step back and first figure out your data strategy.