Salesforce on Thursday announced the general availability of its chatbot for businesses, Einstein Copilot.
Salesforce executives say Einstein Copilot is a lot less likely than other AI chatbots to hallucinate, or generate false or nonsensical information — something that other chatbots from Google, Meta, Anthropic, and OpenAI have struggled to overcome.
“They can be very confident liars,” Patrick Stokes, Salesforce executive vice president of product marketing, said of AI chatbots during a keynote at Salesforce World Tour NYC on Thursday.
Einstein Copilot is different, Stokes said, because it draws on a business’s own data, from both spreadsheets and written documents, across all the apps where that data is stored, whether that’s Salesforce’s own platform, Google Cloud, Amazon Web Services, Snowflake, or other data warehouses.
The chatbot is designed as a sort of intermediary between a business, its private data, and large language models (LLMs) such as OpenAI’s GPT-4 and Google’s Gemini. Employees can enter queries like “what next step should I take to respond to this customer complaint,” and Einstein will pull in the business’s relevant data from Salesforce or other cloud services. It will then attach that data to the initial query and send both to an LLM, which will generate a response.
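The flow described above (find the relevant business data, attach it to the query, send the combined prompt to an LLM) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Salesforce’s implementation: the `Record` type, the word-overlap `retrieve` function, the prompt wording, and the `llm` callable are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Record:
    source: str  # hypothetical label for the system the record came from
    text: str    # the record's content as plain text


def retrieve(query: str, records: list[Record], top_k: int = 2) -> list[Record]:
    """Rank records by shared words with the query (a toy stand-in for real search)."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(r.text.lower().split())), r) for r in records]
    # Drop records that share no words with the query at all.
    scored = [(score, r) for score, r in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored[:top_k]]


def build_prompt(query: str, context: list[Record]) -> str:
    """Attach the retrieved data to the original query (assumed prompt wording)."""
    lines = ["Answer using only the business data below.", ""]
    for r in context:
        lines.append(f"[{r.source}] {r.text}")
    lines += ["", f"Question: {query}"]
    return "\n".join(lines)


def answer(query: str, records: list[Record], llm: Callable[[str], str]) -> str:
    """Source the data first, then send the combined prompt to the model."""
    context = retrieve(query, records)
    return llm(build_prompt(query, context))
```

Here `llm` is any function that takes a prompt string and returns text, so the sketch does not depend on a particular model or vendor API. The point is only the ordering Stokes describes: the data is sourced before the question reaches the model.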
Salesforce’s new chatbot also comes with a protective layer so that the LLMs it sends prompts to can’t retain a business’s data.
In a follow-up interview with Quartz, Stokes explained more about why Einstein Copilot is less likely to hallucinate than other chatbots. “Before we send the question over to the LLM, we’re gonna go source the data,” he said, adding that, “I don’t think we will ever completely prevent hallucinations.”
For that reason, the chatbot comes with a hallucination detection feature. It also gathers real-time feedback from Salesforce’s customers so it can flag system weaknesses to administrators.
AI hallucinations will always happen
Stokes said that envisioning a world with no AI hallucinations is as “silly” as imagining a world where computer networks are totally unhackable.
“There’s always going to be a way in. I think that’s true with AI as well,” he said. “But what we can do is do everything that we possibly can to make sure that we are building transparent technology that can surface when that happens.”
Salesforce’s chief marketing officer Ariel Kelman agreed. “What’s funny is LLMs inherently were built to hallucinate,” he said. “That’s how they work. They have imagination.”
A New York Times report last year found that the rate of hallucinations for AI systems was about 5% for Meta, up to 8% for Anthropic, 3% for OpenAI, and up to 27% for Google PaLM.
Chatbots “hallucinate” when they don’t have the necessary training data to answer a question, but still generate a response that looks like a fact. Hallucinations can be caused by different factors, such as inaccurate or biased training data and overfitting, which is when a model fits its training data so closely that it can’t make reliable predictions or draw conclusions from data it wasn’t trained on.
Hallucinations are currently one of the biggest issues with generative AI models, and they’re not exactly easy to solve. Because AI models are trained on massive sets of data, it can be difficult to find specific problems in that data. Sometimes, the data used to train AI models is inaccurate to begin with, because it comes from places like Reddit.
That’s where Salesforce says its chatbot will be different. It’s still early days, though, and only time will tell which AI chatbot is the least delusional.