It’s the Golden Age of Natural Language Processing, So Why Can’t Chatbots Solve More Problems?

Advancements in AI and Machine Learning (ML), specifically Natural Language Processing (NLP), are revolutionizing the way we live and interact with machines and each other. Amazon’s Alexa, Apple’s Siri, and other virtual assistants and chatbots have changed the way we request and receive information. NLP pipelines take unstructured text data and process it into a structured format to extract information, categorize it, and derive insights. In other words, they convert human language into a form that AI systems can process. Ultimately, this makes communication between humans and machines easier.
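To make "unstructured text into a structured format" slightly more concrete, here is a minimal sketch of one pipeline step using only the Python standard library. The field names and features are illustrative assumptions, not any real library's API:

```python
import re

def simple_pipeline(text):
    """Toy NLP pipeline step: turn raw text into a structured record.

    Real pipelines use libraries such as spaCy; this sketch only
    tokenizes, normalizes, and pulls out a few surface features.
    """
    tokens = re.findall(r"[A-Za-z']+|\d+", text)
    return {
        "tokens": [t.lower() for t in tokens],   # normalized word list
        "num_tokens": len(tokens),               # simple size feature
        "numbers": [int(t) for t in tokens if t.isdigit()],
        "is_question": text.strip().endswith("?"),
    }

record = simple_pipeline("Where is my order? It was placed 3 days ago.")
```

Downstream components can then categorize or analyze the structured `record` instead of raw text.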

How it works

Computers excel at various natural language tasks such as text categorization, speech-to-text, grammar correction, and large-scale analysis. ML algorithms have helped make significant progress on specific problems such as translation, text summarization, question-answering systems, and intent detection and slot filling for task-oriented chatbots.
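The last two tasks, intent detection and slot filling, can be sketched with simple pattern matching. Production chatbots learn intents and slots from data rather than hand-written rules; the intent names and regex patterns below are illustrative assumptions only:

```python
import re

# Toy intent detection + slot filling for a task-oriented chatbot.
# The intents and patterns are invented for illustration.
INTENTS = {
    "track_order": re.compile(r"\b(where|track|status)\b.*\border\b", re.I),
    "cancel_order": re.compile(r"\bcancel\b.*\border\b", re.I),
}
ORDER_ID = re.compile(r"#(\d+)")  # slot: an order number like "#4521"

def parse(utterance):
    """Return (intent, slots) for a customer utterance."""
    intent = next(
        (name for name, pat in INTENTS.items() if pat.search(utterance)),
        "fallback",
    )
    slots = {"order_id": m.group(1)} if (m := ORDER_ID.search(utterance)) else {}
    return intent, slots

print(parse("Can you track my order #4521?"))
# → ('track_order', {'order_id': '4521'})
```

A real system would replace the regexes with a learned classifier and sequence tagger, but the interface — utterance in, intent and slots out — is the same.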

For example, the grammar plugin many of us use at work, and the voice-note app you use to send a text while driving, are both thanks to Machine Learning and Natural Language Processing. However, as smart as these bots may appear, humans are smarter, for now. When it comes down to it, human language is nuanced and often ambiguous, which presents serious challenges for ML systems. Training on massive datasets (for example, every Wikipedia article) and creating large language models does not lead to an understanding of language.

In fact, this methodology can perpetuate human falsehoods and misconceptions.

Despite their widespread usage, it’s still unclear whether applications that rely on language models, such as generative chatbots, can be safely and effectively released into the wild without human oversight (think of the 2014 movie Ex Machina… well, not quite that extreme, but the consequences of these systems deserve serious consideration).

Recent advancements in NLP have led to an explosion in innovation

Just within the past decade, technology has evolved immensely and is influencing the customer support ecosystem. With this comes an interesting opportunity to augment and assist humans during the Customer Experience (CX) process, using insights from the newest models to help guide customer conversations.

The leading drivers for breakthroughs in NLP and the explosion of language models include:

  1. The development of new techniques and algorithms (word embeddings and transformers).
  2. Improvements and greater access to high-end hardware (GPUs and TPUs).
  3. Open-source tools and libraries, such as spaCy, Hugging Face, and Rasa, that have democratized NLP.

Transformers, or attention-based models, have led to higher-performing models on natural language benchmarks and have rapidly inundated the field. Text classifiers, summarizers, and information extractors that leverage language models have outdone previous state-of-the-art results. Greater availability of high-end hardware has also allowed for faster training and iteration. The development of open-source libraries and their supportive ecosystems gives practitioners access to cutting-edge technology and allows them to quickly build systems on top of it.
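To make "attention-based" slightly more concrete, here is a toy sketch of the scaled dot-product attention at the heart of transformer models, written for a single query vector. Real implementations are batched, multi-headed, and run on GPUs or TPUs; this sketch shows only the core arithmetic:

```python
import math

def attention(query, keys, values):
    """Toy scaled dot-product attention over one query vector.

    The query is scored against every key, the scores are
    softmax-normalized into weights, and the values are blended
    by those weights to produce the output.
    """
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Numerically stable softmax over the scores.
    exp = [math.exp(s - max(scores)) for s in scores]
    weights = [e / sum(exp) for e in exp]
    # Weighted blend of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]

# The query matches the first key more closely, so the output
# leans toward the first value vector.
out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
```

Because the weights always sum to one, the output is a convex combination of the values, which is what lets a model "pay attention" to the most relevant parts of its input.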

These advancements have led to an avalanche of language models that can predict words in sequences. Think of Google’s search autocomplete. Models that can predict the next word in a sequence can then be fine-tuned by machine learning practitioners to perform an array of other tasks.
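A miniature stand-in for "predicting the next word in a sequence" can be built from simple bigram counts. The tiny corpus and lookup below are a toy sketch, orders of magnitude simpler than what GPT-style models learn from massive datasets:

```python
from collections import Counter, defaultdict

# A tiny illustrative corpus; real models train on billions of words.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word (bigram counts).
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent word seen after `word`, or None."""
    counts = bigrams.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # the most common follower of "the"
```

Large language models do essentially this over far longer contexts, with learned representations rather than raw counts, which is what makes them fine-tunable for other tasks.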

OpenAI’s GPT-3, a language model that can automatically write text, received a ton of hype this past year. Beijing Academy of AI’s WuDao 2.0 (a multimodal AI system) and Google’s Switch Transformers are both considered more powerful models, each with over 1.6 trillion parameters, dwarfing GPT-3’s measly 175 billion. New, larger language models are released at a breakneck pace. When it comes to AI systems, there is no shortage on the market.

So why aren’t chatbots better?

While these language models, and the systems that leverage them, get more complex and more powerful, the question remains: why are these technologies still so frustrating and often wrong? 

Beyond a frustrating interaction with Alexa, some AI systems’ mistakes can have drastic consequences. As seen in Figure 1, large language models trained on massive datasets can perpetuate human falsehoods and misconceptions:

Figure 1: TruthfulQA questions with answers from GPT-3-175B with default prompt. Examples illustrate false answers from GPT-3 that mimic human falsehoods and misconceptions. For more information, check out the full study: TruthfulQA: Measuring How Models Mimic Human Falsehoods

The answer is simpler than you think: Natural Language Processing is not Natural Language Understanding. No matter the computational complexity, energy resources, and time dedicated to creating larger language models, this approach will not lead to the ability to derive meaning, grasp context, or comprehend. For a deeper, more technical explanation, refer to Machine Learning Won’t Solve Natural Language Understanding, an eye-opening look into the shortcomings of NLP.

What does this mean for companies, customer service, and chatbots?

So how does this affect companies, especially those that rely heavily on chatbots? It’s complicated.

The advancements in Natural Language Processing have raised expectations that chatbots can help deflect and deal with a plethora of client issues. Companies quickly accelerated their digital businesses to include chatbots in their customer support stacks.

For some, yes, chatbots can be a viable part of their CX solution. However, for most, chatbots are not a one-stop shop for customer service. Furthermore, they can even create blind spots and new problems of their own. Though chatbots are now omnipresent, about half of users would still prefer to communicate with a live agent instead of a chatbot, according to research by the technology company Tidio.

In a world that is increasingly digital, automated, and virtual, when a customer has a problem, they simply want it taken care of swiftly and appropriately… by an actual human. Even chatbot vendors only hope to tackle about 50% of customer inquiries. While chatbots have the potential to resolve easy problems, a remaining portion of conversations still requires the assistance of a human agent.

Frustrated customers who are unable to resolve their problem using a chatbot may come away feeling that the company doesn’t want to deal with their issues. They can be left feeling unfulfilled by the experience and unappreciated as customers. Those who do commit to self-service portals and scroll through FAQs often reach a human with their frustration already heightened. Not to mention the gaps in information gathering: a chatbot collects customer info, and then a human CX rep requests the same information all over again. In these moments, the more prepared the agent is for these potentially contentious conversations (and the more information they have), the better for both the customer and the agent.

Though some companies bet on fully digital and automated solutions, chatbots are not yet ready for open-domain chat. If left unchecked, generative models can cause detrimental issues.

Put bluntly, chatbots are not capable of dealing with the full variety and nuance of human inquiries. In a best-case scenario, chatbots can direct unresolved, and often the most complex, issues to human agents. But this hand-off sets a barrage of new problems in motion for CX agents, adding additional tasks to their plates.

So, are we equipping humans with the best tools and support to deal with customers’ problems?

The good news is that advancements in NLP do not have to be fully automated or used in isolation. At Loris, we believe insights from our newest models can be used to guide conversations and augment, not replace, human communication. Understanding how humans and machines can work together to create the best experience will lead to meaningful progress.

Our software leverages these new technologies to better equip agents to handle the most difficult problems, the ones that bots cannot resolve alone. We strive to constantly improve our system by learning from our users and developing better techniques.

By predicting customer satisfaction and intent in real time, we make it possible for agents to deal with customer problems effectively and appropriately. Our software guides agent responses in real time and simplifies rote tasks, giving agents more headspace to solve the hardest problems and focus on providing customer value. This is especially poignant at a time when turnover in customer support roles is at an all-time high.

Paul R. Daugherty explains in his book, Human + Machine: Reimagining Work in the Age of AI,

“The simple truth is that companies can achieve the largest boosts in performance when humans and machines work together as allies… in order to take advantage of each other’s complementary strengths.” 

While language modeling, machine learning, and AI have greatly progressed, these technologies are still in their infancy when it comes to dealing with the complexities of human problems. Because of this, chatbots cannot be left to their own devices and still need human support. Tech-enabled humans can and should help drive and guide conversational systems so they learn and improve over time. Companies that strike this balance between humans and technology will dominate customer support, driving better conversations and experiences in the future.

Interested in learning more?

Love having conversation intelligence data but hate the hours of data analysis? 

Save hours, days, or even weeks of painstaking data gathering, filtering, and analyzing. Just “Ask Loris”! See how it works here.