Search Engines vs. Natural Language Processing: what’s the difference?

Intelligent virtual assistants (IVAs), natural-language understanding and intent recognition, as practiced by Next IT, are still relative newcomers to the marketplace. As such, I’m often asked how the software we create differs from search engines such as Google, Bing, Yahoo! and the like. After all, don’t these companies’ offerings and those of Next IT both deliver information to the user in response to typed (or, recently, spoken) inputs? In very simple terms, this is absolutely true. But look a little deeper, and you’ll discover why key differences in design and capability create a wholly different user experience. Next IT’s natural language platform, Alme, represents a fundamental change in philosophy from search providers in how we approach the problem of findability.

Here’s the rundown:

Search takes a content-first approach: given a collection of documents, how can I best understand all the content? Ultimately, this is problematic because the language the content is written in (generally by subject-matter experts) is not the language in which users express their needs. You can see the effect in the prominence of forums in Google search results. Forum posts rank as highly relevant because they are phrased the way users phrase their needs, so they are returned on the merit of their alignment with the user’s wording and not necessarily on the merit of the content they supply.

Another crutch is the way search engines are curated. In enterprise approaches, we “tune” our results through document tagging or by artificially injecting keywords. In the boundless domain of web search, you are starting to see “one best answer” solutions, which are effectively just velocity-based: Google does the curation and creates content, or “cards,” to surface at the top of results for the queries it sees most often. If I put in the name of a popular sports team, I will see their most recent scores. If I ask, “what was the founding year of Los Angeles Kings,” I get standard search results, because that question is simply not asked frequently enough.

These facets make search tools great for research and discovery – but they do not create the shortest path between a user’s need and the resolution.

Taking a user-first rather than a content-first approach, we evaluate what users are asking for and codify those inputs (statistically and symbolically) to represent an understanding of intents and needs. Once we understand the intent or need, a content analyst or subject-matter expert (SME) can specify that when a user asks for “X,” the most appropriate result is “Y.” That sounds like a bit of work, and I won’t deny it. But the value returned is substantial: only one person has to go through the traditional search or research process and record the outcome (generally by simply entering a description and a link to the relevant supporting content), and from then on it is codified and repeatable. Whenever someone has a similar need in the future, returning “Y” is consistent and nearly effortless compared with executing that first search.
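To make the “record once, repeat forever” idea concrete, here is a minimal sketch in Python. It is an illustration only and not how Alme works: the names `IntentStore`, `CuratedAnswer`, `record` and `resolve` are hypothetical, the example link is a placeholder, and a simple word-overlap (Jaccard) score stands in for the statistical and symbolic understanding described above.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CuratedAnswer:
    """The "Y" an analyst records: a description plus a supporting link."""
    description: str
    link: str


def tokens(text: str) -> frozenset[str]:
    """Lowercase the text and split it into a set of words."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


class IntentStore:
    """Hypothetical store of analyst-curated answers keyed by example inputs."""

    def __init__(self, threshold: float = 0.5) -> None:
        # Assumed cutoff: below this overlap we decline to answer.
        self.threshold = threshold
        self._entries: list[tuple[frozenset[str], CuratedAnswer]] = []

    def record(self, example_input: str, answer: CuratedAnswer) -> None:
        """Codify one person's research: when users ask "X", return "Y"."""
        self._entries.append((tokens(example_input), answer))

    def resolve(self, user_input: str) -> Optional[CuratedAnswer]:
        """Return the curated answer whose example best overlaps the input."""
        query = tokens(user_input)
        best_score, best_answer = 0.0, None
        for example, answer in self._entries:
            union = query | example
            score = len(query & example) / len(union) if union else 0.0
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None


store = IntentStore()
store.record(
    "how do I reset my password",
    CuratedAnswer("Password reset steps", "https://example.com/help/reset-password"),
)

# A similarly worded need later on reuses the recorded answer.
hit = store.resolve("how can I reset my password")
# An unrelated question falls below the threshold and gets no curated answer.
miss = store.resolve("what time do you open")
```

The point of the sketch is the division of labor: the expensive research happens once in `record`, and every later `resolve` call is a cheap lookup that returns the same vetted answer.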

This also resolves the problems associated with ambiguity and inference. Humans leverage stereotypes, biases and our experiences to build context at a level of complexity not easily replicated in machine algorithms. Let’s use the appearance of IBM’s Watson on Jeopardy as an example: that was an exercise in correlation – not a demonstration of human inference. If I say “who was the inventor of the airplane,” the machine is given an explicit fact: the invention of the airplane. And while a correct answer returned by Watson is certainly a success in IBM’s case, it is simply an exercise in relating attributes of that stated fact. There is no ambiguity standing in the way of deriving who, what date, or at what location this explicit event happened.

If, however, I say, “I need to write on the whiteboard,” a human listener would infer that I am asking for a marker. For a machine, this is an exercise in ambiguity that is not easily resolved, because the language does not explicitly state the need for a marker. By engaging humans in symbolic modeling within a closed domain, we can create consistent results that are far more closely tailored to the needs of end users.
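As a rough illustration of what human-authored symbolic modeling in a closed domain could look like, consider the sketch below. The rule table `IMPLIED_NEEDS` and the function `infer_need` are hypothetical names invented for this example, the rules themselves are made up, and real symbolic models are considerably richer than a keyword pair lookup.

```python
from __future__ import annotations

import re
from typing import Optional

# Hypothetical rules written by a person for a closed "office supplies" domain.
# Each rule says: this action on this surface implies this unstated need.
IMPLIED_NEEDS = {
    ("write", "whiteboard"): "marker",
    ("erase", "whiteboard"): "eraser",
    ("write", "paper"): "pen",
}


def infer_need(utterance: str) -> Optional[str]:
    """Return the need implied by the utterance, or None if no rule applies."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    for (action, surface), need in IMPLIED_NEEDS.items():
        if action in words and surface in words:
            return need
    return None


implied = infer_need("I need to write on the whiteboard")
unknown = infer_need("I need to write")
```

The word “marker” never appears in the input. The inference comes entirely from a rule that a person supplied, which is why the approach stays consistent inside the domain the rules cover and returns nothing outside it.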

Taking this a step further, let’s look at the class of service provided to end users. Search may provide a research vehicle for identifying obscure information, or it may effectively answer an obvious or common question, but only an IVA can be extended to a higher tier of utility and service:

  • Find me an answer faster than I could have (IVA as knowledgeable help)
  • Do something for me that I could have done myself (IVA as helpful rep)
  • Tell me what I need to know when I don’t know to ask (IVA as expert)
  • Do something for me I should have done, but forgot (IVA as assistant)

Technology overall has been continuously evolving away from the old command-line paradigm toward less and less explicit means of human-machine interaction. What the future holds goes well beyond tools, to technology that acts on your behalf, and this is Next IT’s primary area of pursuit.