For many years, effective voice-based search technologies have eluded businesses that have tried to introduce next-generation input methods to customers. Confined to basic navigation and so-called “magic words,” speech-based commands have been ineffective and often hard for consumers to use.
The widespread adoption of smartphones and tablets has, however, renewed interest in this kind of technology, with Apple’s virtual assistant, Siri, progressing beyond basic menu navigation and striking a chord with consumers and businesses alike. The market now seems on the cusp of welcoming a new generation of speech-based capabilities, referred to by some as conversational interfaces, designed to streamline a range of consumer interactions with systems and devices and, in short, to allow people to converse with devices as they would with one another.
One of the first fields conversational interfaces are being applied to is the TV industry. Given the sheer volume of entertainment sources and programming now available in the living room, companies are looking to speech or conversational interfaces to tackle the age-old dilemma of helping consumers find something good to watch on TV.
Speaking the viewer’s language
Video is a difficult medium to search, and people look for video content in their own way, weighing cast, plot and genre differently depending on their preferences. Conversational interfaces simulate the qualities of natural communication and remove the need to conform to hierarchical menu structures, so the technology must understand when a user is drilling into a particular genre in detail and when they have lost interest and switched topics completely.
To be successful, conversational interfaces need to combine several capabilities:
• Disambiguation: Natural language technology must understand and interpret the user’s intent. For example, the phonetic sound “Kroos” could refer to Tom Cruise or Penelope Cruz, and the system should be able to work out which one the user means from the context of the original query. “City” can apply to Manchester City or Norwich City in a sporting context, so again, the system must learn the user’s preference.
• Statefulness: In the course of a dialogue with a user, the system should be able to maintain context, and understand that people often jump from one item to another. For example, a user could say that they are “in a mood for thrillers,” then jump to “Bond” and then to “old ones.” Ideally, the system should understand these requests, and serve up a series of older James Bond films for the viewer to select from.
• Personalisation: Conversational systems need to understand their users on an individual basis. For example, the system should learn that users based in Manchester who ask “when is the game tonight” want to know about their local team, and that if they say “When is the City game” they mean Manchester City.
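The statefulness example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalogue tags, the `PHRASES` lookup table (a stand-in for real language understanding), the cut-off year for “old ones” and the `DialogueState` class are all hypothetical and are not drawn from any particular product.

```python
from dataclasses import dataclass, field

# Hypothetical mini-catalogue; the genre and franchise tags are illustrative.
CATALOGUE = [
    {"title": "Goldfinger", "year": 1964, "genre": "thriller", "franchise": "bond"},
    {"title": "Skyfall", "year": 2012, "genre": "thriller", "franchise": "bond"},
    {"title": "Heat", "year": 1995, "genre": "thriller", "franchise": None},
    {"title": "Toy Story", "year": 1995, "genre": "animation", "franchise": "toy story"},
]

# Hypothetical phrase-to-filter table standing in for real language understanding.
PHRASES = {
    "thrillers": ("genre", "thriller"),
    "bond": ("franchise", "bond"),
    "old ones": ("max_year", 1979),  # assumed cut-off for "old"
    "cartoons": ("genre", "animation"),
}


@dataclass
class DialogueState:
    """Accumulates filters across turns so each request refines the last."""
    filters: dict = field(default_factory=dict)

    def update(self, utterance: str) -> list:
        text = utterance.lower()
        for phrase, (key, value) in PHRASES.items():
            if phrase in text:
                if key in self.filters and self.filters[key] != value:
                    # A conflicting value for a slot we already hold is
                    # treated as a topic switch: drop the old context.
                    self.filters = {}
                self.filters[key] = value
        return self.results()

    def results(self) -> list:
        return [
            item["title"]
            for item in CATALOGUE
            if all(self._matches(item, k, v) for k, v in self.filters.items())
        ]

    @staticmethod
    def _matches(item: dict, key: str, value) -> bool:
        if key == "max_year":
            return item["year"] <= value
        return item[key] == value


state = DialogueState()
print(state.update("I'm in a mood for thrillers"))
print(state.update("Bond"))
print(state.update("old ones"))
print(state.update("Actually, cartoons"))
```

Each turn narrows the previous result set until the user names a different genre, at which point the accumulated context is discarded. A production system would need a far richer notion of a topic switch than a single conflicting slot.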
Taking understanding to the next level
Behind successful conversational interfaces there is excellent search functionality. Search providers have blazed a trail in harnessing new technologies to better provide for their customers. In 2012, Google announced its “Knowledge Graph,” which was designed to understand keywords on a deeper level than before, treating them as related entities rather than simple terms. In 2013, Facebook revealed “Graph Search,” which crawls for results based on the searcher’s friends, content and relationships, as well as wider trends on the site. These technologies have introduced high-quality and relevant search results to consumers everywhere, and have set a benchmark across industries.
In the context of TV, most consumers have viewing patterns that can be mapped to provide highly personalised search results. This is more accurate than user-created profiles or “thumbs up/down” ratings, which are both error-prone and do not automatically take into account users’ changing tastes and preferences over time. The ability to make personalisation precise and extremely relevant, which the industry is now terming hyper-personalisation, depends on the knowledge graph’s semantic capabilities.
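One way to picture how viewing patterns can track changing tastes is to weight recent viewing more heavily than older viewing. The sketch below is an assumption for illustration only: the exponential decay, the 30-day half-life and the `genre_affinity` function are hypothetical choices, not a description of how any particular provider does it.

```python
from collections import defaultdict


def genre_affinity(history, now_day, half_life_days=30.0):
    """Turn a viewing history into per-genre shares that favour recent viewing.

    history: iterable of (day_watched, genre, minutes_watched) tuples.
    A viewing loses half its weight every `half_life_days` (assumed value).
    """
    scores = defaultdict(float)
    for day, genre, minutes in history:
        age_days = now_day - day
        weight = 0.5 ** (age_days / half_life_days)
        scores[genre] += minutes * weight
    total = sum(scores.values())
    if total == 0:
        return {}
    return {genre: score / total for genre, score in scores.items()}


# Two hours of drama watched 60 days ago, two hours of sport watched today.
history = [(0, "drama", 120), (60, "sport", 120)]
print(genre_affinity(history, now_day=60))
```

Although the viewer spent equal time on both genres, the older drama viewing counts for a quarter of its original weight, so sport dominates the resulting profile without the viewer having to rate anything.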
At its core, a quality conversational search engine for entertainment should cover the following aspects:
• Knowledge graph: This makes it possible to map search results to intent, and not simply keywords and search terms. The knowledge graph comprises:
– Named entities in media, entertainment and geography
– Algorithms to extract, de-duplicate and disambiguate the entities across sources
– The capability to build relationships between entities
• Content graph: This comprises the actual content assets that are consumable by users, be they movies, programs, or sporting events. The content graph maps these assets to the knowledge graph, and a single knowledge graph can support multiple content graphs.
• Personal graph: Crucial to true conversational systems, the personal graph tunes the conversational system to individuals in order to simulate natural conversations. The personal graph is:
– Based on statistical machine learning
– Able to learn individual behavioural patterns and interests
– Able to learn how time and device affect recommendations
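The three graphs described above can be pictured as three small data structures that are consulted in turn. The following is a minimal sketch, not a real schema: the entity identifiers, the `sounds_like` field, the asset IDs, the user names and the affinity scores are all hypothetical, and a real personal graph would be learned statistically rather than written by hand.

```python
# Knowledge graph: named entities plus typed relationships (tiny illustrative subset).
ENTITIES = {
    "tom_cruise": {"name": "Tom Cruise", "type": "person", "sounds_like": "kroos"},
    "penelope_cruz": {"name": "Penelope Cruz", "type": "person", "sounds_like": "kroos"},
    "top_gun": {"name": "Top Gun", "type": "film"},
    "volver": {"name": "Volver", "type": "film"},
}
RELATIONS = [
    ("tom_cruise", "acted_in", "top_gun"),
    ("penelope_cruz", "acted_in", "volver"),
]

# Content graph: the assets one provider can actually play, mapped to entities.
CONTENT = {
    "asset-001": "top_gun",
    "asset-002": "volver",
}

# Personal graph: per-user affinity for entities (hand-written here; assumed
# to be learned from viewing behaviour in a real system).
PERSONAL = {
    "alice": {"penelope_cruz": 0.9, "tom_cruise": 0.2},
    "bob": {"tom_cruise": 0.7},
}


def resolve(sound, user):
    """Pick the entity matching a phonetic sound that this user most likely means."""
    candidates = [eid for eid, e in ENTITIES.items() if e.get("sounds_like") == sound]
    if not candidates:
        return None
    prefs = PERSONAL.get(user, {})
    return max(candidates, key=lambda eid: prefs.get(eid, 0.0))


def playable_for(sound, user):
    """Return the playable asset IDs featuring the person the user meant."""
    person = resolve(sound, user)
    works = {dst for src, rel, dst in RELATIONS if src == person and rel == "acted_in"}
    return sorted(asset for asset, eid in CONTENT.items() if eid in works)


print(resolve("kroos", "alice"), playable_for("kroos", "alice"))
print(resolve("kroos", "bob"), playable_for("kroos", "bob"))
```

Keeping the content graph separate from the knowledge graph is what lets a single knowledge graph serve several providers: only the `CONTENT` mapping would differ between them.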
At the front end of the system, a conversational query engine binds these components together. It combines the key algorithms that map and learn linguistic features, and exposes content discovery features and APIs to customers.
Intuitive search and recommendation
Natural language technology backed by knowledge graphs can revolutionise conversational interfaces and TV search and recommendation. Paired with excellent metadata that covers actors and actresses, content synopses and even famous quotations from films, it lets TV providers create a second-to-none entertainment brain that offers customers speedy and accurate access to their favourite shows, and to similar content that they might enjoy. Conversational search around knowledge graphs is no gimmick – it is set to change the way that people interact with their TV sets.