Intelligent Interfaces with NLP


This literature review studies the field of Natural Language Processing (NLP) as it relates to Human-Computer Interaction (HCI). While focusing on the history of NLP and its applications in voice interface design, it also explores the computational and ethical challenges involved, as well as opportunities in the areas of health and education.



Logo Design

This logo aims to capture the interactions involved in human-computer voice interfaces. We see a human in the foreground, interacting in a hands-free manner, and a voice assistant in the background, like an obscure technology we do not yet fully understand. It is clear that voice assistants still have a lot of catching up to do in understanding the nuances of human language.


Type

  • Group Project

Team Members

  • Neha Javalagi
  • Cassidy Adamcik
  • Semeon Mesfin

Project Timeline

  • February 2017 - May 2017

My Role

  • Researcher

Techniques

  • Literature Review
  • Evaluation of current technologies
  • Video

Motivation for the project

Natural Language Processing for interface design

Natural language in any form, whether written or spoken, is the primary medium of communication between human beings. Interaction between humans constitutes a dynamic interchange of signals carrying information. Generating signals based on the purpose of communication and understanding signals based on features of the interlocutor allow humans to communicate on numerous levels simultaneously. The use of natural language as the primary means of communication across the breadth of human interactions suggests that it can be applied to human-computer interaction as well.

Today, computers play an important role in every stage of the information life cycle, from collecting data to storing and transforming it. Hence, it is necessary that we develop systems that can understand and generate natural language, providing easier access to computing systems for solving problems and making decisions. When users interact with machines in an artificial programming language, there is an added onus on them to learn how commands must be expressed to complete the required task. Furthermore, modern graphical user interfaces (GUIs) alienate certain groups, such as the visually impaired and less technologically experienced users such as the elderly (Pierce, 2015). As digital environments have become increasingly complex to navigate, screen readers have been rendered ineffective and inefficient. By offering greater accessibility and more intuitive communication, a natural language interface can provide a better user experience. A natural language interface (NLI) accepts queries or commands in natural language, sends data to a retrieval system and communicates an appropriate response to the query or command given. Translating natural language statements into the actions to be performed is a challenging task for any NLI system. The latest advancements in NLP can be leveraged to design better interfaces and improve the user experience.
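The three stages of an NLI described above (accept a query, send data to a retrieval system, communicate a response) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the function names, the single regular-expression pattern and the in-memory lookup table are all hypothetical and stand in for the far more capable language understanding and retrieval components a real system would need.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for a retrieval system (a real NLI would query a database or API).
CAPITALS = {"france": "Paris", "japan": "Tokyo", "kenya": "Nairobi"}


@dataclass
class Query:
    intent: str
    entity: str


def parse(utterance: str) -> Optional[Query]:
    """Translate a natural language question into a structured query (toy pattern match)."""
    match = re.search(r"capital of ([a-z ]+)", utterance.lower())
    if match:
        return Query(intent="capital", entity=match.group(1).strip())
    return None


def retrieve(query: Query) -> Optional[str]:
    """Look the structured query up in the retrieval system."""
    return CAPITALS.get(query.entity)


def respond(utterance: str) -> str:
    """Run the full pipeline: parse, retrieve, then phrase a natural language reply."""
    query = parse(utterance)
    if query is None:
        return "Sorry, I did not understand that."
    answer = retrieve(query)
    if answer is None:
        return f"Sorry, I do not know the capital of {query.entity.title()}."
    return f"The capital of {query.entity.title()} is {answer}."


print(respond("What is the capital of France?"))
```

Even in this tiny sketch, the fragile step is the translation from free-form language to a structured query: any phrasing the pattern does not anticipate falls through to the fallback reply, which is the difficulty the paragraph above points to.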


Advancements in NLP

The ultimate goal of NLP is for future machines to understand and generate natural language as capably as humans. Research has shown that achieving this goal is far more difficult than was initially thought, as the problem involves many components such as understanding nuances in language, analysing prosody and handling dialogue. Most efforts in Natural Language (NL) interface design to date have therefore focussed on simple question answering systems, which perform information extraction and retrieval operations using NLP tools. Knowledge-based systems could outperform humans at playing chess (as IBM’s chess-playing computer Deep Blue did) or at making fact-based decisions, but they could not be used to imitate other intelligent human behaviour such as speech or image recognition. Gradually, the big data revolution brought developments in GPU technology for parallel computation, along with deep learning techniques that process these huge amounts of data to train Artificial Intelligence (AI). Speech recognition accuracy, language understanding and dialogue management have all seen great improvements (Kelly, 2014). Furthermore, thanks to structured data available on the semantic web and in knowledge bases such as Google’s Knowledge Graph, search engines can interpret the context and semantics of a query rather than return results based solely on keyword matching. This has facilitated efficient question answering for conversational interfaces. Other technological advancements that have fuelled the rise of conversational interfaces built on NLP include ubiquitous devices that capture contextual data, increased connectivity, and cloud computing resources that reduce computing costs.


Voice Interface Design

Voice interfaces have often been seen in science fiction as a vision of the future. In Star Trek, we see a 24th-century Captain Picard order “Tea. Earl Grey. Hot” in an awkward, almost robotic dialect. In the 21st century, in contrast to this envisioned dictation-style interface, we are seeing conversational interfaces that support a more interactive exchange of dialogue.

In 2011, Apple’s Knowledge Navigator vision and the Semantic Web agents vision were realized with the launch of the first Voice Enabled Assistant (VEB), Siri. Today, many of the world’s large companies have their own personal assistants, such as Google’s Google Assistant, Amazon’s Alexa, Microsoft’s Cortana, Facebook’s M, and Baidu’s Duer. While providing an innovative user experience, these VEBs also give these companies an edge in the market by helping them profile their users and customize their services for them. These interfaces are now fluid enough to allow for “mixed initiative dialogues” (Salgar, 2015). Amazon, Google, Microsoft, Nuance, and SoundHound allow developers to build specialised solutions on top of their conversational platform technology.

Enterprise and/or specialized VEBs assist professionals in their work or help customers get the assistance they need. IBM Watson for Oncology is one such specialized assistant, designed to help oncologists make evidence-based treatment decisions. Ask Anna, Next IT’s Alme and JetStar’s Ask Jess are examples of customer-facing VEBs that help customers find information.

In the 2013 movie ‘Her’ we see Theodore Twombly seeking support from his operating system ‘Samantha’. Later, he refers to their relationship as dating. We see that this is not far from reality today.

However, there is still more work to be done before conversational interfaces achieve a performance close to that of Samantha. Moore (2013) suggests that it is imperative that researchers draw inspiration from fields of research beyond speech to be better informed about how communication occurs. Other challenges include limitations in understanding natural language, a lack of accurate user models and a lack of tools and middleware to support research (Zadrozny et al., 2000).

Research Questions

Earlier accounts of human-machine interaction considered speech a medium for transmitting text, where components such as gestures, expressions and gaze were believed to have expressive value but no real semantic value. Historically, speech recognition technology has treated speech-based interfaces as text querying systems with an additional Speech-to-Text and Text-to-Speech converter. This enabled users to perform by voice tasks such as querying a database, which previously required them to type the input or read the output themselves. More recent work has considered “human spoken dialogue as a dynamic joint activity where participants collaborate to accomplish a common goal or resolve a coordination problem” (Carston, 1999). Today, in addition to collecting and processing auditory data, conversational interfaces must take on an active observational role to interact with humans more naturally.

While modern interfaces are designed to possess some of the linguistic characteristics of human interaction, they fail to capture nuances in language and social interactions.

Owei (2000) argues that “the drawbacks of most natural language interfaces to database systems arise due to their weak interpretative power caused by their inability to deal with the nuances in natural language”. The author further states that “by combining concept based DBQL paradigms with NL approaches, we can enhance the usage of a query interface.”

Another challenge is how to measure user engagement with a voice enabled assistant.

It seems that users stop engaging with VEBs after an initial period of excitement and experimentation. After encountering speech recognition errors or incorrect responses, users tend to revert to other, more dependable channels with which they are more comfortable. With the growing number of voice assistants available on the market, it is also often confusing for a user to choose between them. In the future, we can consider using the large volumes of data gleaned from interactions to experiment with combinations of different features and so define accurate methods for measuring user engagement. Future work could also use machine learning techniques to extract features from these interactions and feed them into a model, within the intelligent system itself, that predicts user engagement during the interaction process.

Developing interfaces with abilities such as exhibiting social behaviour and communicating with human beings while being situationally aware implies the need for better representations and for the design of new Artificial Intelligence architectures. In order for an interaction with a machine to be ‘natural’, we would need to define the robot of the future. As intelligent devices lack the capability to understand complex social behaviours and interactions, they may not be able to have emotional relationships.

Even with cutting-edge technology used to design intelligent systems, we still see these systems reinforce gender stereotypes. We see a disparity between the number of male-voiced assistants and the number of female-voiced assistants.

Does this reinforce the stereotype that an assistant’s job is a woman’s?

Giving natural language interfaces a gender seems imperative because it is crucial for humans to connect with them in a natural context. Humans use social cues in their interactions with intelligent devices. However, care must be taken that these do not lead to reinforcement of stereotypes or make room for incorrect assumptions regarding the purpose of interaction (Marchetti-Bowick, 2009).


Theme 1: NLP for mental health disorder diagnosis

Research Questions

Theme 2: NLP for language learning technologies

Research Questions

The final report will be uploaded soon!