Dialogue Management Systems

To make computer interfaces more human-like, agent-based interfaces have been proposed, in which the user communicates with a virtual character through multimodal dialogue. These Embodied Conversational Agents (ECAs) have traditionally not had dialogue systems properly embedded in them. Companions plans to integrate robust dialogue capability as a core technology for next-generation ECAs.

Dialogue Management Systems (DMS) are complex technologies that combine many types of Natural Language Processing and Artificial Intelligence components, including information extraction, knowledge representation and planning.

Platforms and Dialogue Managers for ECAs

A great deal of work has been done to achieve flexible and robust interaction with compact software agents, and this can be seen as an extension of distributed DM architectures such as Galaxy Communicator (Seneff et al. 1998). These approaches include the agenda-based dialogue management architecture (Rudnicky et al. 1999) and its RavenClaw extension (Bohus and Rudnicky 2003), Queen's Communicator (O'Neill et al. 2003), SesaME (Pakucs 2003) and Jaspis (Turunen et al. 2005).

In these approaches, dialogue management is often implemented in an object-oriented way. Most importantly, inheritance is used to separate generic dialogue management from domain-specific actions. The modular agent-based approach to dialogue management makes it possible to combine the benefits of different dialogue control models, e.g. state-based dialogue control and frame-based dialogue control.
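The separation of generic dialogue management from domain-specific actions through inheritance can be illustrated with a minimal sketch. It is not taken from any of the systems cited above; the class names, the slot names and the train-booking domain are all hypothetical.

```python
class DialogueAgent:
    """Generic dialogue management: slot tracking and prompting."""

    slots = ()  # domain subclasses list the slots they need (hypothetical design)

    def __init__(self):
        self.filled = {}

    def handle(self, slot, value):
        # Generic step: record the value, then either prompt or act.
        if slot in self.slots:
            self.filled[slot] = value
        missing = [s for s in self.slots if s not in self.filled]
        if missing:
            return f"Please give the {missing[0]}."
        return self.perform()

    def perform(self):
        # Domain-specific action, supplied by subclasses.
        raise NotImplementedError


class TrainBookingAgent(DialogueAgent):
    """Hypothetical domain: only the slots and the final action are defined here."""

    slots = ("origin", "destination")

    def perform(self):
        return (f"Booking a train from {self.filled['origin']} "
                f"to {self.filled['destination']}.")


agent = TrainBookingAgent()
print(agent.handle("origin", "Leeds"))
print(agent.handle("destination", "York"))
```

The prompting logic lives only in the base class, so a new domain would need to supply just its slots and its final action.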

Similarly, the benefits of alternative dialogue management strategies, such as the system-initiative approach and the mixed-initiative approach, can be used together in an adaptive way. For example, in the DUMAS FP5 project, multiple agents were used for the same purpose, making it possible to combine rule-based and machine learning approaches in the multilingual AthosMail email application (Turunen et al. 2004).

Building an integration platform

Companions middleware provides the basis for an integration platform for a multimodal, mobile, computing interface. Of special relevance for the project are middleware architectures that aim to support the development of:

  1. dialogue systems
  2. mobile systems
  3. multimodal interfaces

Existing dialogue architectures include TRINDIKIT (Larsson and Traum 2000), CONVERSE (Levy et al. 1997), WITAS (Lemon et al. 2001), COMIC (Catizone et al. 2003), and AthosMail (Turunen et al. 2004).

In mobile computing there are several interesting platforms such as the Context Toolkit (Dey 2000) and OpenTrek (Sanneblad et al. 2003) for seamless WLAN connectivity, and the MobiTip system (Rudström et al. 2004) for Bluetooth-based communication and distributed collaborative filtering.

For multimodal interfaces, existing architectures have been custom built for specific projects such as Verbmobil, SmartKom and Embassi.

The Trindi project (Larsson and Traum 2000) proposes an architecture and toolkit for building dialogue managers based on an information state and a dialogue move engine. The information state of a dialogue represents the information necessary to distinguish it from other dialogues: it records the cumulative additions from previous actions in the dialogue and motivates future action. The approach can be seen as an attempt to make a finite state system more plausible as a general architecture for DM, when combined with other components expressing the overall 'information-state' of the system.
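As a rough illustration of the information-state idea, the sketch below keeps a small state record and updates it once per dialogue move. This is not the TRINDIKIT API; the field names, the move format and the two update rules are assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class InformationState:
    # Hypothetical fields; real information states are considerably richer.
    shared_facts: dict = field(default_factory=dict)  # what both parties accept
    agenda: list = field(default_factory=list)        # actions still pending
    history: list = field(default_factory=list)       # moves made so far


def apply_move(state, move):
    """Update the state for one dialogue move and return it."""
    kind, content = move
    state.history.append(move)
    if kind == "ask":
        # A question puts an obligation to answer on the agenda.
        state.agenda.append(("answer", content))
    elif kind == "answer":
        key, value = content
        state.shared_facts[key] = value
        # The matching obligation is now discharged.
        state.agenda = [a for a in state.agenda if a != ("answer", key)]
    return state


state = InformationState()
apply_move(state, ("ask", "city"))
apply_move(state, ("answer", ("city", "Paris")))
print(state.shared_facts, state.agenda)
```

The history accumulates past moves while the agenda motivates the next action, mirroring the two roles the paragraph above gives the information state.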

Examples of dialogue systems

CONVERSE (Levy et al. 1997) was a machine dialogue system funded by Intelligent Research of London. It had no conventional analysis / DAM / generation division. Its control structure was a simple blackboard system in which augmented transition networks (ATNs) competed to take control of the generation; these decisions were made numerically, using weights assigned according to factors such as how closely the input fitted the expected input.
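The kind of numerical competition described above can be sketched as follows. This is only an illustration of weighted selection, not a reconstruction of CONVERSE: the keyword-overlap measure, the script names and the weights are all assumptions.

```python
def closeness(expected, words):
    """Fraction of expected keywords found in the input (illustrative measure)."""
    expected = set(expected)  # assumed non-empty
    return len(expected & set(words)) / len(expected)


def select_script(scripts, utterance):
    """Return the name of the script with the highest weighted fit."""
    words = utterance.lower().split()
    scores = {
        name: weight * closeness(expected, words)
        for name, (weight, expected) in scripts.items()
    }
    return max(scores, key=scores.get)


# Hypothetical scripts: name -> (weight, expected keywords)
SCRIPTS = {
    "pets": (1.0, ["dog", "cat", "pet"]),
    "weather": (0.8, ["rain", "sun", "weather"]),
}

print(select_script(SCRIPTS, "is the weather rain or sun"))
```

Each candidate posts a score and the highest one takes control of the next output, which is the essence of the blackboard arrangement.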

WITAS (Lemon et al. 2001) contains a dialogue interface for multimodal conversations to the WITAS robot helicopter. The Dialogue Manager creates and updates an information state corresponding to a notion of dialogue context. Dialogue moves have the effect of updating information states and moves can be initiated by both the operator and the robot.

SmartKom (Alexandersson and Becker 2001) is a multimodal dialogue system that combines speech, gesture and facial-expression input and output within an overall DM architecture of a blackboard type. One of the major scientific goals of SmartKom is to design new computational methods for the seamless integration and mutual disambiguation of multimodal input and output on a semantic and pragmatic level.

COMIC (Catizone et al. 2003) was an FP5 IST project which applied research in human-human interaction to human-computer interaction. The application of COMIC was bathroom design, and it combined speech and gesture input/output with an avatar that generated facial emotion. The DM in COMIC was designed at the University of Sheffield as a general-purpose dialogue management system in which the domain data is kept separate from the DAM control mechanism.

The domain data is expressed using Dialogue Action Forms (DAFs), which are augmented transition networks: a series of nodes and their connecting arcs, each arc containing tests and the corresponding actions. In order to create and modify the DAFs, a GUI editor (DAF editor) was developed. The general-purpose nature of the DAM means that it could easily be adapted to other dialogue systems with a minimum of application-specific reorganisation.
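A network of nodes and arcs carrying tests and actions might look roughly like the sketch below. It is not the COMIC implementation or the DAF file format; the class, the arc layout and the tile-colour example are hypothetical.

```python
class DAF:
    """A tiny network: each node maps to a list of (test, action, next_node) arcs."""

    def __init__(self, arcs, start):
        self.arcs = arcs
        self.node = start

    def step(self, utterance, context):
        # Follow the first arc whose test accepts the input.
        for test, action, next_node in self.arcs.get(self.node, []):
            if test(utterance):
                action(utterance, context)
                self.node = next_node
                return True
        return False  # no arc matched; stay on the current node


# Hypothetical domain data, kept apart from the control code above.
ARCS = {
    "ask_colour": [
        (lambda u: u in {"white", "blue"},   # test on the input
         lambda u, c: c.update(colour=u),    # action on the shared context
         "done"),
    ],
}

daf = DAF(ARCS, "ask_colour")
context = {}
daf.step("blue", context)
print(daf.node, context)
```

Because the arcs are plain data, a different application would replace only the ARCS table, which is the separation of domain data from control that the text describes.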

Updated: 12 November 2007 15:43 PM