OMAR HUMAN PERFORMANCE MODELING IN A DECISION SUPPORT EXPERIMENT
10 Moulton Street
Cambridge, MA 02138
Human performance models that simulate
the multiple task behaviors of the operators of complex systems are now
being developed that can, with appropriate discretion, be used to complement
the human players in real-world-like simulation environments. We have developed
and used human performance models for an air traffic control simulation
that was the basis for a decision support system experiment with human
subjects. The experiment is briefly described and the roles played by the
human performance models for air traffic controllers and flight crews are
discussed. The theory that forms the foundation for the development of
the human performance models and the Operator Model Architecture developed
to create the models are presented. Future directions for research based,
in part, on the experiment results are outlined.
While the theory supporting the development
of human performance models is in little more than its infancy, the fidelity
of the models themselves has reached the point that, if appropriate discretion
is used, they may be employed in a simulation environment to support experiments
examining important human factors issues. The models can be particularly
useful in building real-world-like simulation environments in which they
can provide the additional "human players" that it would otherwise be impractical
to include. In particular, human performance models played an important
role in a decision support experiment previously reported on by MacMillan,
Deutsch, and Young (1997). The Operator Model Architecture (OMAR) was used
in the development of the air traffic controller and aircraft pilot models
for the decision support experiment. Like the human subjects, the human
performance models were capable of goal-directed behaviors, were responsive
to impinging events, and, in general, exhibited human-like multiple task
behaviors. EPIC (Meyer & Kieras, 1997), SOAR (Laird, Newell, &
Rosenbloom, 1987), and MIDAS (Corker & Smith, 1993) are representative
of similar frameworks in which the modeling of multiple task behaviors
is being investigated. The roles of the human performance models in the
experiment, situated in an air traffic control domain, are discussed. The
theoretical foundations for the OMAR models are outlined briefly, important
differences from other modeling approaches are highlighted, and the relevant
features of the OMAR (Deutsch & Adams, 1995; Freeman, 1997) framework
used to construct the model are examined. Finally, a direction for future
research in human performance modeling based, in part, on the results of
the decision support experiment is suggested.
A DECISION SUPPORT EXPERIMENT IN AN AIR TRAFFIC CONTROL DOMAIN
Decision support systems can be expected to play a larger and larger role in the operation of complex systems. Last year MacMillan et al. (1997) reported on an experiment that examined two approaches to the design of a decision support system for use in an air traffic control setting. In that experiment, three conditions were developed: an unaided condition, a status display condition, and a priority display condition. Subjects were required to track multiple aircraft on a synthetic radar screen, and generate and respond to the messages necessary to accomplish the hand-off of aircraft to or from neighboring sectors in an appropriate timeframe. Subjects were required to obtain permission for an outbound aircraft to enter an adjacent sector from the controller for that sector and then convey a directive to the aircraft to enter the sector. In a similar manner, the subjects had to be prepared to accept inbound aircraft from neighboring sectors. The messages were composed and dispatched through a series of mouse gestures. Task load was varied by changing the number of aircraft in or entering the subject's sector during an experiment trial.
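The outbound hand-off described above follows a fixed order: request permission from the adjacent controller, receive it, then direct the aircraft to enter the new sector. A minimal Python sketch of that ordering as a small state machine is given here. The state names, message names, and the `apply_message` helper are hypothetical labels chosen for illustration; they are not taken from the experiment software.

```python
from enum import Enum, auto


class HandoffState(Enum):
    IN_SECTOR = auto()             # aircraft is still in the subject's sector
    PERMISSION_REQUESTED = auto()  # request sent to the adjacent controller
    PERMISSION_GRANTED = auto()    # adjacent controller has agreed to accept
    TRANSFERRED = auto()           # aircraft has been directed to enter


# Hypothetical message names mapped to (required state, next state).
TRANSITIONS = {
    "request_permission": (HandoffState.IN_SECTOR,
                           HandoffState.PERMISSION_REQUESTED),
    "permission_granted": (HandoffState.PERMISSION_REQUESTED,
                           HandoffState.PERMISSION_GRANTED),
    "direct_aircraft": (HandoffState.PERMISSION_GRANTED,
                        HandoffState.TRANSFERRED),
}


def apply_message(state, message):
    """Return the next hand-off state, or raise if the message is out of order."""
    required, next_state = TRANSITIONS[message]
    if state is not required:
        raise ValueError(f"{message!r} is not allowed in state {state.name}")
    return next_state
```

Under this sketch, an aircraft that has not been granted permission cannot be moved to `TRANSFERRED`, which corresponds to the flight crew holding while awaiting permission.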
In the unaided condition, the subjects had to maintain the status of arriving and departing aircraft either by remembering each aircraft's status or using the message log to refresh their representation of the emerging situation. In the status display condition, all aircraft having a pending action were highlighted and color-coded to identify the pending action category. In the priority display condition, the decision support system provided an algorithm that identified the aircraft with the highest priority pending action.
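The difference between the three conditions can be summarized by what each display highlights. The following Python sketch is illustrative only: the `Aircraft` fields, the numeric `priority` score, and the `highlighted` function are assumptions made for the example, and the actual prioritization algorithm used in the experiment is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Aircraft:
    call_sign: str
    pending_action: Optional[str] = None  # hypothetical category; None if nothing pending
    priority: int = 0                     # hypothetical urgency score; larger is more urgent


def highlighted(aircraft, condition):
    """Return (call sign, pending action) pairs the display would highlight."""
    pending = [a for a in aircraft if a.pending_action is not None]
    if condition == "unaided":
        return []  # the subject relies on memory and the message log
    if condition == "status":
        # every aircraft with a pending action, tagged by category for color-coding
        return [(a.call_sign, a.pending_action) for a in pending]
    if condition == "priority":
        if not pending:
            return []
        top = max(pending, key=lambda a: a.priority)
        return [(top.call_sign, top.pending_action)]
    raise ValueError(f"unknown condition: {condition}")
```

The status display thus supports maintaining the queue of pending tasks, while the priority display supports choosing among them.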
Figure 1. Air Traffic Control Display (Unaided Condition)
In their review of the literature on the cognitive management of multiple tasks, Adams, Tenney, and Pew (1994) suggested a framework for understanding multiple task workload that included, first, the maintenance and updating of the queue of to-be-attended tasks, and second, the resolution of the conflicts between high-priority tasks and planning for the transition points between tasks. The decision support systems provided in the status and priority conditions of the experiment were designed to examine the relative benefits of supporting the subject in the maintenance of the queue of pending tasks and that of the prioritization of pending tasks. The status display condition was designed to support the subject in the management of the queue of pending tasks. The priority display condition was designed to support the subject in prioritizing his or her response to multiple task demands.
The scenario for the experiment made significant demands on the modeling environment that supported it. These demands were rooted primarily in the real-world nature of the experiment. It was the richness of the modeling environment that made it possible to undertake the experiment in a real-world-like setting. The primary requirements were those for the workplace operated by the experiment subject acting as an air traffic controller and the need to provide models for the controllers in adjacent sectors and flight crews for the aircraft. The major workplace components were the synthetic radar screen, the message sending and receiving system, and the decision aids provided to the controllers.
The experiment subject managed the aircraft presented on the synthetic radar presentation of a square sector (see Figure 1) with additional sectors arranged so that there were four adjacent trapezoidal sectors. The adjacent sectors each required an air traffic controller to manage the traffic moving to or from that sector. Radar screen icons for the aircraft in the sector or approaching the sector provided the call sign for the aircraft and an indication of its direction of travel. The sector boundaries were displayed for the subject's sector and those portions of the adjacent sectors on screen. An inner square within the subject's sector provided the subject with notification of the appropriate point at which to initiate the transfer of an aircraft to an adjacent sector.
The messaging system occupied the right side of the user interface. Messages were composed through a series of mouse gestures, with action types (e.g., Transferring AC, Accepting AC) chosen from screen buttons and message subjects and objects (e.g., the ATC or aircraft to which the message was to be directed, the aircraft that was the object of the message) selected from their representations on the screen. The composed messages were available to be verified or corrected before being dispatched. The lower right-hand portion of the screen maintained a history of messages transmitted and received, enabling the subject to revisit the status of a particular aircraft.
Given our focus on human performance models, it is the roles of the controllers in the sectors adjacent to the subject and of the flight crews of the aircraft that are of interest here. The controllers in the adjacent sectors were called upon to carry out the same tasks as the subjects. Human subjects and human performance models had to perform tasks of the same complexity. The flight crews were required to generate and respond to messages using a messaging system similar to that of the controllers. If not given permission to enter an adjacent sector, they had to maintain their aircraft in a holding pattern while awaiting permission to enter the new sector. The flight crew's system differed only in the repertoire of messages that could be generated. The human performance models for the controllers are of particular interest since they were playing roles identical to those of the human subjects. They had to assemble and maintain the queue of pending actions and select actions based on their priority.
The OMAR simulation framework supported the development and use of human performance models in a number of important ways. A Concept Editor provided a graphical framework in which to define the objects in the experiment environment. A Procedure Browser provided a graphical view of the structure of the individual goals and procedures that determined model behaviors and network views of the goal and procedure calling and signal passing connectivity. Agent task and event timeline displays made it possible to closely examine model behavior at execution time.
The OMAR graphical user interface (GUI) builder, MIRAGE (Cramer, 1995), provided important support for developing simulation environments in which humans and human performance models may be used interchangeably as operators. User interface constructs, including those as complex as the synthetic radar screen and the messaging system, can be operated either by human players or by human performance models. The human performance models can "monitor" aircraft on the radar screen and operate the messaging system to generate air traffic controller messages in the same manner that the human subjects carried out their tasks. The OMAR Recorder, based on the work of Manning (1987), was used to establish the level of event recording to employ during an experiment trial. Human subjects were shadowed by a lightweight agent to facilitate the recording of their actions.
The OMAR simulation framework provided
the capabilities essential to developing and successfully carrying out
the decision support experiment. In particular, human performance models
for air traffic controllers and flight crews capable of human-like multiple
task performance were readily adapted to meet requirements that would otherwise
have been very difficult to provide in a realistic manner.
HUMAN PERFORMANCE MODELING IN OMAR
The operation of complex systems and equipment requires the cognitive management of multiple tasks. Even in the simplified air traffic control experiment discussed here, the progress of aircraft in the airspace must be tracked, their status must be remembered, actions must be prioritized and appropriate messages must be generated for aircraft in the sector and for neighboring controllers, and interrupts in the form of queries from aircraft flight crews and controllers in adjacent sectors must be handled. The interrupts are not unexpected, but rather meet expectations generated in tracking aircraft on the radar screen. Reactive behaviors are determined within the framework of the operator's goals. In meeting their responsibilities, the air traffic controllers have a significant number of cognitive tasks in process. The scenario creates a situation in which many aircraft must be tracked and the response to demands must be carefully prioritized to achieve acceptable performance.
The desire to conduct the experiment in this real-world-like setting created a special set of modeling requirements. In particular, the controllers in the adjacent sectors were required players, as were the flight crews of the aircraft transiting the sectors. The only practical means to meet these requirements was to provide human performance models for these players. The demands placed on the models for the air traffic controllers and the flight crews were very similar. The discussion that follows focuses on the models for the air traffic controllers.
While meeting the demands of the experiment was a short-term objective in developing the air traffic controller models, the principal focus in the development of the models has been on achieving a better understanding of the foundations for human multiple task behavior. The modeling framework provided by OMAR with respect to portraying multiple task behaviors differs from other human performance modeling frameworks in several important respects. Unlike EPIC (Meyer & Kieras, 1997) and SOAR (Laird et al., 1987), it is not rule-based. There is no fixed time-step at which the decision process for all in-process or pending tasks is revisited by processing rule sets. Like MIDAS (Corker & Smith, 1993), the goals and procedures that combine to represent the execution of a task are explicitly represented. However, MIDAS is also tick-based and employs an explicit decision process in the form of a scheduler invoked at each tick of the simulator clock. EPIC, SOAR, and MIDAS each employ a meta-level process that reasons over executing procedures, determining which is to control the agent's action in the next time step. There is scant neuropsychological evidence for such a centrally located executive. In the models developed in OMAR, special attention has been paid to how task contention might be arbitrated in the absence of a centrally located homuncular executive function.
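The contrast with tick-based frameworks can be illustrated with a minimal Python sketch of an event-driven loop, in which simulated time jumps from one event to the next and each procedure determines when its own next step completes. This is a generic discrete-event pattern offered as an assumption about what "no fixed time-step" implies; it is not OMAR's implementation, and the `run_event_driven` function and its inputs are hypothetical.

```python
import heapq


def run_event_driven(procedures, until):
    """Advance simulated time from event to event rather than by fixed ticks.

    `procedures` maps a procedure name to a list of step durations.
    Each procedure schedules its own next completion time; no scheduler
    revisits every task at every step.
    """
    queue = []  # entries are (completion_time, name, step_index)
    for name, steps in procedures.items():
        if steps:
            heapq.heappush(queue, (steps[0], name, 0))

    trace = []
    while queue and queue[0][0] <= until:
        time, name, step = heapq.heappop(queue)
        trace.append((time, name, step))
        steps = procedures[name]
        if step + 1 < len(steps):
            # the procedure itself determines when its next step finishes
            heapq.heappush(queue, (time + steps[step + 1], name, step + 1))
    return trace
```

Nothing in the loop inspects procedures that have no event due, which is the property that distinguishes it from a scheduler invoked at every clock tick.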
The motivation for the computational framework designed to model these behaviors was derived primarily from selected research in cognitive neuroscience, experimental psychology, and recent cross-disciplinary work in the theory of consciousness. Neumann's (1987) functional view of attention, and the localization of mental operations in the brain, as put forward by Posner, Petersen, Fox, and Raichle (1988) are important components in this foundation. Taken together, they point to the functional components in task execution as taking place at particular local brain centers with the coordinated operation of several such centers being required to accomplish any given cognitive task. The functional task breakdown among centers is perhaps best understood for visual and auditory processing. The form that the coordination might take is of particular importance in developing a model of behaviors.
In this framework, there is the potential for a significant amount of parallel computation, and it is at the same time evident that there are bounds to that parallelism. Recent work in several disciplines has suggested the forms that this parallelism might take. Edelman (1987) speaks of the degeneracy of reentrant nets in which the same functionality might be provided by several different brain centers. In the instance theory of automaticity, Logan (1988) proposes the activation of multiple memory traces as the basis for learned responses to familiar situations. In the early stages of this process, problem solving may go on concurrently with memory retrieval. And Dennett (1991) discusses a Multiple Drafts theory of consciousness in which there are always several contending syntheses of perceived events in process.
In developing OMAR, we have sought to provide a computational framework in which to assemble functional capabilities that operate in parallel subject to appropriate constraints and will exhibit the multiple task behaviors of human operators. The desired behaviors have a combination of proactive and reactive components. That is, the operators have an agenda that they are pursuing, but must also respond to events as they occur. The bounds on what can be accomplished concurrently take several forms. A typical behavior may be to set aside an in-person conversation in order to respond to a telephone call, while at a simpler level, two tasks that require the use of the dominant hand cannot both have access to that hand.
The core of an OMAR model is a network of procedures whose signal-driven activation varies in response to events that are channeled to achieve the operator's goals. The procedures represent an important component of the long-term memory of the operator: what that operator knows how to do in the world. The proactive disposition of the operator is set up by the goals of the operator. Each goal is expressed in a plan made up of sub-goals and procedures. The canonical forms that the goals take are important elements of the operator's long-term memory. The currently activated goals represent the operator's proactive agenda for managing his or her tasks. The outputs of the operator's sensors, eyes and ears in the current model, impinge on visual or auditory processing procedures in the procedure network. Subsequent signals activate procedures that interpret the sensory inputs and lead to goal-directed responses appropriate to the evolving situation.
As an OMAR operator model is initiated, it is the operator's goals that are initiated. In the air traffic controller model used in the experiment, the controller model "looks" at the radar screen and initiates a series of tasks to attend to the aircraft on the screen. Cognitive tasks include managing the transfer of aircraft to adjacent sectors and anticipating and processing requests from neighboring sectors to accept new aircraft. Important sub-goals address identifying and maintaining an agenda of pending actions and prioritizing the execution of those actions, the central focus of the experiment. Additional supporting goals include maintaining communication with neighboring sector controllers and with the aircraft in the sector. As the plans for the operator's goals are initiated, appropriate sub-goals are invoked, and procedures are executed.
Once through the initialization procedures, a process similar to a controller sitting down at a radar console at the start of a shift, the procedure network has a number of active nodes. Goal nodes determine the on-going proactive actions of the operator. Procedure nodes may be in a wait-state, typical of tasks whose function it is to anticipate and maintain vigilance for future events related to aircraft on the screen. The activated nodes of the network form a pattern matcher with a temporal dimension such that node activation evolves in response to impinging events. The response of active nodes alters the activation of downstream nodes, eventually connecting to goal-directed nodes that govern the proactive response to emerging events. The network of activated proactive and reactive nodes links functional capabilities and, in effect, generates the operator's multiple task behaviors.
As an example, an auditory input, the spoken message of a radio-based conversation, initiates an auditory procedure that collects and briefly remembers the spoken message. Early on in the auditory processing, the auditory procedure generates a signal that, in turn, initiates a non-conscious cognitive procedure that forms the propositional content of the message. And, once again, early on in the execution of the cognitive speech/language understanding process, another signal is generated that activates the processes that, taken together, represent the operator's thoughtful management of the new conversation. The operator's management of the conversation is governed by one or more of the operator's goals. In our example it might be the anticipated call from a neighboring controller asking that an approaching aircraft be accepted in the sector. Hence, the procedure network links reactive and proactive behaviors. The onset of the spoken message initiates an auditory process whose signal initiates a cascade of signals activating functional centers, which, taken together, have the mix of capabilities to conduct the conversation and relate its content to purposeful goals.
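The cascade in this example can be sketched in Python as a set of procedures that wait on named signals and emit further signals when activated. The sketch is a simplification under stated assumptions: the `ProcedureNetwork` class, the signal names, and the three procedure functions are hypothetical, and each procedure here runs to completion before the next starts, whereas the text describes signals being generated early so that the stages overlap in time.

```python
from collections import defaultdict, deque


class ProcedureNetwork:
    """Procedures wait on named signals and may emit further signals."""

    def __init__(self):
        self._waiting = defaultdict(list)  # signal name -> waiting procedures

    def on(self, signal, procedure):
        self._waiting[signal].append(procedure)

    def emit(self, signal, payload):
        """Deliver a signal and all signals it triggers; return activation order."""
        activated = []
        pending = deque([(signal, payload)])
        while pending:
            name, data = pending.popleft()
            for procedure in self._waiting[name]:
                activated.append(procedure.__name__)
                # each procedure returns the (signal, payload) pairs it emits
                pending.extend(procedure(data))
        return activated


def hear_message(sound):
    # auditory procedure: collects and briefly remembers the spoken message
    return [("speech-heard", sound)]


def understand_message(sound):
    # non-conscious procedure: forms the propositional content
    return [("proposition-formed", {"request": sound})]


def manage_conversation(proposition):
    # goal-governed procedure: the thoughtful response would be chosen here
    return []
```

A usage example: registering the three procedures on `"sound-onset"`, `"speech-heard"`, and `"proposition-formed"` respectively and then calling `emit("sound-onset", ...)` activates them in that order, with no central component deciding which runs next.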
The process of initiating a spoken transaction is a similar one. A thoughtful goal-related cognitive procedure generates the content of the message, a second procedure simulates the generation of the form that the message will take, and a third procedure represents the enunciation of the message. In the air traffic controller environment where communication is by party-line radio, the timing for initiating a transaction is important. An air traffic controller may be awaiting the response from an aircraft on a previous message as the time approaches to initiate another unrelated transaction. Barring unusual circumstances, policy dictates that the in-process transaction be completed before the new transaction is initiated. Within the model, a priority is associated with the set of procedures governing each transaction. By virtue of being active, the in-process transaction has sufficient priority to block the newly formed transaction procedure nominally of the same priority. The two transaction-governing procedures are classified as being in conflict with one another. When one task is in process and the conditions are established such that a second is a candidate to run, the conflict is resolved between the two tasks on the basis of priority. In the case outlined here it is established policy as automatically implemented by the controller that forms the basis for conflict resolution. This is representative of a conflict between thoughtful cognitive tasks that are relatively high in a goal-plan hierarchy. Conflicts can also occur lower in the hierarchy. The conflict may be quite straightforward as in two procedures each requiring visual guidance in task execution and hence each needing the eyes, or each needing the dominant hand for skilled fine motor control in the execution of a task.
On-going tasks determine their own
execution times and run to completion unless another procedure defined
as a competing procedure with greater priority intervenes. The blocked
task may be defined to resume operation at the point of interruption or
at an earlier point in its execution. A thoughtful cognitive act of deciding
on the next action is modeled as just that, another procedure that determines
the action to follow. Importantly, a broad range of thoughtful and non-conscious
decision making is represented without resorting to a central executive
responsible for scheduling future actions.
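The priority-based arbitration between conflicting procedures can be sketched in a few lines of Python. The sketch assumes that conflict is declared by a shared label and that being active adds a fixed bonus to a procedure's priority, which is one simple way to let an in-process transaction block a new one of nominally equal priority. The `Procedure` fields, `ACTIVE_BONUS`, and the helper functions are hypothetical and are not OMAR constructs.

```python
from dataclasses import dataclass

ACTIVE_BONUS = 1  # assumed: being in process is worth one priority level


@dataclass
class Procedure:
    name: str
    priority: int
    conflict_class: str   # hypothetical label, e.g. "radio" or "dominant-hand"
    active: bool = False


def effective_priority(proc):
    """Nominal priority, raised while the procedure is in process."""
    return proc.priority + (ACTIVE_BONUS if proc.active else 0)


def in_conflict(a, b):
    """Two procedures compete only if they are classified as conflicting."""
    return a.conflict_class == b.conflict_class


def winner(running, candidate):
    """Return the procedure that proceeds; the other one is blocked."""
    if not in_conflict(running, candidate):
        raise ValueError("procedures do not compete; both may run")
    if effective_priority(candidate) > effective_priority(running):
        return candidate  # a strictly higher priority intervenes
    return running        # otherwise the on-going task runs to completion
```

The decision is made pairwise between the procedures involved, at the moment the second becomes a candidate to run, rather than by a scheduler that reasons over all tasks. Where a blocked procedure resumes, whether at the point of interruption or earlier, would be a further property of each procedure and is not modeled here.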
FUTURE RESEARCH DIRECTIONS
It was possible to conduct the decision support experiment in a real-world-like environment, in large part, through the development of human performance models capable of managing the same mix of tasks and task interruptions that the experiment subjects encountered. However, the human performance model trials were conducted only with the unaided system. They did not make use of the decision support systems that were employed in the human subject trials. And indeed, from the perspective of developing human performance models, this leads to a most interesting direction for future research.
As we have seen, the decision support
systems took two forms: in the status display, all of the aircraft on the
radar screen that had an action pending were highlighted, while in the
priority display, only the single aircraft with the most urgent pending
action was highlighted. Several variations on these support systems were
also evaluated, but time constraints precluded using them in the experiment.
As discussed in MacMillan et al. (1997), the decision support systems each
had a significant impact on subject performance. The human performance
models should be reexamined and modified to make appropriate use of the
decision support systems. The underlying question is that of just what
the basic functionalities (or limitations) are that the decision aids support
and how they combine to produce improved performance. Taking advantage
of the OMAR capability to interchange human subjects and human performance
models, the experiment trials should be rerun with the new models as subjects.
Several iterations on this process should provide insight into the particular
aspects of human performance that contribute to improvements that the several
decision support systems make possible.
This research was supported by the
Department of the Air Force Contract F33615-91-D-0009.
Corker, K. M. & Smith, B. R. (1993). An architecture and model for cognitive engineering simulation analysis: Application to advanced aviation automation. Proceedings of the AIAA Computing in Aerospace 9 Conference, San Diego, CA.
Cramer, N. L. (1995). MIRAGE: A CLIM-based editor for building gadget-oriented graphical user interfaces. Proceedings of the Association of Lisp Users Meeting and Workshop, Cambridge, MA.
Dennett, D. C. (1991). Consciousness explained. Boston, MA: Little, Brown and Company.
Deutsch, S. E., & Adams, M. J. (1995). The operator-model architecture and its psychological framework. 6th IFAC Symposium on Man-Machine Systems. MIT, Cambridge, MA.
Edelman, G. M. (1987). Neural Darwinism: The theory of neuronal group selection. New York: Basic Books.
Freeman, B. (1997). OMAR User/Programmer Manual, Version 2.0. BBN Report No. 8181. Cambridge, MA: BBN Corporation.
Laird, J. E., Newell, A., & Rosenbloom, P. S. (1987). SOAR: An architecture for general intelligence. Artificial Intelligence, 33, 1-64.
MacMillan, J., Deutsch, S. E., & Young, M. J. (1997). A comparison of alternatives for automated decision support in a multi-tasking environment. Human Factors and Ergonomics Society 41st Annual Meeting, Albuquerque, NM.
Manning, C. R. (1987). Acore: The design of a core actor language and its compiler. Master's thesis, Massachusetts Institute of Technology. Cambridge, MA: MIT.
Meyer, D. E., & Kieras, D. E. (1997). A computational theory of executive cognitive processes and multiple-task performance: Part 1. Basic mechanisms. Psychological Review, 104, 3-65.
Logan, G. D. (1988). Automaticity, resources, and memory: Theoretical controversies and practical implications. Human Factors, 30, 583-598.
Neumann, O. (1987). Beyond capacity: A functional view of attention. In H. Heuer & A. F. Sanders (Eds.), Perspectives on perception and action. London: Lawrence Erlbaum.
Posner, M. I., Petersen, S. E., Fox, P. T., & Raichle, M. E. (1988). Localization of cognitive operations in the human brain. Science, 240, 1627-1631.