Multi-disciplinary Foundations for Multiple-task Human Performance Modeling in OMAR

Stephen Deutsch
(sdeutsch@bbn.com)
BBN Technologies
10 Moulton Street
Cambridge, MA 02138
 


Abstract

The design of procedures for the human operators of complex systems can be explored through simulation if human performance models of sufficient fidelity can be developed. The Operator Model Architecture (OMAR) was created to provide a computational framework to address the development of multiple-task models. A multi-disciplinary foundation that reached beyond the experimental psychology and artificial intelligence literatures was considered essential to the construction of successful models. Brain imaging and clinical studies suggest that tasks are assembled as the coordinated execution of function-specific perceptual, cognitive and motor capabilities. These studies, together with philosophically grounded cautions, further suggest that the mediation of task contention be accomplished in a framework that avoids a homunculus-based executive. OMAR provides a computational framework for building models sensitive to these considerations. Examples from a commercial air traffic control environment are used to illustrate OMAR modeling capabilities.

    1. Introduction

    The Operator Model Architecture (OMAR) provides a simulation environment in which to model human operators, the workplaces at which they operate and the entities of the larger world that are reflected in their workplaces. An important goal has been to provide human performance models with sufficient fidelity to usefully explore and develop operating procedures for complex environments. Much of the research has focused on the commercial air traffic control environment with aircrews and air traffic controllers as the principal players. Each of the players typically has several tasks in process and interruptions are commonplace. To address the fidelity requirement, the OMAR operator models must exhibit reasonable multiple-task behaviors.

    The modeling of multiple-task behaviors has been explored extensively in EPIC (Meyer & Kieras, 1997), and SOAR (Newell, 1990; Laird, Newell, & Rosenbloom, 1987) has also been adapted to model multiple-task behaviors. In particular, Meyer and Kieras (1997) report considerable success in developing a production rule-based model of the psychological refractory period (PRP) procedure. The basic components of their model are a cognitive processor comprising a production rule interpreter with inputs from long-term and production memory, and a working memory, with auditory and visual processor inputs, that interacts with the production rule interpreter. The model relies heavily on a centralized, synchronous production rule framework. A production rule-based executive process administers the task scheduling strategy for regulating competing task execution. The implementation is just one of a theoretically infinite number of computational frameworks that might give rise to the desired human-like multiple-task behaviors. In building the OMAR framework, particular attention has been paid to developing multiple-task behaviors from an assembly of concurrently operating functional centers absent the executive or central controller.

    The motivation for this approach to human performance modeling, derived from a selective reading in several disciplines, is outlined in Section 2. Section 3 provides background on the aircrew/ATC domain and describes the implementation of the models of aircrew in-person conversations and their interruption by ATC directives. Section 4 provides a description of the computational elements for constructing OMAR human performance models.
     
     

    2. Multi-disciplinary Foundations for Modeling Multi-tasking in OMAR

    The process of building a human performance model capable of emulating the operators of a complex system is a somewhat speculative endeavor at best. Drawing on the research from a number of disciplines, a modest goal of this particular undertaking has been to put in place a more brain-like distributed processing framework from which to explore some of the human performance issues, related principally to multiple-task behaviors, that impact the operation of complex systems. The modeling environment developed is symbolic rather than connectionist, but does not preclude the inclusion of connectionist components.

    Over the years, experimental psychologists have conducted extensive experiments providing a wealth of interpreted data; philosophical discussion dates back through the millennia; and, more recently, cognitive neuroscience and clinical studies have provided electroencephalography (EEG), magnetoencephalography (MEG), positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) images of the brain at work (Posner, 1993; Raichle, 1994), identifying the locus of specific perceptual, cognitive and motor functionalities. There is a vast literature, but no road map for modeling human performance. The computational architecture for models developed in the OMAR framework differs from that of EPIC and SOAR in fundamental ways: (1) stimuli impinge directly on, activate, and propagate through long-term procedural memory, the knowledge of how to do things (see Figure 1); (2) tasks, skilled cognitively-driven behaviors, are accomplished through the coordinated actions of function-specific procedures representing the contributions of specific brain areas; (3) to the extent that the resulting behaviors may be considered intelligent, that intelligence is the product of the pattern matching implicit in the changing sensitivities of the network of procedures as stimuli evoke responses at network nodes; (4) task contention outcome, rather than being determined by a central executive, is mediated on a pair-wise basis among contending tasks. The foundations for these choices are briefly outlined in this section.
     


    Figure 1 The OMAR Cognitive Agent


     


    The now routine accounts of the early PET studies and the more recent fMRI studies portray the execution of each experiment as being the product of a small number of brain centers: small areas of activity at widely dispersed major brain centers. Posner, Petersen, Fox, and Raichle (1988) draw on the evidence of a series of their PET experiments to suggest that "the mental operations that form the basis of cognitive analysis are localized in the human brain." To further support their assertion of the localization of cognitive function, they cite studies of patients with lesions and their related deficits. Based on these studies, the basic architectural framework seems reasonably well established. Tasks, made up of perceptual, cognitive and motor components, appear to be accomplished through the collective actions of small specialized areas of activity that take place in each of several widely dispersed brain centers.

    On a closely related but more conjectural plane, Edelman (1987) discusses the psychological functions of "development, perception (in particular, perceptual categorization), memory, and learning" and how they relate to the brain. Edelman (1989) extends his analysis to consider "perceptual experience—the interaction of memory with the present awareness of the individual animal," that is, perceptual awareness and conscious experience. He describes neural maps as the ordered arrangement and activity of large groups of neurons as distinct from single-neuron connections. They are highly and individually variant in their intrinsic connectivity. Changes in the behavior of the network are the result of changes within particular populations of synapses. "These structures provide the basis for the formation of large numbers of degenerate neuronal groups in different repertoires linked in ways that permit reentrant signaling" (Edelman, 1987, p. 240) where, in degenerate systems, functional elements in a repertoire may perform more than one function and a function may be performed by more than one element (Edelman, 1987, p. 57). Reentry is a basic mechanism suitable for synchronizing the neuronal activity across the mappings at diverse hierarchical levels. Global mappings have a dynamic structure that reaches across reentrant local maps and unmapped regions of the brain to account for the flow from perception to action. Motor activity, an essential input to perceptual categorization, closes the loop.

    Taken together, Posner et al. and Edelman present a picture of the execution of a task as the coordinated activities of small, specialized local sites operating at several remotely located brain centers. In Edelman’s terms, reentrant signals link the components within the local sites, while global mappings connect the activities of the broadly dispersed major centers. The OMAR models attempt to emulate this basic computational framework. That the smallest operating units are large groups of neurons is taken as license to build the models at a symbolic level.

    Edelman, referencing Bartlett (1932), goes on to present a view of memory as process. For him, memory is the "ability to categorize or generalize associatively" (Edelman’s italics, 1987, p. 241). Categorization occurs at the level of a global map and is degenerate. Edelman is well aware of the distinctions between declarative and procedural memory, but he is also quick to point out that these distinctions may be less than generally assumed. He suggests that there may be a procedural base supporting declarative memory.

    In Edelman’s view of memory as process, perception, categorization, generalization, and memory are closely linked. "Memory is a form of recategorization based upon current input; as such, it is transformational rather than replicative" (Edelman, 1987, p. 265). Memory is an active process of classification leading to recategorization and, thus, a partitioning of the world that is presented as one "without labels." Storage, to the extent that it exists, is one of procedures for mapping inputs to responses; hence, full representations of objects are neither stored nor required: "It is the complex of capacities to carry out a particular set of procedures (or acts) leading to recategorization that is recollected" (Edelman’s italics, 1987, p. 267). This view contrasts sharply with memory cast as data residing in a database, where content is passive, references are made to it, it may fade with time, and in the case of short-term memory, new memories may reinforce or replace existing memories. In such schemes, something operates on memory as data, reinforcing some of it and degrading other parts of it. In the models developed, memory is an integral part of the processes that employ it.

    Production rules have played a central role in cognitive modeling systems (e.g., EPIC and SOAR). As a computational tool they are convenient and have been exploited to build some simple models of human learning (Laird, Rosenbloom, & Newell, 1986). On the other hand, they have all the trappings of an executive: in their conditions they may have oversight of one or more active tasks and memory stores, while in their actions they may initiate, interrupt or terminate tasks and execute operations on memory or other capabilities central to the functioning of a model. The existence of such a brain center is certainly open to question. Neither the clinical research that has extended over many years nor the more recent PET and fMRI imaging studies have identified a potential location for such an omnipotent brain function. Dennett (1991) expresses considerable concern over such homuncular theories. Centering his discussion around the metaphor of the Cartesian Theater where everything comes together, he suggests that the theater provides a catchall for awkward elements, leading to a failure to address difficult underlying questions. Dennett offers a Multiple Drafts model of consciousness in which "all varieties of perception—indeed all varieties of thought or mental activity—are accomplished in the brain by parallel, multi-track processes of interpretation and elaboration of sensory inputs." He speaks in terms of an on-going process of "editorial revision." Dennett reinforces parallel processing as essential to modeling task execution and reminds us to be firm in our disavowal of homuncular concepts in modeling human performance. Following Dennett’s admonition, the models do not employ an executive or controlling process.
     
     

    3. An Aircrew/Air-Traffic-Control Scenario

    While scenarios in the commercial air traffic control domain can be developed to an arbitrary level of complexity, even the simplest scenarios can make multiple-task demands on the aircrews and air traffic controllers (ATC).

    3.1 Aircrew/Air-Traffic-Control Communication

    Verbal communication, frequently the point of convergence for task contention in the air traffic control environment, takes place in three modes: in-person conversation between aircrew members, party-line radio communication among aircrews and the ATC managing their airspace, and telephone communication between ATCs in adjacent sectors. At the discretion of the aircraft captain, either the captain or the first officer may undertake the task of handling ATC communication. The aircrew member not handling the ATC communication will monitor all ATC communications except for occasional periods when communication with, for example, a company dispatcher is required. The party-line nature of radio communication means that ATC communication with each aircraft is heard by the aircrews of all aircraft under control of that ATC. Hence, an ATC will identify the designated aircraft call sign as the first segment of an utterance.

    Conversation on the flight deck between the captain and first officer is the more typical person-to-person conversation of everyday life, but it is subject to interruption by ATC communication. The interruptions may take the form of directives addressed to their aircraft or to another aircraft under control of the ATC. In the interests of clarity and efficiency, most of the aircrew/ATC communications are highly stylized exchanges initiated with a directive or a question and completed by an acknowledgment of the directive or a response to the question. Established policy plays an important role in these exchanges. Verbal transactions between aircrew members must be suspended for ATC-initiated communication, even when the communication is directed to another aircraft. The crew must remember to resume the transaction on completion of the ATC interruption. An aircrew member wishing to initiate a communication with an ATC must wait for the completion of an on-going transaction before initiating the communication. Typical directives to an aircraft might involve changes in heading, altitude and airspeed. The crew member handling the communication will acknowledge the communication and monitor the execution of the directive by the other crew member. Policy dictates cross checking—each crew member’s expectations of exactly what the other crew member will do must be confirmed or the exception addressed. The domain is a fertile one in which to examine multi-tasking.

    Figure 2 provides an example of an aircrew conversation interrupted first by an ATC directive that the crew must attend to and then by an ATC directive for another aircraft, causing them to further delay the resumption of their conversation. Jim, the captain of flight DAL100, has just initiated a conversation with his first officer, Joe, when they are interrupted by a communication from the ATC. Jim acknowledges the ATC directive and Joe, having initiated the flight level change, resumes the in-person conversation, but it is immediately interrupted by another ATC communication, this time directed to Jane and Bill’s flight UAL10. Jim must again pause before once again picking up the interrupted communication with his first officer.
     


    Figure 2 Aircrew Conversation Timeline


     

    3.2 Modeling Contention Between Tasks and its Resolution

    As has been discussed, policy plays an important role in determining aircrew response to communicative acts: in-person communication is deferred in response to the onset of ATC radio communication; cross-checking dictates overlapping responsibilities with ATC communication managed by one crew member, while ATC directives are acted on by the other crew member; expectations must be satisfied and those that are not met must be called out to secure safe aircraft operation; initiation of a party-line communication must await the completion of ongoing transactions. In SOAR or EPIC, each of these "decision" events might be viewed as the appropriate subject of an executive process and implemented as a rule set. In these tick-based simulation environments, each decision might be revisited numerous times before it is resolved and the concurrent nature of the ongoing tasks might dictate that several separate rule sets be evaluated at each tick.

    The OMAR simulator is an event-based simulator to accommodate the particular and varied time steps at which each of several concurrent processes can be expected to operate. An aircrew member may initiate the action required by the change-altitude portion of an ATC directive (perhaps by setting the new altitude on the mode control panel (MCP)), while continuing to attend to subsequent speed and heading directives. These activities go on concurrently, each implemented as a task with an appropriate time frame. Established policy dictates that an in-person aircrew conversation be deferred at the onset of an ATC communication. In OMAR, rather than being the subject of a rule-based decision, established policy-driven behaviors are viewed as a cognitive form of automaticity (Logan, 1988). The priority of the aircrew "listen to the ATC" task is simply higher than that of the aircrew "in-person conversation" task. The onset of the "listen to the ATC" task interrupts the "in-person conversation" task based on its priority. In like manner, the aircrew "listen to other ATC transaction" task has higher priority than "initiate ATC communication." An aircrew member will wait for an on-going party-line transaction to complete before initiating a new transaction. Policy-based decisions are viewed, not as the product of a centralized executive process (for which there is little evidence), but rather as the outcome of contention among the particular subset of tasks competing to execute in response to events initiated externally or internally. Events, be they externally or internally initiated, impinge, not on short-term memory, but on activated long-term memories in the form of schemata with well-established policy-based priorities. In acting on the initial directive of an ATC communication while attending to subsequent directives there may be no contention, but when contention is present, as in initiating a party-line communication, policy-based priorities mediate action. Given that several dispersed functional components may contribute to each of the contending tasks, when the contention is resolved, the component functions must act in accordance with the resolution.
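
    The pair-wise, policy-based mediation described above can be illustrated with a minimal sketch. It is written in Python rather than in OMAR's own languages, and the task names, the numeric priority values, the set of contending pairs and the function resolve are all assumptions made for illustration; only the ordering of the priorities is taken from the text.

```python
# Hypothetical sketch: names and numeric values are illustrative, not OMAR's.

# Policy-based priorities for the aircrew tasks named in the text.
# Only the ordering matters; the numbers themselves are assumptions.
PRIORITY = {
    "listen-to-atc": 3,
    "listen-to-other-atc-transaction": 2,
    "initiate-atc-communication": 1,
    "in-person-conversation": 1,
}

# Pairs of tasks assumed to contend with one another. Tasks that are not
# listed as a pair may run concurrently (for example, setting the altitude
# on the MCP while listening to the rest of a directive).
CONTENDS = {
    frozenset({"listen-to-atc", "in-person-conversation"}),
    frozenset({"listen-to-other-atc-transaction", "in-person-conversation"}),
    frozenset({"listen-to-other-atc-transaction", "initiate-atc-communication"}),
}


def resolve(new_task, running_task):
    """Decide the outcome for one pair of tasks, with no global arbiter."""
    if frozenset({new_task, running_task}) not in CONTENDS:
        return "run-concurrently"
    if PRIORITY[new_task] > PRIORITY[running_task]:
        return "interrupt-running"
    return "wait"
```

    Under these assumptions, resolve("listen-to-atc", "in-person-conversation") yields "interrupt-running", while resolve("initiate-atc-communication", "listen-to-other-atc-transaction") yields "wait". Each outcome depends only on the two tasks involved, which is the sense in which no executive is consulted.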

    3.3 Modeling Three Functional Areas in Listening

    To elaborate on the OMAR implementation, it is necessary to examine how tasks are constructed from goals, and their plans and procedures (see Figure 1), and how competition between tasks is mediated. The aircrew members each have distinct goals to manage in-person (handle-voice-communication) and radio communication (handle-atc-communication or manage-atc-communication depending on whether the crew member is responsible for or simply monitoring ATC communication). Each of these goals is implemented as a plan made up of subgoals and procedures. The goals and subgoals express the proactive agenda of the agent for addressing anticipated contingencies, while the procedures express the actions to be taken to accomplish each goal. These particular goals are distinct to the extent that the protocols for conducting in-person and radio communication are distinct. The communication goals are activated with the procedures listen-for-voice-message and listen-for-radio-message in a wait-state. The goals express the cognitive capability to conduct an in-person or radio conversation using the appropriate protocol for each communication medium. They are in a wait-state pending the onset of a communication in their particular medium. These goals, subgoals and procedures form the attended cognitive component of the "listening" complex of tasks. Additional goals and procedures stand ready to address the requirements of the content of the message, for example, setting the new target altitude using the MCP.

    As currently implemented, the listening tasks themselves have two additional components. The listening task complex is initiated by a verbal message. A separate procedure for processing the auditory input, activated through a different path, awaits the onset of an auditory message. Shortly after the auditory message onset, a speech understanding procedure is invoked to develop the propositional form that the attended cognitive task will operate with. In the simulation, the message is simply conveyed as an object and the auditory and speech understanding processes are just time-consuming stubs. The development of the listening task posits three distinct functional areas of processing. Separate goal and subgoal trees set up each of the functional capabilities. The onset of the auditory message initiates the processing with the activities of the three functional areas coordinated through a series of messages, or signals, as they are termed within OMAR. The functional areas and signals are a specific symbolic analogue of Edelman’s (1987) reentrant nets. The procedural bias in the modeling approach is taken a step further. Short-term memory (Martin, 1993), rather than being treated as a faculty in its own right, is modeled as a set of distinct capabilities distributed (Schneider & Detweiler, 1987) among a family of functional procedures. The motivation for this approach is once again derived from Edelman’s (1987) process view of memory and reinforced by his references to Bartlett (1932). Auditory memory of a verbal message is a component of the auditory process, while the propositional memory of a message is a component of the language understanding process. Their persistence, clearly different for each modality, is envisioned as, but not yet implemented as, a product of the persistence of their enclosing procedures.
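
    The three functional areas and the signals that coordinate them might be sketched as follows. This is a Python illustration rather than OMAR code: the class names, the signal names ("onset", "auditory-message", "proposition"), the function listen and the wording of the utterances are assumptions, and the auditory and speech understanding steps are stubs, as they are in the simulation described above. The point of the sketch is that each memory is a field of the procedure that produces it, not an entry in a shared store.

```python
# Hypothetical sketch of the three functional areas in listening.
# Class and signal names are assumptions made for illustration only.

class Auditory:
    """Processes the auditory input; holds the auditory memory of the message."""

    def __init__(self):
        self.memory = None  # memory lives inside the procedure

    def on_onset(self, utterance, emit):
        self.memory = utterance  # stub: a fuller model would consume time here
        emit("auditory-message", utterance)


class SpeechUnderstanding:
    """Develops the propositional form; holds the propositional memory."""

    def __init__(self):
        self.memory = None

    def on_auditory_message(self, utterance, emit):
        # Stub parse: the call sign is the first segment of the utterance.
        call_sign, _, directive = utterance.partition(", ")
        self.memory = {"call-sign": call_sign, "directive": directive}
        emit("proposition", self.memory)


class AttendedCognition:
    """The attended cognitive task that operates on the propositional form."""

    def __init__(self, own_call_sign):
        self.own_call_sign = own_call_sign
        self.actions = []

    def on_proposition(self, proposition, emit):
        if proposition["call-sign"] == self.own_call_sign:
            self.actions.append(proposition["directive"])


def listen(utterance, own_call_sign):
    """Run one verbal message through the three areas, coordinated by signals."""
    auditory = Auditory()
    speech = SpeechUnderstanding()
    cognition = AttendedCognition(own_call_sign)
    handlers = {
        "onset": auditory.on_onset,
        "auditory-message": speech.on_auditory_message,
        "proposition": cognition.on_proposition,
    }
    pending = [("onset", utterance)]

    def emit(kind, data):
        pending.append((kind, data))

    while pending:
        kind, data = pending.pop(0)
        handlers[kind](data, emit)
    return auditory, speech, cognition
```

    For a message addressed to the crew's own aircraft, the directive reaches the attended cognitive component; for a message addressed to another aircraft, the auditory and propositional memories are still formed but no action is recorded.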

    Given a task, postulated to be the product of contributions from several dispersed functional capabilities, the event of the interruption of that task can be explored in many ways. Exactly what occurs in the auditory and language understanding components of the "listening" task is not precisely established; indeed, it is not established that this is even a correct allocation of functional components. The model makes explicit a proposal for how the component functionalities might be coordinated during task execution and, more interestingly, when a task is interrupted.
     
     

    4. OMAR Support for Multi-task Human Performance Modeling

    OMAR’s strengths as a human performance modeling environment lie in its representation languages and their graphical editors and browsers, its simulator and its post-run analysis tools. The principal representation languages are the Simple Frame Language (SFL) and the Simulation Core (SCORE) goal and procedural language. SFL is a direct descendant of the KL-ONE (Brachman & Schmolze, 1985) family of frame languages, while SCORE is a descendant of Actors (Agha, 1986). A rule-based language provides the capability to develop rule packets as models of decision making. This section focuses on those aspects of the languages that support the development of models of human multiple-task behaviors.

    4.1 Concurrent Task Execution and Mediating Task Contention

    Language constructs in SCORE provide the basic capability to express goals and their plans that are made up of subgoals and procedures. The concurrency essential to the multiple-task capability in the models is provided by race and join forms in the language. A race form completes when the first of its enclosed forms completes. A join form completes when all of its enclosed forms complete. A spawn form is available to initiate an independent execution thread.
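
    The completion semantics of the race, join and spawn forms can be approximated in Python with asyncio. This is an analogy, not SCORE syntax: the function names race, join, spawn, step and demo are hypothetical, the step names and delays are invented, and the cancellation of the forms that lose a race is an assumption that the text does not confirm.

```python
import asyncio


async def race(*coros):
    """Complete when the first enclosed form completes.

    Cancelling the remaining forms is an assumption of this sketch.
    """
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return [t.result() for t in done]


async def join(*coros):
    """Complete when all enclosed forms complete."""
    return await asyncio.gather(*coros)


def spawn(coro):
    """Initiate an independent execution thread and return at once."""
    return asyncio.ensure_future(coro)


async def step(name, seconds, log):
    """A stand-in procedure that takes some time and records its completion."""
    await asyncio.sleep(seconds)
    log.append(name)
    return name


async def demo():
    log = []
    background = spawn(step("monitor-radio", 0.1, log))
    first = await race(step("hear-reply", 0.01, log), step("timeout", 0.05, log))
    both = await join(step("set-altitude", 0.01, log), step("set-heading", 0.02, log))
    await background
    return first, both, log
```

    Calling asyncio.run(demo()) runs the spawned procedure alongside the race and the join; the race returns as soon as its faster form finishes, and the join returns only after both of its forms have finished.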

    The contention between procedures is a more complex issue. At least three levels of contention can be envisioned. Thoughtful, attended deliberation can lead to the selection of one course of action over another. That deliberation process, which should probably be modeled explicitly, is not addressed here. The concern in the current effort has been with the simpler cases of policy-driven decisions as described in the aircrew scenarios above and the still simpler contention based on access to particular, identifiable resources. The contention between tasks can occur high in the goal tree, as in the contention between "listen to ATC" and "in-person conversation," or near the leaves of the tree, as in contention between tasks for access to the dominant hand for a skilled manual operation. All SCORE procedures are SFL concepts and may be classified as tasks that contend with particular other tasks (as in the case of "listen to ATC" and "in-person conversation") or with other instances of their own class (as in the case of the dominant hand requirement). A new task about to run must establish either that it does not contend with a running task or, if it does, that it has sufficient priority to block the execution of the running task. If a new task has sufficient priority, it begins execution and execution of the contending task is halted until execution of the new task has completed. At this point, barring intervening events affecting these tasks, the original task resumes execution. If the priority of the new task is not sufficient to block the running task, it must wait for the running task to terminate. Task priorities are computed dynamically and task contention is revisited as priorities change.
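
    The contention rules in this paragraph can be summarized in a short sketch. It is a Python illustration under stated assumptions, not the SCORE implementation: the classes Task and ContentionManager, the state names, and the rule that a new task must strictly exceed the priority of every running rival are all hypothetical choices. The manager only applies the pair-wise test; it holds no knowledge of what the tasks are for.

```python
class Task:
    """A task with a class, a priority and the classes it contends with."""

    def __init__(self, name, task_class, priority, contends_with=()):
        self.name = name
        self.task_class = task_class
        self.priority = priority
        # A class may list itself, as for tasks needing the dominant hand.
        self.contends_with = set(contends_with)
        self.state = "new"


class ContentionManager:
    """Applies the pair-wise contention test as tasks start and finish."""

    def __init__(self):
        self.tasks = []

    def _contends(self, a, b):
        return b.task_class in a.contends_with or a.task_class in b.contends_with

    def _try_run(self, task):
        rivals = [t for t in self.tasks
                  if t is not task and t.state == "running"
                  and self._contends(task, t)]
        # Assumption: the task must outrank every running rival to block it.
        if all(task.priority > r.priority for r in rivals):
            for r in rivals:
                r.state = "suspended"
            task.state = "running"
            return True
        return False

    def start(self, new):
        self.tasks.append(new)
        if not self._try_run(new):
            new.state = "waiting"

    def finish(self, task):
        task.state = "done"
        self.tasks.remove(task)
        self._revisit()

    def set_priority(self, task, priority):
        task.priority = priority
        self._revisit()

    def _revisit(self):
        # Re-examine halted and waiting tasks, highest priority first.
        for t in sorted(self.tasks, key=lambda t: -t.priority):
            if t.state in ("suspended", "waiting"):
                self._try_run(t)
```

    In this sketch, starting a "listen to ATC" task while an "in-person conversation" task is running suspends the conversation, and finishing the listening task lets the conversation resume. Two tasks of a class that contends with itself run one at a time, and a change of priority causes the contention to be revisited.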

    4.2 Pattern Matching in Coordinated Functional Component Execution

    As we have seen in the aircrew scenario, goals are typically employed to set up a network of procedures, each in a wait-state sensitive to particular externally or internally generated events. The events take the form of signals in SCORE. The signals are implemented as lists, with the first element of the list defining the signal type and additional elements of the list providing the data required for the signal type. The SCORE form, signal-event, takes a list as an argument and generates the signal. A procedure may enqueue on a signal by using the with-signal form. Execution of the procedure invoking the with-signal form enters a wait-state pending the occurrence of a signal of the designated type. The with-signal form may include a test on any of the elements of the signal that must be satisfied before the signal is accepted for processing. Once a signal is accepted, processing of the enclosing procedure continues. If the signal type is of further interest, the with-signal form must be employed again.

    Signal-based coordination of executing procedures bears a strong resemblance to a data flow architecture (Arvind & Culler, 1983), while differing significantly from object-oriented message passing. A procedure issuing a signal continues operation and does not receive any returned values. The issuing procedure has no knowledge of the other procedures that have enqueued on the signal. There may be any number of procedures enqueued on it or none at all. A given procedure may enqueue on a signal once or in each of two or more parallel threads employed to explore different patterns of events that each include this particular event.
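
    The behavior of the signal-event and with-signal forms described above can be sketched in Python. The class SignalBus and its method names are hypothetical stand-ins for the SCORE forms, the continuation callback stands in for the rest of the enclosing procedure, and leaving a procedure enqueued when its test fails is an assumption of the sketch.

```python
class SignalBus:
    """A minimal stand-in for signal generation and one-shot waiting."""

    def __init__(self):
        self._waiting = {}  # signal type -> list of (test, continuation)

    def with_signal(self, signal_type, continuation, test=None):
        """Enqueue once on a signal type, optionally guarded by a test."""
        self._waiting.setdefault(signal_type, []).append((test, continuation))

    def signal_event(self, signal):
        """Generate a signal: a list whose first element is the signal type."""
        signal_type = signal[0]
        still_waiting, accepted = [], []
        for test, continuation in self._waiting.get(signal_type, []):
            if test is None or test(signal):
                accepted.append(continuation)
            else:
                # Assumption: a failed test leaves the procedure enqueued.
                still_waiting.append((test, continuation))
        self._waiting[signal_type] = still_waiting
        for continuation in accepted:
            continuation(signal)
        # Nothing is returned: the issuer learns nothing about who listened.
```

    A procedure waiting for a radio message addressed to DAL100 would enqueue with a test on the second element of the signal. A message for UAL10 leaves it waiting, a message for DAL100 is accepted exactly once, and a signal on which no procedure is enqueued is simply dropped. To hear a further message, the continuation must call with_signal again.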

    Signal processing forms the core of the implementation of the functional capabilities that make up the multiple-task model of human performance. Signals are the representation of external events that trigger the model’s human receptors (eyes and ears in the current implementation) and they are the basis for the subsequent internal cascade of events produced in developing the coordinated multiple level response to those external events. The network of activation of with-signal forms changes rapidly over time to reflect the occurrence of external events and the many procedures representing the functional capabilities that combine to form tasks governing the response to those events. The changing network of active procedures, each sensitive to particular external or internal events, forms a pattern matcher that determines the behaviors of the model. The proactive component of the behaviors is provided by the goals and subgoals that govern the initial activation of the procedures. Each of the behaviors in the performance of a task is the result of a mix of the proactive and reactive components of the task. The signal-driven activation of network nodes representing functional capabilities provides an emulation of Edelman’s reentrant maps.
     
     

    5. Future Work

    Simulation studies in the commercial air traffic control domain employing professional aircrews and air traffic controllers have been conducted on a regular basis over the years. Access to the data from these studies would provide the basis for an assessment and further refinement of the modeling described here. In refining the human performance model, two areas are of particular interest. The first is to further explore the link between process and memory, and in particular, to model the impact of references to a memory instance from multiple procedures and to model the persistence of a memory item as a residual of procedure execution. The second area of interest, one where homuncular concepts can easily intrude, is to explore the basis for the speed-up in performance as workload increases.

    Acknowledgments

    The author wishes to thank Michael Young of the USAF Research Laboratories for his continued support for this research effort. The research reported on was conducted under USAF Armstrong Laboratory Contract No. F33615-91-D-0009 and F41624-9T-D-5002.

    References

Agha, G. A. (1986). Actors: A model of concurrent computation in distributed systems. Cambridge, MA: MIT Press.

Arvind & Culler, D. E. (1983). Why dataflow architectures. Computational Structures Group Memo 229-1, Laboratory for Computer Science, Massachusetts Institute of Technology.

Bartlett, F. C. (1932). Remembering: A study in experimental and social psychology. Cambridge, U. K.: Cambridge University Press.

Brachman, R. J., & Schmolze, J. G. (1985). An overview of the KL-ONE knowledge representation system. Cognitive Science, 9, 171-216.

Dennett, D. C. (1991). Consciousness explained. Boston, MA: Little, Brown and Company.

Edelman, G. M. (1987). Neural Darwinism: The theory of neuronal group selection. New York: Basic Books.

Edelman, G. M. (1989). The remembered present: A biological theory of consciousness. New York: Basic Books.

Laird, J. E., Newell, A., & Rosenbloom, P. S. (1987). SOAR: An architecture for general intelligence. Artificial Intelligence, 33, 1-64.

Laird, J. E., Rosenbloom, P. S., & Newell, A. (1986). Chunking in SOAR: The anatomy of a general learning mechanism. Machine Learning, 1, 11-46.

Logan, G. D. (1988). Automaticity, resources, and memory: Theoretical controversies and practical implications. Human Factors, 30, 583-598.

Martin, R. C. (1993). Short-term memory and sentence processing: Evidence from neuropsychology. Memory & Cognition, 21, 176-183.

Meyer, D. E., & Kieras, D. E. (1997). A computational theory of executive cognitive processes and multiple-task performance: Part 1. Basic mechanisms. Psychological Review, 104, 3-65.

Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.

Posner, M. I. (1993). Seeing the mind. Science, 262, 673-674.

Posner, M. I., Petersen, S. E., Fox, P. T., & Raichle, M. E. (1988). Localization of cognitive operations in the human brain. Science, 240, 1627-1631.

Raichle, M. E. (1994). Visualizing the mind. Scientific American, 270, 58-64.

Schneider, W., & Detweiler, M. (1987). A connectionist/control architecture for working memory. In G. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 21, pp. 54-119). New York: Academic Press.
 
 

