The Commingled Division of Visual Attention

Yuechuan Sun; Sijing Wu; Ian Spence

doi:10.1371/journal.pone.0130611

Abstract

Many critical activities require visual attention to be distributed simultaneously among distinct tasks where the attended foci are not spatially separated. In our two experiments, participants performed a large number of trials where both a primary task (enumeration of spots) and a secondary task (reporting the presence/absence or identity of a distinctive shape) required the division of visual attention. The spots and the shape were commingled spatially and the shape appeared unpredictably on a relatively small fraction of the trials. The secondary task stimulus (the shape) was reported in inverse proportion to the attentional load imposed by the primary task (enumeration of spots). When the shape did appear, performance on the primary task (enumeration) suffered relative to when the shape was absent; both speed and accuracy were compromised. When the secondary task required identification in addition to detection, reaction times increased by about 200 percent. These results are broadly compatible with biased competition models of perceptual processing. An important area of application, where the commingled division of visual attention is required, is the augmented reality head-up display (AR-HUD). This innovation has the potential to make operating vehicles safer but our data suggest that there are significant concerns regarding driver distraction.

Citation: Sun Y, Wu S, Spence I (2015) The Commingled Division of Visual Attention. PLoS ONE 10(6): e0130611. https://doi.org/10.1371/journal.pone.0130611

Academic Editor: Suliann Ben Hamed, Centre de Neuroscience Cognitive, FRANCE

Received: January 7, 2015; Accepted: May 21, 2015; Published: June 15, 2015

Copyright: © 2015 Sun et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data underlying experiments 1 and 2 are held at ResearchGate and are publicly available at the following DOIs: https://doi.org/10.13140/RG.2.1.2251.9846 and https://doi.org/10.13140/RG.2.1.1727.6969

Funding: Natural Sciences and Engineering Research Council of Canada Discovery Grant #8351, http://www.nserc-crsng.gc.ca/, IS. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Divided attention is the simultaneous allocation of attentional resources to two (or more) tasks. The division may cross modalities, as when speaking on a mobile phone while driving a vehicle [1–3]. Or, the division may occur within the same modality, with attention directed to different locations that are spatially separated [4–6]. Such partitioning of attention almost always exacts a penalty; none of the tasks is performed as well as when each is done on its own. Behavioral and neuroimaging data [7,8] show that accuracy and latency suffer when two stimulus sets must be attended simultaneously. This dual-task interference is well known and can have serious consequences in real-world situations. For example, as we discuss later, a driver concentrating on the road and other traffic while simultaneously attending to information presented on an augmented reality head-up display (AR-HUD) will likely experience reciprocal interference that could have a negative impact on safety [9].

Divided/Partitioned Visual Attention

Whether visual attention can be divided and simultaneously allocated to spatially separated locations has long been the subject of experiment, theory, and controversy [4–6,10–13]. Most investigations and theoretical perspectives have examined the case of a single task where the attentional resource must be distributed over two (or more) spatially separated locations. However, visual attention can be divided without involving non-contiguous locations and can involve distinctly different tasks that require attention. When this happens, the visual attentional processes—though clearly distinct—are spatially commingled rather than separated.

This commingled division of visual attention has been little studied. While some explorations of visual attention have used commingled stimuli, this aspect has not generally been their prime focus. For example, in studies of inattentional blindness (IB) [14], the stimuli are often commingled; however, the emphasis has not been squarely on how the attentional resource is distributed between the primary and secondary tasks but rather on the factors that promote the so-called “blindness”. Another instance involves attentional boost experiments [15,16]: in a typical paradigm, participants must memorize a series of briefly presented scenes while watching for a target in a random sequence composed of targets (e.g. white squares) and distractors (e.g. black squares) centered on each scene. Memory for scenes that accompany the targets is generally found to be better than for scenes accompanying the distractors. This attentional boost is assumed to result from a generalized processing enhancement associated with the appearance of the target. A further example of commingled sets of stimuli comes from studies of the role of attention in the subitizing and counting ranges during the enumeration of multiple objects. Vetter [17] differentiated to-be-enumerated sets of spots by using both filled and unfilled spots; participants were cued to enumerate one or the other set, or the whole set. Additionally, Vetter and colleagues [18] used a set of targets and distractors commingled with a central stimulus that was required for a primary task.

Studies of inattentional blindness, the attentional boost effect, and the role of attention during enumeration can provide some insight into how visual attention is allocated under dual-task conditions with sets of stimuli that are not spatially separated. However, each of these areas of investigation had a different principal motivation from ours and none was specifically designed to address the more general question of how attention is divided and deployed during the execution of different visual tasks with distinct stimuli that are spatially commingled. There may be other experimental paradigms that have required the commingled division of visual attention but there seems to have been no systematic study of the general phenomenon.

Existing theories of attention that assume limited attentional and perceptual resources may be compatible with situations where visual attention must be allocated to distinct tasks where the stimuli are spatially commingled. According to load theory [19–21], perceptual processing is limited in capacity but operates in an involuntary manner on all the information within that capacity. A demanding primary task may exhaust available capacity, leaving little or no ability to handle sensory inputs unrelated to the task. However, when the load imposed by the attended task is low, surplus processing resources will be involuntarily allocated to processing other sensory inputs. Thus, load theory would predict that focusing attention on a demanding primary task will impair perception of a spatially commingled but unrelated secondary stimulus. Indeed, with a sufficiently high perceptual load in the primary task, processing the primary stimulus will largely exhaust available capacity, even though the secondary stimulus may share the attended location. Thus, processing of the secondary stimulus will be severely impaired. With a lower primary load, any residual processing capacity will automatically be available for processing secondary stimuli.

An alternative to a load model is a biased competition model [22,23]. This class of models differs from load models principally in how the tasks are prioritized. The allocation of attention is determined based on the results of a competition between the tasks. This task rivalry is modulated by both top-down and bottom-up influences and the primary task does not necessarily receive exclusive priority, as it does in a load model. Thus, the nature and magnitude of the reciprocal accuracy and latency effects between two distinct tasks can help decide whether a load model or biased competition model provides a better account of the division of attention with spatially commingled stimuli. Load models that assume sequential processing steps (allocation of attention to the primary task before involuntary processing of secondary stimuli—subject to remaining capacity constraints), predict that the secondary task will be poorly performed under high primary task loads. However, the primary task will be relatively unaffected by the secondary task, since it has first claim on attentional and perceptual resources. Degradation of primary task performance is expected only under very heavy loads. In short, load models predict little or no effect of the secondary task on primary task performance.

A somewhat different pattern is predicted by a biased competition model [22,23], in which sensory inputs compete for processing resources and cortical representation. The competition will be most intense among stimuli that are spatially commingled and hence have proximate cortical representations. The competition for processing and representation will be subject to a variety of top-down and bottom-up influences; thus, expectation and stimulus features will play major roles. For example, in our experiments described below, instructions to the participants will induce a bias favoring the primary task of enumeration. This top-down influence will likely be strengthened by participants’ incremental learning of secondary stimulus characteristics as the experiment proceeds. Similarly, primary and secondary stimulus features like size, shape, color, texture, spatial location, and numerosity will provide bottom-up cues to bias the competition for processing. Importantly, in contrast to a load model, there is no necessary prioritization of the primary task, even if this has been emphasized in the experimental instructions or boosted by a learned response bias, such as probability matching over trials. Thus, if the secondary task is found to impair performance (speed and accuracy) of the primary task, this will tend to favor a biased competition model.

In applied contexts, a better understanding of the fundamental aspects of the commingled division of visual attention may help us to appreciate how new technologies may stretch the capacities of visual attention in ways that could compromise safety. One recent case in point is the augmented reality head-up display (AR-HUD) where computer-generated images are superimposed on the external visual world by projection onto the windshield. Drivers have access to useful information without having to take their eyes off the road. When the driver attends to intermittent graphic warnings on an AR-HUD, while concurrently attending to potential threats beyond the windshield, the attended spatial locations are commingled. Fig 1 shows a hypothetical scenario where an AR-HUD warns of a potential threat and suggests a change in direction to the driver. However, a variety of technical, perceptual, and cognitive challenges could compromise safety and utility [9]. Laboratory investigation of the visual attentional processes involved can provide a valuable perspective. Results from controlled lab trials can establish whether relevant fundamental capacities are likely to be compromised by this variety of divided visual attention.

Fig 1. Illustration of an AR-HUD.

Using speed, distance, and location data obtained from remote sensing (cameras, sonar, lidar, GPS, etc.) and map databases, potential threats are evaluated. The slower moving vehicle ahead poses a collision hazard if the red danger zone is entered. Since sensor data confirm no overtaking traffic to the rear, a change in direction is suggested on the AR-HUD. Such augmented reality warnings and suggestions will normally appear infrequently, and often unpredictably, while the driver attends continuously to other tasks that require visual attention (monitoring traffic and road hazards, attending to signage, etc.).

https://doi.org/10.1371/journal.pone.0130611.g001

In two experiments, we had observers repeatedly perform a primary task requiring the deployment of attention over a fairly wide field of view, similar in size to that encountered while driving. At unpredictable intervals, the observer had to perform an additional task that required the detection or identification of a distinctly different stimulus that was spatially commingled with the stimuli of the primary task. The secondary stimulus was initially unexpected, and thus, the first set of trials (only) mirrored the classic inattentional blindness (IB) paradigm. Thereafter, during hundreds of subsequent trials, the secondary stimulus was not completely unexpected, but the probability of its appearance was relatively low.

The Classic IB Paradigm

In the classic IB paradigm [14,24–26], on each of a small number of initial trials, the observer performs a primary task requiring attention (e.g., estimating the number of spots in a display). On a single subsequent inattention trial, an additional stimulus (e.g. a square) unexpectedly appears. Observers often fail to report the novelty even though it is in plain view. The phenomenon was probably first documented by Jevons [27] and has since been studied by several authors [28–30]. But it was Mack and Rock who coined the memorable term inattentional blindness (IB) to describe this failure [14]. However, their charming label is somewhat misleading since not all observers fail to notice the unexpected stimulus.

The Iterated Paradigm

The classic IB design cannot shed light on the interplay between expectation and the attentional loads imposed by the primary and secondary tasks. Hence, our experiments extended the basic paradigm by repeating—at unpredictable intervals—the appearance of the secondary stimulus, over the course of hundreds of trials, while simultaneously varying the attentional load imposed by the primary task. Thus, our interest was not in the initial classic IB trial, but rather in its repetition when the observer was aware that an additional stimulus might appear during performance of the primary task. The repetition allows assessment of performance under varying levels of load on the visual attentional system and allows the reciprocal effects of the primary and secondary tasks to be quantified. The iterated primary task may be considered to be an experimental analog of the normal allocation of visual attention to road and traffic while driving. The secondary task, where a target shape appears with relatively low probability, is analogous to the appearance of a visual warning in an augmented reality display.

Our first experiment examined the observer’s ability to detect the presence of a partially expected secondary stimulus while simultaneously performing a primary task that required varying amounts of attention. Our second experiment required identification of the secondary stimulus, thus ensuring that the stimulus had been clearly seen and was available for further perceptual/cognitive processing.

Human Participant Research

This research involving human participants was approved by the Office of Research Ethics, Office of the Vice-President, Research and Innovation, University of Toronto (Protocol Reference # 22139) and was conducted according to the principles expressed in the Declaration of Helsinki. Informed consent, both written and oral, was obtained from each participant for both experiments in the study.

Experiment 1

The primary task was to estimate how many black spots (1, 2, 3, 4, 5, 7, or 8) had been briefly presented on a computer display. Enumeration of sets of objects has long been known to require the allocation of visual spatial attention [27,31], and even with fewer than four objects—the so-called subitizing range—attention is required [17,18,32]. The secondary task required detection of a single outline square that was spatially commingled with the spots. This distinct shape was unexpected on the first trial but was at least partially expected thereafter.

Materials and Methods

Participants.

Undergraduate students (23 males, 22 females, mean age 19.0 years) at the University of Toronto participated for course credit. They were recruited from an introductory psychology class and were naïve as to the real purpose of the experiment. In order to encourage them to concentrate on the primary task, participants were told that the three best performers in reporting the number of spots (accuracy and latency, weighted equally) would win $50, $30, and $20 respectively. Informed written consent was obtained from all participants.

Stimuli.

Stimuli were displayed on a 21-inch Viewsonic Professional Series CRT monitor using E-Prime 2.0 [33] on a gray rectangular background (RGB: 124,124,124; approximately 36° x 27°, width x height). The stimuli for the primary (enumeration) task were filled black circular spots (RGB fill: 0,0,0; each approximately 1° in diameter; 1, 2, 3, 4, 5, 7, or 8 in number; note that 6 spots never appeared) and the unexpected object was a black-outlined shape (a square subtending approximately 2°; RGB outline: 0,0,0; RGB fill: 124,124,124). The unexpected shape was fully visible and clearly distinct from the spots in size, in shape, and in fill. The positions of the spots were distributed randomly within the 36° x 27° rectangular area and no pair was permitted to be closer than 2°. The shape also appeared at a random location within the 36° x 27° rectangular area, no closer than 2° to any of the spots.

Viewing distance was controlled by the use of a combined chin and forehead rest (University of Houston, College of Optometry, HeadSpot). The center of the screen was positioned in the mid-sagittal plane at eye height at a distance of 30 cm. Participants responded using the numeric keypad on a computer keyboard.

Procedure.

Participants were asked to report the number of spots (primary task: enumeration) on each trial and to respond as quickly and accurately as possible when prompted. Each trial began with a centrally positioned cross (1° x 1°; 500 ms) to establish fixation, followed by a display (36° x 27°; 100 ms) containing a number of spots (1, 2, 3, 4, 5, 7, or 8), randomly chosen with equal probability; in addition, a secondary stimulus appeared on some trials. The exposure time (100 ms) was short enough to preclude an eye movement. The secondary stimulus was an unfilled black-outlined square. The display was then masked (500 ms) using a randomly pixelated screen (Fig 2).

Fig 2. Sample trial sequence with two tasks.

Enumeration (primary task) and reporting a shape appearing in addition to the spots (secondary task). Illustrative responses are shown by shading the chosen key on the numeric keypad. In addition to 7 practice and 7 initial trials, there was one classic IB trial (with an unexpected shape); 486 shape-absent/shape-present trials, distributed at random; and 9 full-attention trials. The number of spots (1, 2, 3, 4, 5, 7, or 8) varied randomly with equal probability.

https://doi.org/10.1371/journal.pone.0130611.g002

Classic IB Trial.

Participants completed seven practice enumeration trials to familiarize themselves with the response keys and the experimental procedures. The practice trials preceded and were identical to seven initial trials, which followed immediately. After each initial trial, participants saw the prompt: “How many spots?” On the 8th trial, in addition to the spots, the shape appeared. After completing the enumeration, the participants were prompted again: “Did you see anything else? Press ‘/’ for Yes and ‘z’ for No.” The number of spots appearing with the shape on trial 8 was 2, 5, or 8, distributed randomly over the 45 participants, such that each of three subsets of 15 participants experienced only one of the three primary attentional loads (2, 5, or 8 spots). Participants did not know that anything extra or unusual might appear on the eighth experimental trial (the inattention trial) and thus, the appearance of the shape was completely unexpected. Up to this point, the experiment followed the classic inattentional blindness paradigm [14].

Iterated Trials.

The next 486 trials (the iterated trials) were grouped in 3 blocks of 162 trials. Participants were allowed a short self-paced rest between blocks. On each trial: (1) the number of spots was chosen at random with equal probability from the set {1, 2, 3, 4, 5, 7, 8}; (2) the shape could appear with probability .075; and (3) as in the classic IB trial, participants responded to an additional prompt (“Did you see anything else? Press ‘/’ for Yes and ‘z’ for No”) after each enumeration. The number of shape-absent displays intervening between shape-present displays could be quite large since the distribution of the number of trials between random equiprobable appearances of the shape is geometric with expectation μ = 12.3 and standard deviation σ = 12.8 [34]. While the shape could conceivably appear on successive trials, a gap of 20 or more trials before its reappearance was also not highly unusual.
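The gap statistics quoted above follow directly from the geometric distribution. The following is a minimal sketch, assuming that a gap is counted as the number of shape-absent trials between successive shape-present trials and that each trial is an independent draw with p = .075. The function names and the `make_schedule` helper are illustrative only and are not taken from the authors' E-Prime script.

```python
import math
import random

P_SHAPE = 0.075                       # per-trial probability that the shape appears
SPOT_COUNTS = [1, 2, 3, 4, 5, 7, 8]   # 6 spots never appeared


def gap_mean(p):
    """Expected number of shape-absent trials between shape-present trials."""
    return (1 - p) / p


def gap_sd(p):
    """Standard deviation of the same geometric gap distribution."""
    return math.sqrt(1 - p) / p


def prob_gap_at_least(k, p):
    """Probability of at least k consecutive shape-absent trials."""
    return (1 - p) ** k


def make_schedule(n_trials, p=P_SHAPE, seed=None):
    """Hypothetical trial list: (number of spots, shape present?) per trial."""
    rng = random.Random(seed)
    return [(rng.choice(SPOT_COUNTS), rng.random() < p) for _ in range(n_trials)]


print(round(gap_mean(P_SHAPE), 1))               # 12.3
print(round(gap_sd(P_SHAPE), 1))                 # 12.8
print(round(prob_gap_at_least(20, P_SHAPE), 2))  # 0.21
```

Under these assumptions, roughly one gap in five would span 20 or more trials, which is consistent with the remark that such gaps were not highly unusual.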

Full-attention Trials.

On the next nine trials, participants were instructed not to report the number of spots but only to say whether the shape, which appeared on all of these trials, was present. The number of spots was 2, 5, or 8, chosen at random. These trials—the full-attention trials—are standard in classic IB experiments and verify that the shape was fully visible when attention was not required for the primary task of enumeration.

Results

Classic IB Trial.

The classic IB paradigm is based on a single trial only and is not of major interest here. The proportions of correct detection and a chi-squared test for equivalence of proportions are included mainly for comparison with the existing IB literature. Reporting rates for the three primary task loads in the naïve condition were significantly different, χ²(2) = 6.14, p<.05: with 2 spots to enumerate, 7 of 15 participants (47%) reported the shape; with 5 spots, 4 of 15 participants (27%) were successful; and with 8 spots, only 1 of the remaining 15 participants (7%) reported the shape.
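The chi-squared statistic can be reproduced from the counts given above. This is a minimal sketch, assuming a standard Pearson test of homogeneity on the 3 x 2 table of reported versus not-reported counts; the function name is illustrative, and the closed-form p-value used here holds only for 2 degrees of freedom.

```python
import math

reported = [7, 4, 1]      # participants reporting the shape with 2, 5, and 8 spots
group_n = [15, 15, 15]    # participants per load group


def chi_squared_homogeneity(hits, totals):
    """Pearson chi-squared statistic for equality of proportions across groups."""
    pooled = sum(hits) / sum(totals)          # pooled reporting rate
    stat = 0.0
    for h, n in zip(hits, totals):
        expected_hit = n * pooled
        expected_miss = n * (1 - pooled)
        stat += (h - expected_hit) ** 2 / expected_hit
        stat += ((n - h) - expected_miss) ** 2 / expected_miss
    return stat


stat = chi_squared_homogeneity(reported, group_n)
p_value = math.exp(-stat / 2)   # chi-squared upper tail, valid for df = 2 only

print(round(stat, 2))      # 6.14
print(round(p_value, 3))   # about 0.047
```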

Iterated Trials.

For the iterated trials, we computed the proportion of correct responses in each experimental treatment combination. These proportions were transformed using a variance-stabilizing arcsine square-root transformation before conventional repeated measures analysis of variance. Although the statistical tests were calculated on the transformed data, the reported means and graphs use the original proportions for simplicity of interpretation.
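As an illustration of the transformation step, the sketch below applies the common form y = arcsin(√p). The text does not say which variant was used (some analyses use 2·arcsin(√p)), so this form is an assumption, and the function names are illustrative.

```python
import math


def arcsine_sqrt(p):
    """Variance-stabilizing transform of a proportion p in [0, 1] (radians)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return math.asin(math.sqrt(p))


def inverse_arcsine_sqrt(y):
    """Map a transformed value back to the proportion scale."""
    return math.sin(y) ** 2


print(round(arcsine_sqrt(0.5), 4))                        # 0.7854 (pi / 4)
print(round(inverse_arcsine_sqrt(arcsine_sqrt(0.9)), 4))  # 0.9
```

The transform stretches proportions near 0 and 1, where binomial variance shrinks, so that the variance of the transformed values depends much less on the underlying proportion.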

Primary Task (Enumeration).

Accuracy declined as the number of spots increased (Fig 3) on both shape-present and shape-absent trials, F(6, 264) = 130.70, p<.0001. The anticipated drop off in accuracy (Fig 3) and increase in latencies (Fig 4) beyond the subitizing range of 1–3 spots were observed. On average, both accuracy and latency were better on shape-absent trials than on shape-present trials: 76.4% vs. 72.3%, F(1,44) = 329.97, p<.0001 (Fig 3); 865 ms vs. 1293 ms, F(1,44) = 196.67, p<.0001 (Fig 4). There were small but significant interactions between the type of trial and the number of spots: for accuracy, F(6,264) = 5.73, p<.0001 (linear x linear component: F(1,264) = 20.69, p<.0001; quadratic x quadratic component: F = 5.88, p<.05) and for latency, F(6,264) = 3.00, p<.01 (linear x linear component: F(1,264) = 6.73, p≈.01; quadratic x quadratic component: F = 4.68, p<.05). Figs 3 and 4 show that the accuracies and latencies diverge slightly with an increasing number of spots.
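The degrees of freedom in these F tests are consistent with a fully within-participant design. The sketch below assumes a two-factor repeated measures layout (7 spot counts x 2 trial types) in which all 45 participants contribute to every cell; the function and key names are illustrative.

```python
def rm_anova_df(n_participants, levels_a, levels_b):
    """(effect df, error df) for each term of a two-way repeated measures ANOVA."""
    subj = n_participants - 1
    a = levels_a - 1
    b = levels_b - 1
    return {
        "spots": (a, a * subj),
        "trial_type": (b, b * subj),
        "interaction": (a * b, a * b * subj),
    }


for term, df in rm_anova_df(45, 7, 2).items():
    print(term, df)
# spots (6, 264)
# trial_type (1, 44)
# interaction (6, 264)
```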

Fig 3. Experiment 1: Accuracies in the primary (enumeration) and secondary (shape-detection) tasks.

1. Open Squares: proportion correct negative responses (1 − false alarm rate) when the shape was absent; 2. Filled Squares: proportion correct positive responses (hit rate) when the shape was present; 3. Open Circles: proportion correct enumerations when the shape was absent; and 4. Filled Circles: proportion correct enumerations when the shape was present. The error bars show 1 s.e.m.; if no bars are visible, the 2 s.e.m. range centered on the mean is smaller than the height of the marker. Means associated with open symbols are each based on approximately 2761 data points. Means associated with filled symbols are each based on approximately 224 data points.

https://doi.org/10.1371/journal.pone.0130611.g003

Fig 4. Experiment 1: Latencies in the primary (enumeration) and secondary (shape-detection) tasks.

1. Open Squares: mean reaction times when the shape was absent; 2. Filled Squares: mean reaction times when the shape was present; 3. Open Circles: mean reaction times for enumeration when the shape was absent; and 4. Filled Circles: mean reaction times for enumeration when the shape was present. The error bars show 1 s.e.m.; if no bars are visible, the 2 s.e.m. range centered on the mean is smaller than the height of the marker. Means associated with open symbols are each based on approximately 2761 data points. Means associated with filled symbols are each based on approximately 224 data points.

https://doi.org/10.1371/journal.pone.0130611.g004

Secondary Task (Shape-absent vs. Shape-present Trials).

Participants responded correctly on 98% of the shape-absent trials compared to 93% of the shape-present trials; F(1,44) = 1067.07, p<.0001 (Fig 3). Mean RTs were 541 ms and 907 ms, respectively; F(1,44) = 101.90, p<.0001 (Fig 4). There was a small but just significant interaction between shape-present and shape-absent trials and the number of spots enumerated; F(6, 264) = 2.19, p<.05. Fig 4 reveals a small difference in slopes, with slightly quicker responses for an increasing number of spots when the shape is absent (linear x linear component: F(1,264) = 4.61, p<.05).

Full-attention Trials.

Reporting accuracy was 99.8%. The shape was clearly visible among the spots when participants were not required to enumerate them.

Discussion

On the classic IB trial, when attention was lightly loaded by the primary task, 47% of participants reported the unexpected shape; but when visual attention was more heavily occupied, this percentage dropped to 27% and eventually to only 7%. This replicates previous research on inattentional blindness: participants can detect simple, salient geometric objects with low to moderate probability when they are concurrently engaged in a sufficiently demanding attentional task and are naïve to the possibility of the appearance of an additional distinct visual stimulus [14,24–26,35–37]. Our major interest, however, was in the iterated trials where the participants were aware that a shape might appear in addition to the spots that they had to enumerate.

When the shape was absent, there were virtually no false alarms, independent of load on the primary task (Fig 3), suggesting that very little in the way of attentional resources was required to confirm that the secondary stimulus had not appeared. However, when the primary task imposed a light load (1–4 spots) and the shape appeared, it was missed on about 1 trial in 15 on average. With a heavier primary load (5–8 spots), the miss rate increased to around 1 in 10 trials. This suggests that if attention is increasingly occupied by the primary task, participants will have more difficulty in detecting the secondary task stimulus. Nonetheless, even with heavy primary task loads, the shape was detected with fairly high probability. The latency to respond (in the secondary task) was 907 ms when the shape was present and 541 ms when it was absent (see Fig 4). These results tend to favor a biased competition account of the division of attentional resources over a load model since it appears that the primary task was not sufficiently prioritized over the secondary task to severely impair processing of the latter. The increase in RT when the shape was present is likely a consequence of the competition for attention and representation between the two commingled stimuli. It simply takes longer to divide and allocate attention when both stimuli are competing for resources. However, the faster response on shape-absent trials may—in part—be the result of a bias to respond negatively, due to the preponderance of shape-absent trials.

With no shape present, the primary task accuracy decreased from just over 90% (1 spot) to about 40% (8 spots) with an elbow at 3 spots (Fig 3). This confirms that increasing the task load in enumeration has a negative effect on performance and is also consistent with the well-known difference between subitizing and counting. With the shape present, primary task accuracy was little affected with light primary task loads (1–4 spots), but for heavier loads (5–8 spots), the primary task accuracy was about 7% worse on average (see Fig 3 and the significant linear x linear component of the interaction). Costs are incurred with the deployment of visual attention to detect an intruding shape. When the primary task is easy, the secondary task is accommodated without discernible cost, but as the primary task becomes more demanding, both tasks suffer reciprocal interference. This suggests a competition for attentional resources rather than a prioritized allocation in which the primary task dominates.

In many applied situations—such as driving a vehicle—it would not be sufficient to merely detect the appearance of a relatively unexpected secondary stimulus. Identification is likely to be required. Attention must be deployed and maintained for the observer to be able to select the response from a set of (partially) expected options. For example, in an automotive augmented reality display, it would be important to distinguish images warning of (1) a forward collision from (2) a lane excursion or (3) an imminent turn advisory. In AR displays, visualizations of threats that are not clearly distinguishable are likely to be more dangerous than no warning at all. Thus, experiment 2 included the additional task of identification.

Experiment 2

As before, the primary task was enumeration of a number of black spots. The secondary task outline shape was a triangle, a square, or a diamond. Each of these three variants appeared an equal number of times during the iterated trials. The participants had to detect and identify the shape that had appeared, thus ensuring that it had been clearly seen.