The authors have declared that no competing interests exist.
Conceived and designed the experiments: JL JMG. Performed the experiments: JL. Analyzed the data: JL. Contributed reagents/materials/analysis tools: JL JMG. Wrote the paper: JL JMG.
Maps are a mainstay of visual, somatosensory, and motor coding in many species. However, auditory maps of space have not been reported in the primate brain. Instead, recent studies have suggested that sound location may be encoded via broadly responsive neurons whose firing rates vary roughly proportionately with sound azimuth. Within frontal space, maps and such rate codes involve different response patterns at the level of individual neurons. Maps consist of neurons exhibiting circumscribed receptive fields, whereas rate codes involve open-ended response patterns that peak in the periphery. This coding format discrepancy therefore poses a potential problem for brain regions responsible for representing both visual and auditory information. Here, we investigated the coding of auditory space in the primate superior colliculus (SC), a structure known to contain visual and oculomotor maps for guiding saccades. We report that, for visual stimuli, neurons showed circumscribed receptive fields consistent with a map, but for auditory stimuli, they had open-ended response patterns consistent with a rate or level-of-activity code for location. The discrepant response patterns were not segregated into different neural populations but occurred in the same neurons. We show that a read-out algorithm in which the site and level of SC activity both contribute to the computation of stimulus location is successful at evaluating the discrepant visual and auditory codes, and can account for subtle but systematic differences in the accuracy of auditory compared to visual saccades. This suggests that a given population of neurons can use different codes to support appropriate multimodal behavior.
The superior colliculus is an important model system for the integration of spatial information from different sensory modalities
The aligned-map hypothesis of multisensory integration presupposes that auditory space is indeed encoded via a map. Such a map should contain neurons whose receptive fields tile the entire expanse of space (
When sampling of space must be limited to the oculomotor or visual ranges, maps and rate codes for sound azimuth can be distinguished by evaluating whether neurons exhibit circumscribed receptive fields (A) or open-ended response functions (B). Rate coding neurons might show some degree of non-monotonicity if their underlying tuning functions were not all perfectly aligned with the interaural axis (dotted line).
However, other ways of encoding stimulus position are known to exist, particularly for auditory information in primates. In monkeys and humans, auditory-responsive neurons in areas upstream from the SC do not appear to have such bounded receptive fields distributed across the scene
In this study, we investigated the coding format of auditory responses of rostral SC neurons in detail, to ascertain whether they exhibit a closed-field organization, similar to visually-driven activity in this structure and consistent with the formation of a map of space, or whether they are open-field, like the response patterns of neurons in the auditory areas that serve as inputs to the SC, and therefore potentially consistent with a rate code for sound location.
We assessed the responses of SC neurons (n = 180) in monkeys making eye movements to visual and auditory targets. Target locations spanned a range of +/− 24° with respect to the head from three initial fixation positions (−12°, 0°, and 12°), giving a range of +/− 36° with respect to the eyes. Monkeys (n = 2) performed an overlap saccade task (
Spatial layout of the targets (A) shows that the fixation targets (black dots) were located 12° left, 0°, and 12° right, at elevations that varied with the spatial sensitivity of the neuron under study (range: −16° to 6°; mean ± SD: −4.2 ± 4.1°). Targets were either auditory (white noise burst) or visual (LED), presented from a stimulus array of 9 speakers each with an LED attached to its face. The speakers were spaced from 24° left to 24° right with 6° intervals at an elevation of 0° with respect to the animal’s head. B. Events of the overlap saccade task. The baseline period was 500 ms before target onset, the sensory period was 0−500 ms after target onset, and the motor period began 20 ms before saccade onset and ended 20 ms before saccade offset. C. The no-saccade task was similar except that the targets were near or beyond the oculomotor range, and the animal was not required to make an eye movement because the fixation light stayed on.
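The three analysis periods described in this caption can be expressed as spike-count windows. The following is a minimal sketch, assuming spike and event times are stored in milliseconds on a common clock; the function names and the half-open window convention are illustrative assumptions, not the authors' code.

```python
def rate_in_window(spike_times_ms, start_ms, end_ms):
    """Mean firing rate (spikes/s) in the half-open window [start_ms, end_ms)."""
    duration_s = (end_ms - start_ms) / 1000.0
    if duration_s <= 0:
        raise ValueError("window must have positive duration")
    n_spikes = sum(1 for t in spike_times_ms if start_ms <= t < end_ms)
    return n_spikes / duration_s


def period_rates(spike_times_ms, target_on_ms, saccade_on_ms, saccade_off_ms):
    """Firing rates for the baseline, sensory, and motor periods of one trial."""
    return {
        # 500 ms immediately before target onset
        "baseline": rate_in_window(spike_times_ms, target_on_ms - 500, target_on_ms),
        # 0-500 ms after target onset
        "sensory": rate_in_window(spike_times_ms, target_on_ms, target_on_ms + 500),
        # from 20 ms before saccade onset to 20 ms before saccade offset
        "motor": rate_in_window(spike_times_ms, saccade_on_ms - 20, saccade_off_ms - 20),
    }
```

Expressing each period as a rate rather than a raw count keeps the fixed-length sensory window comparable with the motor window, whose length depends on saccade duration.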
Overlap saccade task (N = 180) | Visual: N | Visual: % of total | Auditory: N | Auditory: % of total | Both: N | Both: % of total |
A) Sensory response (two-sample t-test, p<0.05) | 159 | 88.3 | 122 | 67.8 | 113 | 62.8 |
B) Motor response (two-sample t-test, p<0.05) | 162 | 90.0 | 159 | 88.3 | 151 | 83.9 |
C) A and B | 155 | 86.1 | 117 | 65.0 | 111 | 61.7 |
No-saccade task (N = 148) | | | | | | |
Sensory response (two-sample t-test, p<0.05) | 142 | 96.0 | 111 | 75.0 | 105 | 70.9 |
The time periods in relation to the events of the task are illustrated in
Neurons showed different spatial response properties depending on whether the target was visual or auditory. For visual stimuli, the classic circumscribed receptive field pattern was evident: responses were largest for a particular target eccentricity, but fell off substantially for targets located both more centrally and more peripherally. For example, the neuron shown in
(mean discharge rate +/− standard error with respect to the horizontal eye-centered target location or movement amplitude; S R2 and G R2 refer to the Sigmoidal and Gaussian R2 values). For three out of the four visual responses (upper panels), the Gaussian fits are significantly better than the sigmoidal fits (the sensory R2 values for A and B, and the motor R2 value for B; bootstrap analysis, p<0.05). In contrast, for the auditory responses (lower panels), the two functions fit about equally well (bootstrap analysis, p > 0.05).
In contrast, for auditory stimuli, responses typically showed an open-ended pattern. As target eccentricity increased, responses either continued to increase, reached a plateau, or showed only a modest dip in activity. The auditory motor responses of the neuron in
A difference is also evident in the “point image” of activity evoked on visual vs auditory trials.
For each neuron, we calculated the activity for a given target location, modality, or response period as a proportion of the peak firing rate observed for any target location, modality, or response period for that neuron. We then calculated the average of this normalized activity across the population of neurons as a function of target modality and target location. This graph plots the average normalized population activity on auditory trials as a percentage of that observed on visual trials. (Only locations in the contralateral hemisphere are shown because visual activity is very low or non-existent for ipsilateral targets, which would make even modest auditory activity appear very large in comparison.) A value of 100 (horizontal dotted line) indicates that the activity for visual and auditory stimuli at the corresponding target location was about equal. As target location becomes more eccentric, the level of activity evoked by auditory stimuli during the motor period approaches and then slightly exceeds that observed for visual stimuli (solid line). A similar increase in auditory activity relative to visual activity with target eccentricity is observed during the sensory period (dashed line), but at an overall lower level.
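The normalization described in this caption can be sketched as follows. This is a minimal illustration, assuming each neuron's mean firing rates are held in a dictionary keyed by (modality, period, location); that data layout and the function names are hypothetical.

```python
def normalize_neuron(rates):
    """Scale one neuron's rates by its peak across all locations, modalities, and periods."""
    peak = max(rates.values())
    if peak <= 0:
        raise ValueError("neuron has no positive firing rate to normalize by")
    return {key: value / peak for key, value in rates.items()}


def population_mean(neurons, modality, period, location):
    """Average normalized activity across neurons for one condition."""
    values = [normalize_neuron(n)[(modality, period, location)] for n in neurons]
    return sum(values) / len(values)


def auditory_percent_of_visual(neurons, period, location):
    """Population auditory activity as a percentage of visual activity (100 = equal)."""
    auditory = population_mean(neurons, "auditory", period, location)
    visual = population_mean(neurons, "visual", period, location)
    return 100.0 * auditory / visual
```

Dividing by the population visual activity is what makes ipsilateral locations unstable: where visual activity is near zero, the ratio grows very large, which is the reason given in the caption for plotting only contralateral locations.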
To quantitatively measure this difference across the population, we reasoned that a circumscribed receptive field should be fit
At the level of individual response patterns, the differences between visual and auditory spatial sensitivity are small, but at the population level they are consistent.
Results for the population of SC neurons (A), with the color and symbol type indicating whether the Gaussian curve fit was significantly superior to that of the sigmoid (bootstrap analysis, p<0.05). Each neuron contributed 3 points to these panels, one for each fixation position. B. Simulation of the expected R2 values of Gaussian and sigmoidal curves if the underlying functions are Gaussian (left) vs sigmoidal (right). Units were simulated as Gaussians or sigmoids of varying parameters with noise, then fits were calculated for each unit and plotted in color indicating the location of the peak (left panel) or inflection point (right panel). The examples illustrate individual units with different peak or inflection point locations. (See:
Response type | N | Gaussian and Sigmoid (p<0.05, %) | Gaussian only (%) | Sigmoid only (%) | Neither (%) | Bootstrap Gaussian > Sigmoid (p<0.05, %) |
Visual sensory | 477 | 83.6 | 10.5 | 0.0 | 5.9 | 34.6 |
Visual motor | 486 | 87.4 | 8.8 | 0.0 | 3.7 | 32.1 |
Auditory sensory | 366 | 54.9 | 10.1 | 1.1 | 33.9 | 4.4 |
Auditory motor | 477 | 83.4 | 9.2 | 0.4 | 6.9 | 10.7 |
Column 1: Each included neuron contributed three fits, one for each eye position. Columns 2−5: The proportion of Gaussian and sigmoidal curve fits that were individually significant. Column 6: The proportion of neuron-eye position combinations for which the observed Gaussian R2 value was significantly greater than 95% of the bootstrapped sigmoidal R2 values generated for that neuron and eye position; corresponds to the proportion of green data points in
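The Column 6 criterion can be written as a simple comparison between the observed Gaussian R2 and a set of bootstrapped sigmoidal R2 values. The sketch below assumes the bootstrapped values have already been computed by refitting resampled trials; the resampling helper and its `statistic` argument are illustrative assumptions about how such values might be generated.

```python
import random


def bootstrap_values(trials, statistic, n_boot, rng):
    """Recompute `statistic` on n_boot resamples of `trials`, drawn with replacement."""
    return [statistic([rng.choice(trials) for _ in trials]) for _ in range(n_boot)]


def gaussian_beats_sigmoid(gaussian_r2, boot_sigmoid_r2, fraction=0.95):
    """True if the observed Gaussian R2 exceeds at least `fraction` of the bootstrapped sigmoidal R2 values."""
    n_below = sum(1 for r2 in boot_sigmoid_r2 if r2 < gaussian_r2)
    return n_below / len(boot_sigmoid_r2) >= fraction
```

In use, `statistic` would be a function that fits a sigmoid to the resampled trials and returns its R2, so that each neuron and eye position yields its own reference distribution.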
To verify that this comparison between Gaussian and sigmoidal fitting can successfully distinguish between such response patterns, we tested the curve fitting procedure on simulated Gaussian and sigmoidal data plus noise (see Materials and Methods). For open-ended response patterns simulated with sigmoids, the sigmoidal and Gaussian curve fits were equally successful (the R2 values are essentially identical and the data lie along the line of slope one,
Could the apparent open-ended auditory response patterns in the actual neurons therefore be an artifact of failing to sample circumscribed auditory receptive fields at sufficiently eccentric locations? Several points argue against this interpretation. First, the visual and auditory targets occupied the same locations, so the sampling of visual and auditory space was identical. If the sampling was insufficient to observe circumscribed auditory receptive fields, it should also have been insufficient for visual receptive fields. Second, the sampling was matched to the range of space where circumscribed receptive fields should have been found if they existed. The targets spanned the portion of the oculomotor range of monkeys that is not normally accompanied by head movements
Nevertheless, we took two additional steps to address this question. First, for the sensory period, we expanded the sampling extent by including some non-saccade trials involving targets near or beyond the limits of the oculomotor range but still within the visual scene (148 neurons; targets at ±30°, ±42°, and ±60° in addition to the original target locations, corresponding to a range of ±72° in eye-centered coordinates). These trials made up 15.6 ± 10.4% (mean ± SD) of all trials and differed only in that the fixation light stayed on and no saccade was required, which allowed us to investigate the sensory period but not the motor period.
Results for an example neuron (A) tested out to 72° relative to the eye (ipsilateral fixation, interleaved non-saccade task). B. Population results, format similar to the corresponding panels of
Second, for the motor period, we corrected for any effects that systematic differences in visual vs. auditory saccade accuracy might have introduced to the sampling range. Auditory saccades can show some systematic biases like undershooting or upward shifts
Although the auditory response patterns were generally open-ended, they were not always perfectly monotonic. In some neurons, the responses for the most contralateral target were a little lower than they were for sounds at more intermediate locations (
An example neuron showing a drop-off in responses at the most contralateral target positions (sensory responses shown) (A). We compared the responses at the peak location to the responses at the most contralateral location (black dots) and expressed the result as a Z-score (inset). Data for the ipsilateral fixation were used for this analysis. B. The distribution of Z scores for each modality (grey bars), in comparison to the Z scores expected if the relationship between activity and target location is scrambled (Monte-Carlo simulation, black bars). The dotted lines illustrate the 95% confidence threshold; real Z scores to the right of this point are considered to show statistically significant decrements in activity for more peripheral targets (p<0.05). C. The proportion of neurons showing significant non-monotonicity. D. Same as C, but for targets limited to different cut-off points in our sampling range. The disparity between visual and auditory non-monotonicity is present for all cut-offs, and only with a 36 degree cutoff for sound does the level of non-monotonicity reach that seen for a 12 degree cutoff for visual stimuli.
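The drop-off test described in this caption can be sketched as a Z-score computed against a scrambled-location null distribution. The exact Z-score formula is not given in this excerpt, so the version below (difference of means divided by the combined standard error) is an assumption, as are the function names and the dictionary layout of trials by target location.

```python
import math
import random
import statistics


def z_peak_vs_edge(trials_by_loc, edge_loc):
    """Z-score for the drop from the best location to the most contralateral one."""
    means = {loc: statistics.mean(v) for loc, v in trials_by_loc.items()}
    peak_loc = max(means, key=means.get)
    if peak_loc == edge_loc:
        return 0.0  # monotonic up to the edge: no drop-off
    peak, edge = trials_by_loc[peak_loc], trials_by_loc[edge_loc]
    se = math.sqrt(statistics.variance(peak) / len(peak)
                   + statistics.variance(edge) / len(edge))
    return (means[peak_loc] - means[edge_loc]) / se if se > 0 else 0.0


def scrambled_z(trials_by_loc, edge_loc, n_iter, rng):
    """Null Z-scores obtained by shuffling responses across target locations."""
    locs = [loc for loc, v in trials_by_loc.items() for _ in v]
    rates = [r for v in trials_by_loc.values() for r in v]
    null = []
    for _ in range(n_iter):
        shuffled = rates[:]
        rng.shuffle(shuffled)
        regrouped = {}
        for loc, rate in zip(locs, shuffled):
            regrouped.setdefault(loc, []).append(rate)
        null.append(z_peak_vs_edge(regrouped, edge_loc))
    return null


def percentile_95(values):
    """Approximate 95th percentile, used as the significance threshold."""
    ordered = sorted(values)
    return ordered[int(0.95 * (len(ordered) - 1))]
```

Because the peak is chosen after looking at the data, even scrambled responses tend to give positive Z-scores, which is why the observed value is compared against the simulated distribution rather than against zero.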
To determine how often the drop-off exceeded chance variation, we compared the activity at the most contralateral position with the activity at the best location (
These values mean little on their own, but can be compared to the expected distribution of the Z scores under chance. To calculate this distribution, we performed a Monte-Carlo simulation in which the actual target location was scrambled (
How is the location of a target extracted from the pattern of SC activity? The SC controls saccadic eye movements, which are generally similar regardless of target modality
As proof of concept of the first option, we conducted a simulation to determine if a candidate circuit for reading the SC’s visual signals might be able to read auditory signals as well. We employed a model in which the weighted sum of activity in the SC was calculated
A read-out model involving graded weights depending on the location of neurons in the SC (A). The weights were fit based on the motor activity on visual trials, combining all fixations and producing an eye-centered estimate of target location. B. Results of the simulation indicate that the model can successfully calculate target location from the input pattern, regardless of modality. C. Behavioral estimates of target location for visual and auditory trials (data from trials during the recording of the neurons). A slight compression of auditory space relative to visual space is seen both for the model (B) and the actual behavior (C).
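A weighted-sum read-out of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' model: the simulated units, their Gaussian (visual) and sigmoidal (auditory) tuning, and the use of plain batch gradient descent to fit the weights are all hypothetical choices. Each unit is included twice, once as sampled and once mirror-flipped, and the weights are fit on visual activity only.

```python
import math


def visual_activity(pref, target, width=10.0):
    """Hypothetical circumscribed (Gaussian) motor field, peak normalized to 1."""
    return math.exp(-((target - pref) ** 2) / (2.0 * width ** 2))


def auditory_activity(pref, target, slope=8.0):
    """Hypothetical open-ended (sigmoidal) response rising toward the unit's side."""
    side = 1.0 if pref >= 0 else -1.0
    return 1.0 / (1.0 + math.exp(-side * target / slope))


def population(prefs, target, tuning):
    """Activity of every unit plus its mirror-flipped copy for one target."""
    return [tuning(p, target) for p in prefs] + [tuning(-p, target) for p in prefs]


def read_out(weights, activity):
    """Estimated saccade amplitude: weighted sum of activity across units."""
    return sum(w * a for w, a in zip(weights, activity))


def mse(weights, patterns, targets):
    return sum((read_out(weights, a) - s) ** 2
               for a, s in zip(patterns, targets)) / len(targets)


def fit_weights(patterns, targets, lr=0.05, epochs=2000):
    """Fit weights by batch gradient descent on squared error (assumed method)."""
    w = [0.0] * len(patterns[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for a, s in zip(patterns, targets):
            err = read_out(w, a) - s
            for i, ai in enumerate(a):
                grad[i] += err * ai / len(patterns)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


prefs = [6.0, 12.0, 18.0, 24.0, 30.0, 36.0]
targets = [float(t) for t in range(-24, 25, 6)]
visual_patterns = [population(prefs, t, visual_activity) for t in targets]
weights = fit_weights(visual_patterns, targets)
# The same weights can then be applied to the open-ended auditory patterns.
auditory_estimates = [read_out(weights, population(prefs, t, auditory_activity))
                      for t in targets]
```

The point of the sketch is only that a single set of weights, fit on one modality, can be applied unchanged to the other; how closely the auditory estimates track target location depends on the assumed tuning functions.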
The model successfully produced an estimate of visual target location from our recorded SC visual activity that scaled well with target location (Sigmoidal fit, R2 = 0.99,
The modality-dependent difference in the model’s output that does exist is subtle, but it is mirrored in the animals’ behavioral performance. The model produces a slight compression of space for sounds compared to visual stimuli (compare the grey and black lines in
The above simulation does not rule out a modality-dependent adjustment to the read-out algorithm, particularly in concert with other potential read-out algorithms (see ref.
Similarity of coding of visual and auditory space has long been assumed to be a prerequisite for integrating information from these two sensory systems. While many previous studies have tested the primate SC’s responses to sounds
Here we have shown that auditory-evoked activity in the SC involves a format different from that of visual-evoked activity in the same population of neurons. This format difference was evident in three different types of analyses: the "point image" of auditory-evoked compared to visually-evoked activity across the population (
The predominantly monotonic response patterns for auditory targets occur even in the motor-related activity, which likely is involved in programming saccades to both visual and auditory targets. In our sample, non-monotonicity of auditory response functions was slight although not completely absent. If the SC contained an auditory map of space, neurons with the closed field structure indicative of participating in such a map should be
Although our study focused on the response patterns of individual neurons, the impact of these individual response patterns on the aggregate population response can be visualized as shown schematically in
When a target is visual, a "hill" of activity will be evoked at a location in the SC that corresponds to the visual response field of the neurons. Visual stimuli at different locations would evoke activity at different sites in the SC (A, B, C left panels). In contrast, auditory stimuli at different locations will evoke activity across the SC but with different discharge rates (A, B, C right panels). We note that this schematic does not address the code for the vertical dimension, nor does it consider the possibility that the inflection points of auditory response functions might vary with location in the SC. If the latter is true, then the auditory code would show some topography, with the edge of a broad hill varying with target location.
The disparities we observed between the visual and auditory codes in the primate SC illustrate that bimodal populations of neurons can use different ways of representing different sensory signals. Such differences may account for subtle modality-dependent differences in the behavioral responses guided by such populations.
Exactly how the disparities are resolved as signals progress from the SC to the muscles controlling eye movements is not yet clear. Our modeling suggests that it is in principle possible for the circuit that "reads" the SC to produce a similar answer for a similar target location despite modality-dependent differences in the activity evoked. Further experimental work will be needed to determine if this is in fact what happens or if the circuitry intervening between the SC and the extraocular muscles interprets SC activity differently depending on target modality.
Receptive fields that do not fully close on the eccentric side have been reported in the superior colliculus before. Such response patterns have been most extensively characterized in the monkey SC by Munoz and Wurtz
Closed-field receptive fields do not appear to be required for coding locations in frontal, horizontal auditory space. Neurons with open-ended spatial receptive fields have been found in many mammalian brain areas including the superior olive
The emerging picture from a variety of species and brain areas suggests that spatial location may often be encoded via “meters” rather than maps. In a meter, the level of activity in a population of broadly-responsive neurons can provide an indication of the location of sounds
Finally, it should be noted that here we have only varied the
Two rhesus monkeys (
Specifically, prior to surgery, animals were pre-anesthetized with ketamine (I.M., 5−20 mg/kg) or ketamine/dexdomitor (I.M., 3.0 mg/kg ketamine and 0.075−0.15 mg/kg dexdomitor) and were maintained under general anesthesia with isoflurane (inhalant, 0.5−3.0%). Systemic anti-inflammatory medications (dexamethasone, flunixin, or ketorolac) were given as indicated by the veterinarian. After surgery, animals were treated with buprenorphine analgesic (I.M., 0.01−0.02 mg/kg doses) for three days.
Animals were housed in a standard macaque colony room in accordance with NIH guidelines on the care and use of animals. Specifically, the animals were housed in Primate Products Apartment Modules (91 cm*104 cm*91 cm), including pair or group housing when compatible partner monkeys were available. Pair and group housed animals exhibited species-typical prosocial behavior such as grooming. Animals also had frequent access to Primate Products Activity Modules (91 cm*104 cm*183 cm), allowing for more exercise including a greater range of vertical motion. All animals had visual and auditory interactions with conspecifics in the room (∼10 animals). Animals were enriched in accordance with the recommendations of the USDA Environmental Enhancement for Nonhuman Primates (1999), and the National Research Council’s Psychological Well-Being of Nonhuman Primates (1998), and the enrichment protocol was approved by the IACUC. Specifically, the animals had access to a variety of toys and puzzles (e.g. Bioserv dumbbells (K3223), Kong toys (K1000), Monkey Shine Mirrors (K3150), Otto Environmental foraging balls (model 300400) and numerous other toys and puzzles). Material from plants such as Bamboo and Manzanita was also placed in the cage to give the animals additional things to climb on and interact with. The temperature in the animal facilities was 20−25 degrees C and the colony room was kept on a 12hr/12hr light/dark cycle. The animals had approximately an hour of audiovisual contact with at least two (and often several) humans per day. The animals’ diets consisted of monkey food (LabDiet 5038 or LabDiet 5045) delivered twice a day, plus daily supplementary foods such as bananas, carrots, mango, pecan nuts, dried prunes, or other treats (typically acquired from local supermarkets or online vendors) to add variety to the animals’ diets. 
No animals were sacrificed during this study. At the time of submission of this manuscript, both animals that participated in this study are in good general health.
The subjects underwent sterile surgery for the implantation of a head post holder, eye coil and recording chamber
All experimental and behavioral training sessions were conducted in a dimly illuminated sound-attenuation room (IAC, single-walled) lined on the inside with sound-absorbing foam (Sonex PaintedOne). Stimulus presentation and data collection were controlled through Gramalkn 2.0 software (Ryklin Software, developed from the laboratory of Dr. Paul Glimcher). Eye position was sampled at 500 Hz. EyeMove (written by Kathy Pearson, from the laboratory of Dr. David Sparks) was used to analyze the eye position traces. The velocity criterion for detecting saccades was 20 degrees/s for both saccade onset and offset. All subsequent analysis was performed in Matlab 7.1 (Mathworks software).
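A velocity-threshold saccade detector of the kind described here can be sketched as follows. This is an illustration only, assuming a one-dimensional eye position trace in degrees sampled at 500 Hz and a simple first-difference velocity estimate; it is not the EyeMove implementation, whose internals are not described in the text.

```python
def detect_saccade(eye_pos_deg, fs_hz=500.0, threshold_deg_s=20.0):
    """Return (onset_index, offset_index) of the first saccade, or None if none is found.

    Velocity is estimated as the first difference of position times the
    sampling rate. Onset is the first sample whose speed exceeds the
    threshold; offset is the first later sample whose speed falls below it
    (None if the trace ends first).
    """
    velocity = [(b - a) * fs_hz for a, b in zip(eye_pos_deg, eye_pos_deg[1:])]
    onset = next((i for i, v in enumerate(velocity)
                  if abs(v) > threshold_deg_s), None)
    if onset is None:
        return None
    offset = next((i for i in range(onset + 1, len(velocity))
                   if abs(velocity[i]) < threshold_deg_s), None)
    return onset, offset
```

At 500 Hz each sample spans 2 ms, so sample indices convert to milliseconds by multiplying by 2. Real traces would normally be smoothed before differencing, since a 20 degrees/s threshold is easily crossed by measurement noise.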
Sensory targets were presented from a stimulus array which was 58 inches in front of the monkey. The array contained nine speakers (Audax, Model TXO25V2) with a light-emitting diode (LED) attached to each speaker’s face. The speakers were placed from 24° left to 24° right of the monkey in 6° increments at an elevation of 0° (
The monkey performed an overlap saccade task to auditory and visual targets, in which all conditions were randomly interleaved (
At the start of each recording session, a stainless-steel guide tube was manually advanced through the dura. Next, the monkeys performed the overlap saccade task while a tungsten electrode (FHC, impedance between 2 and 4.5 MΩ at 1 kHz) was extended further into the brain with an oil hydraulic pulse micropositioner (Narishige-group.com). Extra-cellular neural signals were amplified and action potentials from single neurons were isolated using a PLEXON system (Sort Client software, PLEXON.INC). The time of occurrence of each action potential was stored for off-line analysis.
When a neuron was isolated, we qualitatively determined the elevation of the receptive or movement field while monkeys performed the overlap task. The elevation of the fixation targets was chosen to be near the preferred elevation of the neuron in that session. The target modalities (auditory vs. visual), the locations of fixation, and the locations of targets were randomly varied on a trial-by-trial basis. Data were collected as long as the neuron was well isolated and the monkey performed the task. On average, we collected 11.16 ± 5.41 (mean ± SD) successful trials per task condition (fixation location x target location x target modality).
We analyzed neural activity by counting action potentials during several time periods. The baseline period was the 500 ms before target onset, and the sensory target period was the 500 ms period after target onset (
Sigmoidal and Gaussian curve fitting was accomplished as follows. Both curves had the same number of free parameters (i.e. 4). The Gaussian equation was
And the sigmoidal equation was
The sigmoidal and Gaussian curve fitting was accomplished using Matlab and the “lsqnonlin” function, which involves an iterative search to minimize the least squares error of the function. We found the optimal curve fits using a variety of initial starting conditions. Each curve fit was also visually inspected for adequacy.
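The multi-start fitting procedure can be illustrated with a small self-contained sketch. Several things here are assumptions: the two four-parameter functional forms (the equations themselves are not reproduced in this excerpt), the data-driven choice of starting points, and the use of a simple pattern search in place of Matlab's `lsqnonlin`. The sketch shows only the general approach of minimizing squared error from several starting conditions and keeping the best fit.

```python
import math


def gaussian(x, p):
    """Assumed form: baseline + amplitude * exp(-(x - center)^2 / (2 * width^2))."""
    base, amp, center, width = p
    return base + amp * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def sigmoid(x, p):
    """Assumed form: baseline + amplitude / (1 + exp(-(x - inflection) / slope))."""
    base, amp, inflection, slope = p
    z = max(-60.0, min(60.0, (x - inflection) / slope))  # clamp to avoid overflow
    return base + amp / (1.0 + math.exp(-z))


def sse(model, p, xs, ys):
    return sum((model(x, p) - y) ** 2 for x, y in zip(xs, ys))


def r_squared(model, p, xs, ys):
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse(model, p, xs, ys) / ss_tot


def pattern_search(model, start, xs, ys, iters=200):
    """Derivative-free least-squares search; only error-reducing moves are accepted."""
    p = list(start)
    steps = [max(abs(v), 1.0) * 0.25 for v in p]
    best = sse(model, p, xs, ys)
    for _ in range(iters):
        improved = False
        for i in range(len(p)):
            for sign in (1.0, -1.0):
                trial = list(p)
                trial[i] += sign * steps[i]
                if i == 3 and abs(trial[3]) < 1e-3:
                    continue  # keep width/slope away from zero
                err = sse(model, trial, xs, ys)
                if err < best:
                    p, best, improved = trial, err, True
        if not improved:
            steps = [s * 0.5 for s in steps]
    return p, best


def fit_multistart(model, xs, ys, scales=(3.0, 6.0, 12.0, 24.0)):
    """Run the search from several starting conditions and keep the best fit."""
    base0 = min(ys)
    amp0 = max(ys) - base0
    x_peak = xs[ys.index(max(ys))]
    fits = [pattern_search(model, [base0, amp0, center, scale], xs, ys)
            for center in (x_peak, 0.0) for scale in scales]
    return min(fits, key=lambda fit: fit[1])
```

For a circumscribed response sampled on both flanks, the Gaussian fit should reach a clearly higher R2 than the sigmoid, since no monotonic function can follow both the rise and the fall. For open-ended data the two forms are expected to fit about equally well, which is the contrast the R2 comparison relies on.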
The intermediate and deep layers of the SC provided the bulk of the recorded neurons. Neurons were included for the relevant analysis if they responded significantly during the sensory or motor-related periods on visual or auditory trials compared to baseline (two-tailed paired t test, p<0.05). For this t-test, all target locations were pooled together because this proved to be the most inclusive criterion. The majority of neurons showed significant responses to both visual and auditory targets during the saccade-related bursts (
To assess the locations of the recording sites with respect to the SC’s motor map, we microstimulated using standard techniques after recording at 19 sites in monkey P. These sites were distributed in 5 out of the 6 guide tube locations which we used for recording from monkey P. During each microstimulation session, the monkey performed a task involving fixating one of three initial eye positions (−12°, 0°, and 12° from the center). The vertical position of the fixation LEDs was the same as in the immediately prior recording session. After fixating for 150 ms, the fixation LED was turned off and a stimulation train was applied for 150 ms. To allow the subject to earn a reward unconnected to any evoked saccade, a visual saccade target was presented 300 ms after the stimulation (20° above fixation and −12, 0 and 12° to the left or right). For comparison purposes, catch trials, identical but without stimulation, were presented 20% of the time.
The circumscribed receptive fields shown in
Each neuron was included in the model twice, as originally sampled and mirror flipped as if it were recorded in the SC on the opposite side. A training set to establish weights was created from
where S is the amplitude of the saccade, and wi and ai are the synaptic weight and motor activity level of the
The authors wish to thank Jessi Cruger and Tom Heil for technical assistance and Prof. Rich Mooney and Dr. Valeria Caruso for valuable discussion.