Subpopulations of neurons in the perirhinal cortex enable both modality-specific and modality-invariant recognition of objects

Heung-Yeol Lim and Inah Lee, Department of Brain and Cognitive Sciences, Seoul National University, Seoul
Date: 2024-07
Originally published in PLOS Biology (https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3002713) under a Creative Commons Attribution (CC BY 4.0) license.

Abstract

The perirhinal cortex (PER) supports multimodal object recognition, but how multimodal information of objects is integrated within the PER remains unknown. Here, we recorded single units within the PER while rats performed a PER-dependent multimodal object-recognition task. In this task, audiovisual cues were presented simultaneously (multimodally) or separately (unimodally). We identified 2 types of object-selective neurons in the PER: crossmodal cells, showing constant firing patterns for an object irrespective of its modality, and unimodal cells, showing a preference for a specific modality. Unimodal cells further dissociated unimodal and multimodal versions of the object by modulating their firing rates according to the modality condition. A population-decoding analysis confirmed that the PER could perform both modality-invariant and modality-specific object decoding: the former for recognizing an object as the same across conditions and the latter for remembering modality-specific experiences of the same object.
Introduction

Our brains can effortlessly integrate information from different sensory modalities to form a unified representation of the world [1,2]. This natural ability is also evident during object recognition, as one can quickly identify a music box by visually perceiving its distinctive appearance or by hearing its original sound. The ability to recognize objects crossmodally has been reported not only in humans but also in nonhuman primates [3,4], rodents [5–7], dolphins [8], and even insects [9]. However, most studies of object recognition have neglected the multisensory nature of this process. Object recognition has been studied primarily using unimodal stimuli, such as visual stimuli [10–12], or using uncontrolled multimodal stimuli, such as 3D “junk” objects [13,14], without a specific goal of investigating multimodal processing. This tendency is also evident in studies of the perirhinal cortex (PER), a region well known to play a critical role in object recognition [15–20].

Findings from several studies have implied that the PER is engaged in “multimodal” object recognition. Anatomically, the PER receives inputs from areas that process diverse sensory modalities, including the visual, auditory, olfactory, and somatosensory cortices [21–23]. In rodents, in particular, these areas are known to send monosynaptic inputs to the PER [22]. Experimental results further support the involvement of the PER in multimodal object recognition. In human functional magnetic resonance imaging (fMRI) studies in which subjects were presented with visual-auditory or visual-tactile features drawn from either the same (congruent) or different (incongruent) objects, activity within the PER was greater when the 2 stimuli were congruent [24,25]. The necessity of the PER for multimodal object recognition has also been tested using crossmodal versions of a delayed nonmatch-to-sample task in nonhuman primates [4] and a spontaneous object-recognition task in rodents [5–7]. In these tasks, animals sampled an object using one sensory modality (e.g., tactile) and were then tested for retrieval of object information using a previously unused sensory modality (e.g., visual); lesioning or inactivating the PER resulted in performance deficits. These results indicate the involvement of the PER in multimodal object recognition, but the mechanisms underlying these functions remain largely unknown.

We hypothesized that the PER may support multisensory object recognition by integrating multimodal inputs from an object to form a unified representation of that object. Considering the associative nature of the PER [26–29], the region can be expected to integrate information from multiple senses rather than processing each separately. Indeed, it has been shown that PER neurons do not represent individual sensory attributes separately in rats performing behavioral tasks using multimodal stimuli [30,31]. However, these studies have reported only neural correlates of behavioral responses or rewards associated with objects rather than actual information about the objects themselves. Accordingly, in the current study, we investigated how multimodal information is integrated to create a unified representation of an object while minimizing the influence of other task-related variables, such as behavioral response or reward outcome.

To test the abovementioned hypothesis, we developed a multimodal object-recognition task for rats employing visual and auditory cues. By requiring a nose-poke during object cue sampling, the task allowed a clear definition of sample and response phases while allowing their neural firing correlates to be observed in a temporally controlled manner. Our findings suggest that rats can recognize a familiar object (originally learned multimodally) almost immediately when cued by a unimodal sensory attribute alone (e.g., visual or auditory) without additional learning. However, inactivating the PER resulted in performance deficits in both multimodal and unimodal recognition conditions. Physiologically, we discovered that the selective firing pattern for an object was comparable regardless of the stimulus modality in most PER neurons. However, a significant proportion of neurons also showed a preference for a specific sensory modality during object information processing. A population-decoding analysis revealed that these subpopulations of neurons enabled both modality-specific and modality-invariant recognition of objects.

Results

Rats can perform multimodal object-recognition task

To test multimodal object recognition while controlling the sampling of the object’s unimodal (i.e., visual and auditory) attributes, we developed a behavioral paradigm for rats that would enable stable, simultaneous sampling of multimodal cues (Fig 1A; see S1 Video). The task tested object recognition by requiring the animals to identify a given object and produce the choice response associated with that object. The apparatus consisted of 3 ports: the center port was used to measure nose-poking by the rats to sample the cues, while the 2 side ports were used to obtain rewards through choice responses. In the sample phase, rats triggered the onset of an audiovisual cue (e.g., an image of a boy-shaped object with a 5 kHz sine-wave tone) by nose-poking the center hole and were required to maintain their nose-poke for at least 400 ms. This nose-poking behavior was trained during the shaping stage without a cue (see Methods for details).
If a rat failed to maintain its nose-poke for 400 ms, the trial was stopped and the rat was allowed to retry the nose-poke after a 4-s interval (S1 Fig). After a successful (>400 ms) nose-poke, the cues disappeared, and the doors covering the left and right choice ports were opened simultaneously. In the response phase, rats were required to choose either the left or the right port based on the sampled cue. In most trials, rats completed their choice responses within 600 ms (S2A Fig). A food reward was provided only after a correct choice response (reward phase), followed by a 2-s inter-trial interval.

Fig 1. Multimodal object-recognition task. (A) Illustration of the apparatus and the trial structure of the multimodal object-recognition task. Rats sampled visual and auditory cues simultaneously or separately for 400 ms (sample phase) and then made a choice response based on the identity of the cue (response phase). A correct choice response resulted in a food reward (reward phase). (B) Object conditions used in the multimodal object-recognition task. Two different objects (Boy and Egg) were presented in 3 different modality conditions: multimodal (VA), visual (V), and auditory (A). The correct choice response was determined by the identity of the object. (C) Two simple visual cues were introduced as control (C) stimuli. Each control stimulus was also associated with either the left (C-L) or right (C-R) choice response (i.e., the same responses required by the object conditions). https://doi.org/10.1371/journal.pbio.3002713.g001

To test the rats’ ability to recognize objects through multiple sensory modalities, we presented 2 different objects, Boy and Egg, consisting of different combinations of visual (images of a boy-shaped and an egg-shaped toy) and auditory (5 and 10 kHz sine-wave tones) attributes during the sample phase (Fig 1B). Objects were tested under 3 modality conditions: multimodal, visual, and auditory. In the multimodal condition, the visual and auditory cues associated with an object were presented simultaneously during the sample phase. In the unimodal (visual or auditory) conditions, only the object’s visual or auditory attribute was presented as the cueing stimulus. If the rat responded correctly to the object’s identity regardless of the modality condition, it was rewarded with a piece of cereal. The combinations of audiovisual cues and the stimulus-response contingencies were counterbalanced across rats. In the control condition, rats learned to dissociate 2 simple visual stimuli composed of black and gray bars (Fig 1C). The required left and right choice responses in this condition were the same as those in the object conditions; the control condition was introduced primarily so that neurons responding to a specific choice response could be excluded from the neural data analysis. In sum, 8 stimulus conditions were used in this task: 6 object conditions (2 objects × 3 modality conditions) and 2 control conditions.
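The structure of the 8 conditions can be summarized compactly. The sketch below is illustrative only: the condition labels, cue names, and the left/right assignment are hypothetical placeholders (the actual cue-response contingencies were counterbalanced across rats), but it captures the 2 objects × 3 modality conditions plus the 2 control conditions described above.

    # Illustrative mapping of the 8 stimulus conditions to required choice responses.
    # Cue names and the left/right assignment are hypothetical placeholders.
    OBJECTS = {"Boy": {"visual": "boy_image", "auditory": "5kHz_tone"},
               "Egg": {"visual": "egg_image", "auditory": "10kHz_tone"}}
    MODALITY_CONDITIONS = ["VA", "V", "A"]                 # multimodal, visual-only, auditory-only
    RESPONSE_FOR_OBJECT = {"Boy": "left", "Egg": "right"}  # hypothetical assignment

    def build_conditions():
        """Return all 8 conditions: 2 objects x 3 modalities, plus 2 controls."""
        conditions = []
        for obj, cues in OBJECTS.items():
            for modality in MODALITY_CONDITIONS:
                presented = {"visual": cues["visual"] if modality in ("VA", "V") else None,
                             "auditory": cues["auditory"] if modality in ("VA", "A") else None}
                conditions.append({"label": f"{obj}-{modality}",
                                   "cues": presented,
                                   "correct_response": RESPONSE_FOR_OBJECT[obj]})
        # Two simple visual control stimuli, each tied to one side.
        conditions.append({"label": "C-L", "cues": {"visual": "control_bar_1"}, "correct_response": "left"})
        conditions.append({"label": "C-R", "cues": {"visual": "control_bar_2"}, "correct_response": "right"})
        return conditions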
The PER is required for multimodal object recognition

To test whether rats were able to perform the task when encountering the unimodal version of the multimodal condition for the first time, and whether this ability depends on the PER, we conducted a drug-inactivation experiment (n = 6). After training in the multimodal and control conditions, rats were sequentially tested in separate sessions under multimodal, visual, auditory, and control conditions (Fig 2A). The order of the visual and auditory sessions was counterbalanced across rats. For each condition, we first established baseline performance by injecting vehicle control (phosphate-buffered saline [PBS]) into the PER; we then tested performance after inactivating the PER by injecting muscimol (MUS) bilaterally. Importantly, the PBS-injected visual (V1) and auditory (A1) sessions (Fig 2A) marked the first instances in which rats were required to recognize objects originally learned multimodally solely from their unimodal sensory attributes. In each unimodal object-recognition session, objects were presented multimodally (visual and auditory cues together) for the first 20 trials and then unimodally (visual or auditory alone) for the remaining 100 trials, for a total of 120 trials per session.

Fig 2. Necessity of the PER for multimodal object recognition. (A) Illustration of behavioral training and testing schedules for the PER-inactivation experiment. Note that animals were subjected to either the visual or the auditory condition for the first time in the PBS-injected visual (V1) or auditory (A1) sessions. (B) Estimated learning in the V1 (left) and A1 (right) sessions of an example rat. At trial 21, where the visual or auditory condition was first introduced, the rat quickly adapted without additional learning. (C) On average, correctness did not change significantly across trials within the V1 (left) or A1 (right) session, indicating that rats could perform unimodal retrieval without additional learning. Each trial block consisted of 20 trials. (D) Histological verification of injection sites in the PER. White dotted lines indicate the border of the PER. The numbers on each section indicate the distance from bregma. (E) Summary of cannula-tip locations in all rats. (F) Behavioral performance in each condition was compared between PBS and MUS sessions. Performance was significantly impaired in all object conditions (VA, V, and A) by inactivation of the PER but remained intact in the control (C) condition. (G) The median latency did not change significantly after inactivating the PER. (H) The average number of nose-poke attempts did not change significantly after inactivating the PER. Data are presented as means ± SEM (n = 6; *p < 0.05, #p = 0.062; n.s., not significant). Source data are available in S1 Data. PBS, phosphate-buffered saline; PER, perirhinal cortex; SEM, standard error of the mean. https://doi.org/10.1371/journal.pbio.3002713.g002

The performance dynamics of PBS-injected rats in the visual and auditory sessions were displayed as learning curves estimated within each session (Fig 2B). Upon first encountering the visual or auditory condition (trial 21), rats showed no significant drop in performance, and their performance remained stable until the end of the session. A statistical analysis of all PBS-injected rats revealed no significant increase or decrease in performance across trial blocks (B1 to B6; 20 trials each) in either the visual (F(5,25) = 0.95, p = 0.47) or the auditory (F(5,25) = 0.22, p = 0.95; one-way repeated measures ANOVA) sessions (Fig 2C). These results indicate that rats readily recognized an object originally learned multimodally using one of its unimodal attributes and that this crossmodal recognition process required minimal training.
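As a rough illustration of the block-wise analysis described above, the sketch below bins each rat's trial-by-trial correctness into 20-trial blocks and runs a one-way repeated-measures ANOVA across blocks. It is a minimal sketch under assumed data structures (a per-rat array of 0/1 trial outcomes); the variable names and the use of statsmodels' AnovaRM are illustrative and not the authors' actual analysis code.

    import numpy as np
    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    def block_accuracy(outcomes, block_size=20):
        """Bin a 1D array of trial outcomes (1 = correct, 0 = error) into block means."""
        outcomes = np.asarray(outcomes, dtype=float)
        n_blocks = len(outcomes) // block_size
        return outcomes[:n_blocks * block_size].reshape(n_blocks, block_size).mean(axis=1)

    def blockwise_rm_anova(outcomes_by_rat, block_size=20):
        """One-way repeated-measures ANOVA on block accuracy (block as the within-subject factor)."""
        rows = []
        for rat_id, outcomes in outcomes_by_rat.items():
            for block_idx, acc in enumerate(block_accuracy(outcomes, block_size), start=1):
                rows.append({"rat": rat_id, "block": block_idx, "accuracy": acc})
        df = pd.DataFrame(rows)
        return AnovaRM(df, depvar="accuracy", subject="rat", within=["block"]).fit()

    # Example usage with simulated data for 6 rats, 120 trials each:
    # rng = np.random.default_rng(0)
    # fake = {f"rat{i}": rng.binomial(1, 0.8, size=120) for i in range(6)}
    # print(blockwise_rm_anova(fake))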
To verify the necessity of the PER for the task, we examined the effect of MUS injection on task performance. Histological results confirmed that MUS was successfully injected bilaterally into the PER (Fig 2D and 2E). The average performance of rats (n = 6) in PBS sessions was significantly higher than chance (50%) in all conditions: multimodal (t(5) = 21.2, p < 0.0001), visual (t(5) = 7.8, p = 0.0005), auditory (t(5) = 13.1, p < 0.0001), and control (t(5) = 29.3, p < 0.0001), as determined by one-sample t tests. Inactivating the PER with MUS significantly decreased performance (F(1,5) = 165.4, p = 0.0006; two-way repeated measures ANOVA) (Fig 2F). The interaction between drug and stimulus condition was not significant (F(3,5) = 1.99, p = 0.16; two-way repeated measures ANOVA). Further investigation of the inactivation effect revealed performance deficits in the multimodal (t(5) = 3.72, p = 0.028), visual (t(5) = 2.39, p = 0.062), and auditory (t(5) = 3.45, p = 0.027) conditions (paired t tests with Holm–Bonferroni correction), but not in the control condition (t(5) = 0.37, p = 0.36; paired t test). Trial latency (i.e., from trial onset to the end of the choice) was not significantly affected by MUS injection (F(1,5) = 0.13, p = 0.73; two-way repeated measures ANOVA) (Fig 2G). Nose-poking behavior was also unaffected by PER inactivation, as the average number of nose-poke attempts did not differ significantly between PBS and MUS sessions (F(1,5) = 0.92, p = 0.38; two-way repeated measures ANOVA) (Fig 2H). Collectively, these results demonstrate that the PER is necessary for object recognition in all modality conditions and that the decrease in performance is not attributable to deficits in general motor skills or to a loss of motivation.

Both visual and auditory information-processing modes are found during object-selective firing in the PER

If PER neurons encoded only the identity of an object and its associated behavioral response, object-selective firing patterns should remain constant irrespective of the modality condition. Conversely, it could be argued that, for episodic memory, it is crucial to distinguish between experiences of an object according to the modality through which it was encountered. To determine whether PER object cells can encode a particular sensory modality, we applied multiple linear regression to firing rates during the object-selective epoch (see Methods for details). In this regression model, β1 and β2 are regression coefficients representing the visual and auditory responsiveness, respectively, for the preferred object (i.e., the object condition with higher firing rates). Visual and auditory information-processing neurons within the PER were identified based on the relationship between β1 and β2 (Fig 5A). An example of an object cell that fired predominantly for the visual attribute of Boy is cell #7 (Fig 5A-ii), which had higher firing rates in the multimodal and visual conditions than in the auditory condition. This pattern is reflected in its higher β1 versus β2 value (p < 0.05; two-sided permutation test) (Fig 5A-iii). Cell #8, on the other hand, was responsive to the auditory attribute of Boy, as its firing rates in the multimodal and auditory conditions were higher than those in the visual condition (Fig 5A-ii); it also had a higher β2 than β1 value (p < 0.05; two-sided permutation test) (Fig 5A-iii). A crossmodal cell type, which, unlike the unimodal cell type described above, exhibited no significant preference for a particular sensory modality, was also observed (Fig 5B).
An example of a crossmodal cell is cell #9, which exhibited nearly equal firing in response to both sensory modalities of its preferred object (Boy) (Fig 5B-ii); its β1 and β2 values were also similar (p ≥ 0.05; two-sided permutation test) (Fig 5B-iii).

Fig 5. Unimodal and crossmodal response patterns of object cells in the PER. (A) Examples of unimodal cells that were responsive to either the visual or the auditory attribute of an object during the selective epoch. Spike density functions (i) and mean firing rates within the object-selective epoch (ii). Multiple linear regression was applied to firing rates within the object-selective epoch to obtain β1 and β2, regression coefficients reflecting the magnitude of visual and auditory responses, respectively (iii). Cell #7 mainly responded to the visual attribute of Boy (β1 > β2), whereas cell #8 was responsive to the auditory attribute of Boy (β1 < β2). (B) Spike density functions (i), mean firing rates (ii), and regression coefficients (iii) of a crossmodal cell. The cell showed no specific bias for visual or auditory information processing, as indicated by its similar β1 and β2 values. (C) Scatter plot and histograms of visual (β1) and auditory (β2) coefficients for all object cells. Neurons were classified as visual (cyan) or auditory (pink) cells if the difference between the visual and auditory coefficients was significant; the others were classified as crossmodal cells (gray). (D) Visual and auditory coefficients of all object-selective cells were not significantly different. Each line indicates an individual object cell. (E) Proportions of visual, auditory, and crossmodal neurons within the object-cell category. Visual and auditory cells were grouped together as the unimodal cell type. The numbers in parentheses denote the number of neurons. (F) Anatomical locations of object cells along the anteroposterior axis of the PER and their unimodal (or crossmodal) response patterns. The difference between β1 and β2 did not exhibit a significant linear relationship with the anatomical locations of the cells. The dotted black line indicates the linear regression line, and the shaded area is the 95% confidence interval (n.s., not significant). Source data are available in S1 Data. PER, perirhinal cortex. https://doi.org/10.1371/journal.pbio.3002713.g005

To illustrate the patterns of modality correlates, we created a scatter plot of the β1 and β2 values of all object cells (Fig 5C). We first verified that the PER did not preferentially process one of the sensory modalities by comparing β1 and β2 across all object cells (Fig 5D). This analysis showed no significant difference between β1 and β2 (W = 4794, p = 0.13; Wilcoxon signed-rank test), indicating that the PER did not have a significant bias toward a specific sensory modality. We then classified neurons based on the difference between their β1 and β2 values: neurons whose β1 values were significantly higher than their β2 values were classified as visual cells, whereas those with significantly higher β2 than β1 values were classified as auditory cells (α = 0.05; two-sided permutation test). The remaining object cells, which exhibited similar firing across the visual and auditory conditions, were classified as crossmodal cells. Although the majority of object cells were categorized as crossmodal (68%), both auditory cells (18%) and visual cells (14%) were identified (Fig 5E).
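To make the classification procedure concrete, the sketch below fits a per-neuron regression of epoch firing rate on indicator variables for the presented modalities and then compares the two coefficients with a two-sided permutation test. The model form (rate = β0 + β1·visual + β2·auditory), the trial data structure, and the shuffling scheme are assumptions made for illustration; they are not the authors' exact implementation (see the paper's Methods).

    import numpy as np

    def fit_modality_coefficients(rates, visual_on, auditory_on):
        """Least-squares fit of rate ~ b0 + b1*visual + b2*auditory across trials of the
        neuron's preferred object; visual_on/auditory_on are 0/1 indicators per trial."""
        rates = np.asarray(rates, dtype=float)
        X = np.column_stack([np.ones(len(rates)),
                             np.asarray(visual_on, dtype=float),
                             np.asarray(auditory_on, dtype=float)])
        beta, *_ = np.linalg.lstsq(X, rates, rcond=None)
        return beta[1], beta[2]

    def classify_object_cell(rates, visual_on, auditory_on, n_perm=1000, alpha=0.05, seed=0):
        """Label a neuron 'visual', 'auditory', or 'crossmodal' by testing whether the
        observed b1 - b2 difference exceeds a null built by shuffling modality labels."""
        rng = np.random.default_rng(seed)
        b1, b2 = fit_modality_coefficients(rates, visual_on, auditory_on)
        observed = b1 - b2
        labels = np.column_stack([visual_on, auditory_on])
        null = np.empty(n_perm)
        for i in range(n_perm):
            shuffled = labels[rng.permutation(len(labels))]
            nb1, nb2 = fit_modality_coefficients(rates, shuffled[:, 0], shuffled[:, 1])
            null[i] = nb1 - nb2
        p = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)  # two-sided p-value
        if p < alpha:
            return "visual" if observed > 0 else "auditory"
        return "crossmodal"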
The small difference between the proportions of visual and auditory cells was not statistically significant (χ2 = 0.89, p = 0.34; chi-square test). Detailed comparisons of selectivity patterns revealed that auditory cells exhibited stronger selectivity in the sample phase and that their selective period was longer than that of visual cells (U = 388.5, p = 0.03; Mann–Whitney U test) (S5 Fig). These findings suggest that modality information processing within the PER is heterogeneous, potentially enabling the retrieval of both an object's identity and its associated modality information.

Since the PER receives direct inputs from the visual and auditory cortices [22,23], it is possible that the activity of visual and auditory cells in the PER is driven solely by inputs from these sensory cortices. If so, the posterior PER, where visual inputs are relatively dominant, might contain more visual cells, whereas the anterior PER, which receives more auditory inputs, might contain more auditory cells. To test this hypothesis, we examined the relationship between the anatomical locations of cells along the anteroposterior axis of the PER and the differences between their visual (β1) and auditory (β2) coefficients (Fig 5F). We found no evidence for a regional bias in coefficients toward the posterior PER that would indicate a dominance of visual over auditory processing. Instead, visual and auditory cell types were evenly distributed along the anteroposterior axis of the PER. There was also no significant relationship between the anatomical location and the peak selectivity time of each neuron (S7 Fig). These results suggest that the activities of visual and auditory cells in the PER do not rely solely on inputs from the visual and auditory cortices, respectively.
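The anatomical-gradient test described above can be illustrated with a simple linear regression of each cell's coefficient difference on its recording position along the anteroposterior axis. The sketch below uses scipy.stats.linregress under assumed inputs (arrays of anteroposterior coordinates and β1 − β2 differences); it is illustrative rather than the authors' actual analysis.

    import numpy as np
    from scipy import stats

    def anatomical_gradient_test(ap_positions_mm, beta_diffs):
        """Regress (beta1 - beta2) on anteroposterior position (e.g., mm from bregma).
        A significant slope would indicate that visual- or auditory-preferring cells
        cluster toward one end of the PER."""
        ap = np.asarray(ap_positions_mm, dtype=float)
        diff = np.asarray(beta_diffs, dtype=float)
        result = stats.linregress(ap, diff)
        return {"slope": result.slope, "r": result.rvalue, "p": result.pvalue}

    # Example usage with simulated, uninformative data:
    # rng = np.random.default_rng(1)
    # print(anatomical_gradient_test(rng.uniform(-7.5, -3.5, 100), rng.normal(0, 1, 100)))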