(C) PLOS One. This story was originally published by PLOS One and is unaltered.
Integrated neural dynamics of sensorimotor decisions and actions [1]
Authors: David Thura, Jean-François Cabana, Albert Feghaly, Paul Cisek. Listed affiliation: Groupe de Recherche sur la Signalisation Neurale et la Circuiterie, Department of Neuroscience, Université de Montréal, Montréal, Québec.
Date: 2022-12
Recent theoretical models suggest that deciding about actions and executing them are not implemented by completely distinct neural mechanisms but are instead two modes of an integrated dynamical system. Here, we investigate this proposal by examining how neural activity unfolds during a dynamic decision-making task within the high-dimensional space defined by the activity of cells in monkey dorsal premotor (PMd), primary motor (M1), and dorsolateral prefrontal cortex (dlPFC) as well as the external and internal segments of the globus pallidus (GPe, GPi). Dimensionality reduction shows that the four strongest components of neural activity are functionally interpretable, reflecting a state transition between deliberation and commitment, the transformation of sensory evidence into a choice, and the baseline and slope of the rising urgency to decide. Analysis of the contribution of each population to these components shows meaningful differences between regions but no distinct clusters within each region, consistent with an integrated dynamical system. During deliberation, cortical activity unfolds on a two-dimensional “decision manifold” defined by sensory evidence and urgency and falls off this manifold at the moment of commitment into a choice-dependent trajectory leading to movement initiation. The structure of the manifold varies between regions: In PMd, it is curved; in M1, it is nearly perfectly flat; and in dlPFC, it is almost entirely confined to the sensory evidence dimension. In contrast, pallidal activity during deliberation is primarily defined by urgency. 
We suggest that these findings reveal the distinct functional contributions of different brain regions to an integrated dynamical system governing action selection and execution. Funding: This work was supported by Canadian Institutes of Health Research ( https://cihr-irsc.gc.ca/e/193.html ) grants MOP-102662 and PJT-166014, the Canadian Foundation for Innovation ( https://www.innovation.ca/ ), Fonds de Recherche en Santé du Québec ( https://frq.gouv.qc.ca/en/ ), the EJLB Foundation ( www.ejlb.qc.ca ) to PC, and fellowships from the FYSSEN Foundation ( http://www.fondationfyssen.fr/en/ ) and the Groupe de Recherche sur le Système Nerveux Central ( https://www.grsnc.org/home ) to DT. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Here, we test this proposal by examining activity across all of the regions we recorded in the tokens task (PMd, M1, GPe, GPi, and dlPFC), but without a priori classifying cells into putative functional categories. Instead, we use the neural space approach pioneered in recent years [ 59 – 74 ], in which the entire system is described as a point in a very high-dimensional space defined by the activity of all recorded cells, and then reduced into a lower-dimensional representation that reveals the main factors governing cell activity across the system. We then perform specific analyses to characterize how activity unfolds during deliberation and commitment, comparing the dynamics of different regions, and quantify to what extent cell properties cluster into distinct functionally interpretable roles. Previous studies have used similar techniques to examine whether specific neural populations exhibit distinct [ 75 ] versus mixed selectivity [ 76 – 78 ] to various task-related variables. Here, we ask about the relationships between decision-making and movement. 
In particular, models with serial decision and action stages predict that neural activity will be separable into components related to deliberation and other components related to movement initiation/execution. In contrast, recurrent attractor models predict a single population in which all of these factors are mixed in a continuum. Some of these results have previously appeared in abstract form [ 79 – 82 ]. Recent neural recordings [ 36 , 43 , 44 ] have largely supported these proposals, but many questions remain. Are “decision-related” neurons part of a module for choosing a target, which sends its output to a separate module of “movement-related” neurons? Is commitment determined by the crossing of a neural “threshold”? Do the basal ganglia contribute to the deliberation process [ 39 , 45 – 49 ], or do they simply reflect a choice taken in cortical regions [ 50 , 51 ] and contribute only to movement execution [ 52 ]? Answering these questions is difficult given the heterogeneity of cell properties [ 8 , 32 , 53 – 56 ] and their apparently continuous distribution along rostro-caudal gradients [ 54 , 57 ] or cortical layers [ 58 ]. This leads one to consider whether, instead of serial modules, action selection and execution are two modes of a unified recurrent system distributed across the frontoparietal cerebral cortex and associated basal ganglia/thalamic loops. Previous studies have shown that both human and monkey behavior in the tokens task is well explained by the “urgency-gating model” (UGM) [ 14 , 16 , 42 ], which suggests that during deliberation, the sensory evidence about each choice (provided by the token distribution) is continuously updated and combined with a nonspecific urgency signal, which grows over time in a block-dependent manner, and commitment to a given choice is made when the product of these reaches a threshold. 
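The urgency-gating rule described above can be illustrated with a minimal sketch. It assumes that each remaining token is equally likely to jump to either target, that evidence is the success probability for the right target minus 0.5, and that urgency is a linear function of the number of tokens seen. The linear form, the function names, and the parameter values are illustrative assumptions, not the fitted model from the cited studies.

```python
from math import comb

N_TOKENS = 15  # tokens per trial in the tokens task


def success_probability(n_right, n_left, n_tokens=N_TOKENS):
    """Probability that the right target ends with the majority of tokens.

    Assumes every remaining token jumps left or right with probability 0.5.
    """
    remaining = n_tokens - n_right - n_left
    needed = n_tokens // 2 + 1 - n_right  # extra tokens the right target needs
    if needed <= 0:
        return 1.0
    if needed > remaining:
        return 0.0
    favorable = sum(comb(remaining, k) for k in range(needed, remaining + 1))
    return favorable / 2 ** remaining


def urgency_gating(jumps, slope, baseline, threshold):
    """Return (choice, tokens_seen) for a sequence of jumps (+1 right, -1 left).

    Hypothetical linear urgency: baseline + slope * tokens_seen.
    Commitment occurs when |evidence * urgency| reaches the threshold.
    """
    n_right = n_left = 0
    for tokens_seen, jump in enumerate(jumps, start=1):
        if jump > 0:
            n_right += 1
        else:
            n_left += 1
        evidence = success_probability(n_right, n_left) - 0.5
        urgency = baseline + slope * tokens_seen
        drive = evidence * urgency
        if abs(drive) >= threshold:
            return ("right" if drive > 0 else "left"), tokens_seen
    return None, len(jumps)  # no commitment before the tokens ran out


# An "easy" trial in which every token jumps to the right target.
easy_trial = [1] * N_TOKENS
slow_choice, slow_tokens = urgency_gating(easy_trial, slope=0.1, baseline=0.0, threshold=0.2)
fast_choice, fast_tokens = urgency_gating(easy_trial, slope=0.2, baseline=0.0, threshold=0.2)
print(slow_choice, slow_tokens, fast_choice, fast_tokens)
```

In this toy setting, a steeper urgency slope (standing in for the Fast block) reaches the threshold after fewer tokens than a shallower one, which is the qualitative speed-accuracy trade-off the model is meant to capture.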
( A ) During each trial of the “tokens task,” 15 tokens jump, one every 200 ms, from the central circle to one of two outer target circles. The subject’s task is to move the cursor (black cross) to the target that will ultimately receive the majority of the tokens. ( B ) Temporal profile of the “success probability” that a given target is correct. Once a target is reached, the remaining token jumps accelerate to one every 150 ms (“Slow” block) or 50 ms (“Fast” block). We subtract from movement onset the mean reaction time (RT), measured in a separate delayed-response task, to estimate commitment time (purple bar) and the success probability at commitment time (dotted red horizontal line). ( C ) Success probability for choosing the target on the right, in trial types defined on the basis of the success probability profile (see Methods ), here computed after aligning to movement onset (vertical black dashed line). The vertical purple bar indicates estimated commitment time. Solid curves: correct target on the right; dashed curves: correct target on the left. ( D) Recording locations. Medial is up. Dashed white lines indicate the estimated location of the globus pallidus. asl, lower limb of the arcuate sulcus; asu, upper limb of the arcuate sulcus; cs, central sulcus; ps, principal sulcus; spcd, superior precentral dimple. To disentangle neural activity related to deliberation, commitment, and movement, we trained monkeys to perform the “tokens task” ( Fig 1 ) (see Methods ). In the task, the subject must guess which of two targets will receive the majority of tokens jumping randomly from a central circle every 200 ms ( Fig 1A ). The subject does not have to wait until all tokens have jumped but can take an early guess, and after a target is reached, the remaining tokens jump more quickly (every 150 ms or every 50 ms in separate “Slow” and “Fast” blocks of trials). 
Thus, subjects are faced with a speed–accuracy trade-off (SAT)—to either wait to be confident about making the correct choice or to take an early guess and save some time, potentially increasing their overall reward rate. If we assume that commitment occurs shortly before movement onset, then we can delimit within each trial a period of deliberation ( Fig 1B ) during which neural activity should correlate with the sensory evidence related to token jumps as well as to subjective policies related to the SAT. Furthermore, because we can precisely quantify the success probability (SP) associated with each choice after every token jump, we can compute for each trial a temporal profile of the sensory evidence and categorize trials into similarity classes ( Fig 1C ), including “easy trials,” “ambiguous trials,” and “misleading trials” (see Methods for details). Here, we test whether neural activity in key cortical and subcortical regions exhibits properties that would be expected from a unified dynamical system for action selection and sensorimotor control. We focus on cells recorded in monkey dorsal premotor (PMd) and primary motor cortex (M1), which are implicated in both selection and control [ 8 , 10 , 32 – 36 ], as well as the dorsolateral prefrontal cortex (dlPFC), which is implicated in representing chosen actions [ 37 ]. In addition, we examine activity in the output nuclei of the basal ganglia, the globus pallidus externus (GPe) and internus (GPi), whose role in selection and/or motor control is under vigorous debate [ 38 – 41 ]. Importantly, we examine the activity of all of these regions recorded in the same animals performing the same reach selection task, making it possible to quantitatively compare activities in different brain areas using the same metrics. However, while the anatomical overlap between the distributed circuits of decision-making and sensorimotor control is well established, theoretical models of these processes remain largely separate. 
Decision-making is often modeled as the accumulation of evidence until a threshold is reached [ 11 – 19 ], at which time a target is chosen. Models of movement control usually begin with that chosen target, toward which the system is guided through feedback and feedforward mechanisms [ 20 – 22 ]. But if the neural circuits involved in action selection and sensorimotor control are truly as unified as neural data suggest, then theories of these processes should be similarly unified. One promising avenue toward an integrated account of selection and control is to consider both as aspects of a single distributed dynamical system, which transitions from a biased competition between actions [ 23 – 28 ] into an “attractor” that specifies the initial conditions for implementing the chosen action through feedback control [ 29 – 31 ]. During natural behavior, we are continuously interacting with a complex and dynamic world [ 1 , 2 ]. That world often does not wait for us to make up our minds about perceptual judgments or optimal choices, and inaction can lead to lost opportunities, or worse. Furthermore, we must often make decisions while we are already engaged in an action, such as while navigating through our environment or playing a sport [ 3 , 4 ]. These considerations suggest that the neural mechanisms involved in selecting and executing actions should be closely integrated within a unified sensorimotor control system [ 5 ]. Indeed, many neural studies have shown considerable overlap between the brain regions involved in action selection and sensorimotor control [ 6 – 10 ]. Could that continuity of properties be an artifact resulting from dimensionality reduction? That is, if distinct categories of cells with different functional roles really did exist, would our analyses be able to identify them? 
To address this question, we created a variety of synthetic populations of neuron-like units and applied to them the same analyses we used to examine real data, including PCA and GMM analyses of the resulting loading matrix. As described in the S1 Text (see Fig F in S1 Text ), this yielded three conclusions: It confirmed that PCA does correctly identify all of the components from which our synthetic neural populations were constructed. However, it also showed that some of the higher-order components (like PC5 in Fig 2 ) can result from PCA “cancelling out” some of the firing patterns already captured by lower-order components (e.g., PC2), in order to explain things like neurons tuned only during movement. Nevertheless, in all cases, the GMM analysis of the loading matrix correctly identified all of the actual categories of units in the synthetic populations. This suggests that the lack of distinct clusters we found in our neural data indeed reflects the absence of separate categories in the real neural populations. In S1 Text , we also discuss other methods that test for uniformity and clustering [ 75 , 76 ]. In summary, our analyses of loading matrices suggest that while different regions do appear to contribute to different subsets of PCs (e.g., the orthogonal relationship of dlPFC and GPi in Fig 8F ), the population within each region does not contain distinct clusters. In other words, while there is some structure in the loading matrix (e.g., Fig 8A and 8B , middle panel), the distributions of properties within each region are continuous. In contrast, populations in dlPFC, GPe, and GPi were well fit with a single gaussian ( Fig 8E and 8F ), though perhaps more structure would have been seen with larger numbers of cells. Importantly, there were significant and potentially functionally relevant differences between these regions. 
In particular, the distribution of loadings from dlPFC and GPi were nearly orthogonal to each other in the space of PCs 1, 2, and 4 ( Fig 8F ). The dlPFC was extended along PC2 and not along the other components, consistent with the proposal that it primarily carries information on the sensory evidence provided by token movements. In contrast, GPi was relatively narrow in PC2 and instead extended along PCs 3 and 4, related to the block and time-dependent aspects of urgency. Furthermore, there was a significant negative correlation (R = −0.32, p < 0.001) between GPi loadings on PC1 versus PC4 ( Fig 8E , right panel, purple). In other words, cells that build up over time (positive on PC4) tend to reduce their activity after commitment (negative on PC1), while cells that decrease over time (negative on PC4) tend to increase after commitment (positive on PC1). The distribution of loading coefficients for cells in PMd ( Fig 8A ) was highly distributed but not without structure, and was best fit with two gaussians. The first (1: 68% contribution to the fit) was dominated by cells only weakly contributing to PCs 1, 2, and 4 but strongly to PC3. The second (2: 32%) included cells that strongly contributed to PCs 1 and 2 but more weakly to PCs 3 and 4. The distribution for M1 was also best fit with two gaussians, one (1: 56%) contributing mostly to PC3 and another (2: 44%) contributing to 1, 2, and 4 but not 3. Thus, in both PMd and M1, there was a trend for cells that most strongly reflect the animal’s SAT (PC3) to be less strongly tuned to direction (PC2) and vice versa ( Fig 8A and 8C , middle panels). ( A ) Each point indicates the weight of the contribution of a given PMd cell to the principal component indicated on the axes. Colored ellipses indicate the centroid and 3 times standard deviation of each of the two gaussians (labeled 1 and 2) that provided the best fit to the distribution of these populations. 
( B ) The same PMd gaussians in the 3D space of PC1, PC2, and PC4. ( C ) Same as A for M1. ( D ) Same as B for M1. ( E ) Same as A for cell populations in dlPFC (green), GPe (cyan), and GPi (purple), each of which was best fit with a single gaussian. Note that unlike A and C, here the right panel shows PC4 vs. PC1. ( F ) Same as B for dlPFC, GPe, and GPi. Data and code available at https://doi.org/10.6084/m9.figshare.20805586 . dlPFC, dorsolateral prefrontal cortex; GPe, globus pallidus externus; GPi, globus pallidus internus; M1, primary motor cortex; PC, principal component; PMd, dorsal premotor cortex. Another approach for inferring the putative functional contributions of different brain regions is to examine the distribution of the loading coefficients for cells in each region. For example, if a given population of cells is strongly related to the sensory evidence provided by token jumps, then cells in that population should tend to have higher magnitude loading onto PC2 than cells from another population that is less sensitive to evidence. As described in Methods , we characterized the distribution of loading coefficients for the first 11 PCs (which capture 95% of variance) by fitting the 11-dimensional space of points for each brain region with gaussian mixture models (GMMs) (for a related approach, see [ 77 ]). The results are shown in Fig 8 . A strong contrast to the cortical data is seen when we examine the neural state of cells recorded in the globus pallidus (GPe and GPi). In both regions, the neural state during deliberation is confined within a subspace that is not a thin manifold but instead resembles a ball compressed along PC2 and extended along PC4 ( Fig 7B and 7C ), which reflects the temporal component of urgency. During deliberation, activity in these regions tends to rise along PC4 but does not evolve in an orderly fashion along PC2, as seen in cortex. 
This does not appear to be caused by the lower number of cells recorded in GPe and GPi as compared to PMd and M1, because similar results hold when we restrict all regions to the same small number of cells (Fig Ca in S1 Text ). Nevertheless, by the time commitment occurs, the state of both GPe and GPi lies in a choice-specific subspace (purple ellipses) and then evolves quickly to a corresponding initiation subspace. These findings are consistent with our previous report of GPe/GPi activity in the tokens task, in which we suggested that these regions do not determine the choice but rather contribute to the process of commitment [ 44 ]. In contrast to PMd and M1, the deliberation manifold in dlPFC ( Fig 7A ) is almost exclusively extended along PC2 (Ψ = 0.564, like a cylinder whose length is 22 times its radius). Like PMd and M1, the dlPFC neural state shifts left and right along PC2 with sensory evidence but shows almost no component of elapsing time ( Fig 5 ), and after commitment, it exhibits only a small excursion into PC1. This suggests that neural activity in dlPFC primarily reflects the sensory evidence used to make decisions in the task, consistent with many previous studies [ 9 , 97 – 103 ]. It is also noteworthy that in both PMd and M1, the state reached at the moment of commitment (purple ellipses in Fig 6A and 6B ) shows an orderly relationship with the reaction time in each trial type (shortest in “early” trials, and longest in “late” trials). This is in agreement with the observation that even at a single trial level, a consistent relationship between neural state and reaction time can be observed during a simple instructed reach task [ 84 ]. 
Furthermore, in M1, the trajectories after that point converge to arrive in a relatively compact subspace (green ellipses) at movement onset, which Churchland and colleagues [ 96 ] called an “optimal subspace.” As shown in Fig 6C , the curved PMd manifold fits well (R² = 0.65) to a small sphere whose radius is approximately half of the manifold range (that is, the manifold wraps around nearly half of the sphere). By contrast, the planar M1 manifold produces a weaker fit (R² = 0.44) to a sphere with a much larger radius. To evaluate the robustness of this difference in manifold shapes, we performed a bootstrap analysis with 1,000 repetitions of spherical fits to the decision manifold (see Methods ) using cells randomly resampled from the original PMd and M1 populations. As shown in Fig 6D , bootstrapped PMd manifolds consistently fit well to a small sphere. While bootstrapped M1 manifolds were sometimes best fitted with a small sphere, these fits were always weak (R² between 0.3 and 0.4). Importantly, no points from either distribution fell within the 99% confidence ellipse of the other distribution (p < 0.001), indicating that the difference in manifold shapes was highly robust. In the Discussion, we propose that these shapes reveal functionally meaningful differences between the neural dynamics of these two cortical regions. Neural trajectories in the space of PCs 1, 2, and 4, plotted using activity during Slow blocks (Group 1 cells), separately for PMd ( A ; see S2 Movie ), and M1 ( B ; see S3 Movie ). Separate trajectories are plotted for easy, ambiguous, and misleading trials, as well as all trials in which decisions were shorter than 1,400 ms (early) and in which decisions were longer than 1,400 ms (late). For clarity, confidence intervals have been omitted. As in Fig 4 , trajectories are computed from data aligned on movement onset and extend from 1,400 ms prior. 
Blue (PMd) and red (M1) wireframes enclose all states before commitment (280 ms before movement). We also superimpose trajectories computed from data aligned on the start of the trial and until 500 ms later (projected through the same loading matrix). Dashed black arrows in panel B, left, indicate where these separate trajectories can be seen in the M1 space. In all panels, the trajectory of misleading trials in which the monkey correctly chose the right target is highlighted with a thicker line. Purple ellipses emphasize the time of commitment and green ellipses emphasize movement onset. ( C ) Spherical and planar fits to the decision manifolds computed from PMd (left) and M1 (right). ( D ) Comparison of spherical fits to bootstrapped data, assessed using the normalized radius and R² value of the best fit sphere. Solid lines indicate the 99% confidence ellipse for each resampled population, and large dots show the fits to the original data. Data and code available at https://doi.org/10.6084/m9.figshare.20805586 . M1, primary motor cortex; PC, principal component; PMd, dorsal premotor cortex. We can use these same region-specific PCs to construct region-specific neural space trajectories. Fig 6A shows this for the dorsal premotor cortex, where we see a structure that is quite similar to what was shown in Fig 4A for all neurons (not surprisingly since the PMd population is the largest). As before, we see a triangular decision manifold that is relatively thin (Ψ = 0.343) and extends along PC2 and PC4. However, note that it initially strongly leans in the negative PC1 direction and then curves around just before commitment (see the side view shown in Fig 6A , right). Interestingly, when plotted in the space of PCs 1, 2, and 4, the PMd decision manifold is curved as if it lies on the surface of a sphere ( Fig 6C , spherical fit R² = 0.65). 
In contrast, the decision manifold of M1 cortex ( Fig 6B ) is remarkably flat and thin (Ψ = 0.252) and leans into the positive PC1 direction. Nevertheless, the evolution of the neural state along the surface of the decision manifold in both regions obeys the same pattern shown in Fig 4 , proceeding from bottom to top as time elapses and shifting left and right with the sensory evidence, always lying within the same subspace, curved for PMd and planar for M1. Importantly, note that while PMd and M1 both strongly reflect the first four components described above, the other regions do not. In particular, dlPFC shows a clear relationship to evidence in PC2 and a slight block dependence in PC3, but it shows virtually no effect of elapsing time or commitment (both PC1 and PC4 are relatively flat). By contrast, in both GPe and GPi, PC1 shows a clear response at commitment, but PC2 shows no relationship to evidence until the decision is made. These findings suggest distinct functional roles for these regions. While the structure of the neural space computed across all neurons is interesting, it is still more informative to compare that structure across the different brain regions in which we recorded. Because the loading matrix produced by PCA provides coefficients that map each individual neuron’s contribution to each PC, we can calculate the weighted average of each region separately, as shown in Fig 5 . Fig 4C compares the decision manifolds between the two blocks, this time using only the cells that possess trials in all conditions (Group 3; see Methods ). Note that in the space of PCs 1, 2, and 4 the shape of the manifold is quite similar, with just a slight upward shift of the Fast relative to the Slow block. Fig 4D plots the same data in the space of PCs 2, 3, and 4. This effectively flattens the trajectories shown in panel C into the PC2-PC4 plane, which captures how the neural state changes as a function of evidence and elapsing time within each block. 
The difference between the blocks is captured by the orthogonal shift along PC3. Similar shifts in neural space have been reported to capture learning in motor cortex [ 95 ]. The flow of neural states upon the decision manifold is quite orderly, proceeding from bottom to top as time elapses and shifting left and right with the sensory evidence. For example, consider the misleading trials, in red, which clearly reveal the switch in sensory evidence. In effect, the flow of the neural state during deliberation resembles the temporal profile of evidence ( Fig 1C ) mapped onto that curved wireframe surface. The neural state continues to flow along the decision manifold until it reaches one of two edges (purple ellipses) at the time of commitment and then turns into PC1 and accelerates to rapidly flow along one of two paths, each corresponding to the choice taken, until movement initiation (green ellipses). In Fig 4A and 4B , we have drawn a gray wireframe around all of the points from the beginning of the trial until commitment time (280 ms before movement onset), across all trial types, thus defining the subspace within which deliberation occurs. This subspace resembles a triangular surface that is extended mostly in PC2 and PC4 and curved slightly into PC1. It is quite thin—for example, in the slow block, the value of Ψ (see Methods ) is 0.235, which is roughly equivalent to a triangular sheet whose thickness is 1/45th of the length of each side. We call this the “decision manifold.” Each panel shows data aligned on movement and computed from 1,400 ms before to 400 ms after, as well as data aligned on the first token jump and computed from 200 ms before to 500 ms after (using the same loading matrix). ( A ) Data from slow blocks (Group 1 cells). Shaded colored regions around each trajectory indicate the 95% confidence interval computed using bootstrapping. Dotted arrows indicate how the state of activity evolves over time. 
The gray wireframe encloses all states visited during the deliberation epoch (the “decision manifold”), across all trial types (Ψ = 0.235). Purple ellipses indicate the region in which commitment occurs (indicated for individual trial types by small colored circles), and green ellipses indicate the point at which movement is initiated. ( B ) Data from fast blocks (Group 2 cells), same format. Ψ = 0.403. ( C ) Comparison of the neural trajectories between the blocks, using cells that have data in all conditions (Group 3 cells). The blue wireframe shows the decision manifold for Slow blocks, and the red shows it for Fast blocks. To avoid clutter, we do not show the confidence intervals. ( D ) Same data as in (C) but projected into the space of PCs 2, 3, and 4. This facilitates comparison of the manifold and trajectories in the PC2-PC4 plane, as well as the contextual shift (blue double arrow) between the two blocks captured by PC3. Data and code available at https://doi.org/10.6084/m9.figshare.20805586 . PC, principal component. Fig 4A and 4B shows the trajectories of the different trial types, separately for the Slow and Fast blocks, plotted in the space of PCs 1, 2, and 4 (PC3 was not used because it is approximately constant in time, as shown in Fig 2 ). See S1 Movie for an animated version of this figure. As indicated in Fig 4A (dotted black arrows), in both block types, the neural activity evolves in a clockwise manner in the space of PC1 and PC4, passing over a region of deliberation until reaching a commitment state (purple ellipses), whereupon it rapidly moves to a movement-specific initiation subspace (green ellipses) and then turns back toward the starting point during movement execution. Some of these phenomena have previously been reported using neural space analyses of preparatory and movement-related activity in cortical regions during instructed reaching tasks with a single target [ 84 – 87 , 92 – 94 ]. 
Here, our task allows us to examine in more detail what happens during the process of prolonged deliberation, while subjects are selecting among multiple targets. Could this finding be a trivial consequence of overall cell tuning? To test this possibility, we used the Tensor Maximum Entropy (TME) method of Elsayed and Cunningham [ 91 ] to generate synthetic data sets that retain primary features such as tuning, but are otherwise random (see Methods ), and processed them in the same way as the real data, including cell duplication. As shown in Fig 3C , when PCA is applied to such synthetic data sets, it does not produce components that are as well correlated with evidence as PC2 from our true data (p < 0.01). The implication is that the emergence of PC2 in the real neural data requires a consistent relationship between how cells reflect the final choice (left versus right) and how they reflect the evidence that leads to that choice during deliberation (easy versus ambiguous versus misleading, etc.). ( A) Dashed lines show the sensory evidence in easy (blue), ambiguous (green), and misleading trials (red) calculated as the difference between success probability for the right target and 0.5. Shaded ribbons show the mean and 95% confidence interval of PC2 in the same trials (Slow block, Group 1 cells). The first vertical dotted line indicates commitment and the second indicates movement onset, on which all data are aligned. The evidence trace is delayed by 300 ms, which provides the best fit. Note that until the moment of commitment, the pattern of PC2 closely resembles the evidence, except for diverging toward one of the choices even in the absence of evidence during ambiguous trials. ( B) The same data, prior to commitment, plotted as evidence versus PC2 in these six trial conditions. The correlation coefficient is R = 0.9234 and p-value is well below 0.001. 
( C ) The distribution (N = 100) of correlation coefficients obtained by performing the same analysis on surrogate data sets generated using the Tensor Maximum Entropy approach (see Methods ). Here, each surrogate data set is represented by the highest correlation coefficient of any of the top 10 PCs against the profile of evidence. The mean R is 0.5294 (SD = 0.1615). For comparison, the red line shows the R value from the real data (panel B), and it is higher than all of the R values from surrogate data (p < 0.01). Data and code available at https://doi.org/10.6084/m9.figshare.20805586 . PC2 warrants special attention. Note that until the moment of commitment, it correlates very well with the evidence provided by the token movements. Indeed, as shown in Fig 3A and 3B , the correlation between the time-delayed evidence and the value of PC2 is highly significant (p < 10⁻¹⁰⁰) with a correlation coefficient of R = 0.92. This is notable because the PCA algorithm was not given any information about these different trial types (easy, ambiguous, misleading, etc.) but was simply given data averaged across four large trial groups that only distinguished left versus right movements and slow versus fast blocks. Nevertheless, when the resulting temporal profiles of PC2 are calculated for specific trial types they clearly reflect how the evidence dynamically changes over the course of deliberation in those trials. The first 4 components, which together explain 80.3% of total variance, are clearly interpretable in terms of the key elements of the UGM. The first PC (31.6% of variance) is nearly identical across all conditions and reflects the transition between deliberation (prior to commitment) and action (after movement onset). It is similar to the main condition-independent component reported in other neural space studies in both primates [ 84 – 88 ] and rodents [ 76 , 89 , 90 ]. The second PC (25.5%) exhibits two phases. 
Prior to commitment, it reflects the time course of the sensory evidence on which the monkey made his choice, distinguishing easy, ambiguous, and misleading trials. After commitment, it simply reflects the choice made, without distinguishing trial types. The third PC (13.4%) reflects the block-dependent aspect of urgency, distinguishing between the slow and fast blocks (even before the start of the trial). The fourth PC (9.7%) reflects the time-dependent aspect of urgency until just before movement onset. The remaining components are similar to PCs 2 and 4 during deliberation but capture some of the heterogeneity across cell activity patterns after commitment and movement onset. We discuss these higher PCs in the S1 Text.
The cumulative variance explained by the first 20 components is shown at the bottom right, and the temporal profiles of the top 7 PCs are shown in the rest of the figure. Each of those 7 panels shows the average activity of all cells, weighted by their loading coefficient onto the given PC, for 12 trial groups, including combinations of 3 trial types: easy (blue), ambiguous (green), and misleading (red); two blocks: Slow (solid) and Fast (dashed); for choices made to the left or right (indicated for the components where they differ). Note that the sign is arbitrary because the loading matrix can have positive or negative values. Shaded regions indicate 95% confidence intervals computed using bootstrapping. In each panel, data are shown aligned to the start of token jumps as well as to movement onset. Commitment time is estimated to be 280 ms before movement onset, based on our prior studies [36,44]. Each panel is scaled to have the same range in the y-axis. PCs are built from the 402 cells that have trials in all conditions. Data and code available at https://doi.org/10.6084/m9.figshare.20805586.
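The comparison between PC2 and the time-delayed evidence, and the check against surrogate data sets described for Fig 3, can be illustrated with a short sketch. This is not the authors' code (their data and code are at the figshare link above). The helper names `lagged_correlation` and `empirical_p_value`, the 10 ms bin width, the toy sinusoidal evidence trace, and the uniformly drawn surrogate R values are all assumptions made for illustration. The "+1" form of the empirical p-value is one common convention and is not stated in the text.

```python
import numpy as np

def lagged_correlation(evidence, pc_trace, lag_bins):
    """Pearson R between the evidence delayed by `lag_bins` and a PC trace."""
    if lag_bins > 0:
        # Pair evidence at time t with the PC value at time t + lag.
        evidence, pc_trace = evidence[:-lag_bins], pc_trace[lag_bins:]
    return float(np.corrcoef(evidence, pc_trace)[0, 1])

def empirical_p_value(real_r, surrogate_rs):
    """One-sided empirical p-value with the common +1 correction (assumed convention)."""
    surrogate_rs = np.asarray(surrogate_rs, dtype=float)
    n_at_least = int(np.sum(surrogate_rs >= real_r))
    return (n_at_least + 1) / (surrogate_rs.size + 1)

# Toy stand-ins for the real traces (assumptions, not recorded data).
rng = np.random.default_rng(0)
bin_ms = 10                      # assumed bin width
lag_bins = 300 // bin_ms         # the 300 ms delay mentioned in the caption
t = np.arange(0, 2000, bin_ms)
evidence = np.sin(t / 300.0)     # placeholder evidence profile
pc2 = np.roll(evidence, lag_bins) + 0.1 * rng.standard_normal(t.size)

real_r = lagged_correlation(evidence, pc2, lag_bins)

# Placeholder surrogate statistics: in the paper each surrogate data set
# contributes the best R among its top 10 PCs; here they are random draws.
surrogate_rs = rng.uniform(0.2, 0.8, size=100)
p = empirical_p_value(real_r, surrogate_rs)
print(f"R = {real_r:.3f}, empirical p = {p:.4f}")
```

With 100 surrogates and none reaching the real R, this convention gives p = 1/101, which is consistent with the reported p < 0.01.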
The first 20 PCs together explained 97.9% of the variance in activity over time across the four groups of trials (Slow-Left, Slow-Right, Fast-Left, and Fast-Right) used for the PCA. Fig 2 shows the temporal profile for the first 7 PCs, separately for easy, ambiguous, and misleading trials in both slow and fast blocks, for both left and right choices, each constructed as a weighted average of the 402 cells that contained trials in all of these conditions (Group 3; see Methods). Here, we show analyses performed using PCA on data aligned on movement onset (because we are interested in the dynamics of commitment) performed on all cells together (because we seek quantitative comparisons of the same PCs across regions). Note that because we recorded many more neurons in PMd and M1 than in the other regions, the variance explained was dominated by the PMd/M1 dynamics. However, reducing the number of cells to an equal number in each region did not change the results apart from making them noisier and changing the percentage of variance explained by individual PCs (Fig Ca in S1 Text). Furthermore, we provided the PCA algorithm with trials grouped into only four large groups (right and left choices in the fast and slow blocks) and let further analyses reveal whether more specific subtypes (easy, ambiguous, etc.) can nevertheless be extracted from the results. Finally, we imposed symmetry on our population by following the “anti-neuron” assumption, which assumes that for every neuron we recorded, there exists a similar neuron with the opposite relationship to target direction (even for cells that are not tuned), effectively doubling the number of neurons. See the Methods section for the justification and motivation for this approach, and Fig E in S1 Text for analyses showing that this procedure does not alter our conclusions.
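The pooled-PCA procedure described above (four trial groups, anti-neuron duplication, one loading matrix shared across regions) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the toy Poisson firing rates, the condition order (Slow-Left, Slow-Right, Fast-Left, Fast-Right), and the per-cell mean centering are all hypothetical choices, and the square-root transform mentioned elsewhere in the paper is omitted.

```python
import numpy as np

def add_anti_neurons(rates, swap=(1, 0, 3, 2)):
    """Append a mirrored copy of every cell with left/right conditions swapped.

    rates has shape (cells, conditions, time); the assumed condition order is
    Slow-Left, Slow-Right, Fast-Left, Fast-Right.
    """
    return np.concatenate([rates, rates[:, list(swap), :]], axis=0)

def pooled_pca(rates, n_components):
    """PCA over all cells at once, returning centered data, loadings, variance."""
    n_cells = rates.shape[0]
    data = rates.reshape(n_cells, -1)                # cells x (conditions * time)
    data = data - data.mean(axis=1, keepdims=True)   # center each cell (assumption)
    u, s, _ = np.linalg.svd(data, full_matrices=False)
    loadings = u[:, :n_components]                   # one coefficient per cell per PC
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return data, loadings, explained

def project(data, loadings, mask):
    """Temporal profile of each PC using only the cells selected by mask."""
    return loadings[mask].T @ data[mask]

# Toy population (assumption): 24 cells, 4 trial groups, 40 time bins.
rng = np.random.default_rng(1)
regions = np.array(["PMd"] * 12 + ["M1"] * 8 + ["GPi"] * 4)
rates = rng.poisson(5.0, size=(regions.size, 4, 40)).astype(float)

rates_sym = add_anti_neurons(rates)
regions_sym = np.concatenate([regions, regions])
data, loadings, explained = pooled_pca(rates_sym, n_components=4)

full = loadings.T @ data
by_region = {r: project(data, loadings, regions_sym == r) for r in np.unique(regions)}
print("variance explained by first 4 PCs:", np.round(explained, 3))
```

Because every region is projected through the same loading matrix, the per-region profiles live in the same PC space and sum to the whole-population profile, which is what makes quantitative comparison across regions possible in this kind of analysis.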
To examine how the activity of this entire population evolves over time during the task, we performed principal components analysis (PCA) on cells recorded in all five brain regions, including any well-isolated neuron that was recorded in both Slow and Fast blocks (see Fig A in S1 Text for a schematic describing our approach). This yielded a total of 637 neurons: 291 in PMd, 195 in M1, 53 in dlPFC, 45 in GPe, and 53 in GPi. In principle, there are many ways one can perform PCA. For example, one approach is to perform PCA separately on neurons from each region, analyzing the dynamics of each independently. Alternatively, one can perform PCA on all cells together, producing a “loading matrix” of coefficients (from each neuron to each PC) that allows one to “project” each population into the space of the same PCs. There are also many ways to align the data (on the start of the trial versus movement onset), different time periods one can include (only before movement, only after, or both), and different ways of grouping trials (into large averaged groups versus specific subtypes). Importantly, as shown in Figs B–E in S1 Text, our results were remarkably robust regardless of these choices and held up when applied to data from each monkey separately, consistently leading to the same conclusions. Using factor analysis and Gaussian process factor analysis (GPFA) [67] produced very similar results, so here we focus on PCA because it is the simplest and most readily interpretable.
We recorded spiking activity from a total of 736 well-isolated individual neurons in the cerebral cortex and basal ganglia of two monkeys (S and Z), recorded at the locations shown in Fig 1D. Of these, 356 were recorded in PMd (237 from monkey S), 211 in M1 (79 from monkey S), 62 in dlPFC (60 from monkey S), 51 in GPe (19 from monkey S), and 56 in GPi (22 from monkey S).
The properties of some of these neurons have been reported in previous publications, focusing on tuned activity in PMd, M1, and the basal ganglia [36,43,44,83]. Here, we additionally include neurons recorded in dlPFC, which, for technical reasons, were almost exclusively obtained in monkey S and have to date been described only in abstract form [79,80].
Discussion
Many neurophysiological studies have suggested that decisions between actions unfold within the same sensorimotor regions responsible for the control of those actions [1,104–106]. This includes FEF and LIP for gaze choices [7,107–110] and PMd and MIP for reaching choices [8,36,110–114]. Several computational models of the decision mechanism suggest that it behaves like a “recurrent attractor system” [23–28], where groups of neurons tuned for the available choices reciprocally compete until one group wins and the system falls into an attractor corresponding to a specific choice. The results reported here provide strong support for this class of models. In particular, the cells we recorded in PMd and M1 do not appear to belong to separate categories related to decision-making versus movement preparation or execution but instead behave like part of an integrated dynamical system that implements a biased competition and transitions to commit to a choice through a winner-take-all process. During deliberation, the pattern of cell activity in these regions is confined to a highly constrained subspace in the shape of a thin manifold and is shifted around within that manifold by the decision variables pertinent to the task (here, the sensory evidence and the rising urgency). When commitment occurs, the same group of cells transitions from the decision manifold to a roughly orthogonal tube-shaped subspace corresponding to a specific choice [76] and then quickly flows to a subspace related to movement initiation [85,96].
This is precisely the kind of “winner-take-all” phase transition that occurs in recurrent attractor models. Importantly, the transition to commitment is not equivalent to the crossing of a threshold that is definable as the critical firing rate of a given neuron type, but instead resembles falling off the edge of the decision manifold into an orthogonal subspace that loses sensitivity to evidence. As can be seen in Fig 6A and 6B, that edge is oriented approximately diagonally with respect to PC2 and PC4, implying that straightforward summation of the activity of cells tuned to the chosen target may not be constant across trial types, as indeed often observed [36,115]. In other words, commitment is more nuanced than a threshold crossing; it is a state transition in a dynamical system [23]. Additional insights can be obtained by examining the low-dimensional components produced by dimensionality reduction. In particular, it is noteworthy that during deliberation, the four strongest components of neural activity (which together account for just over 80% of the variance) capture the key elements of the urgency-gating model [14]: the momentary evidence (in PC2), the urgency signal (a context-dependent baseline in PC3 and time-dependent ramping in PC4), and the transition to commitment (PC1). Perhaps even more important is how these components are differently expressed in the different cell populations. In particular, the neural state in dlPFC varies almost entirely along PC2 while, conversely, the state in GPe/GPi during deliberation is primarily determined by PC3 and PC4 and not at all by PC2. This suggests that information about sensory evidence is provided by prefrontal cortex [9,97–103] while the urgency signal is coming from the basal ganglia [44,116,117], and the two are combined in PMd and M1 to bias a competition between action choices [8,34,36,113,114,118]. 
The presence of the evidence-related component PC2 in the cortical data is particularly remarkable because the dimensionality reduction algorithm was not provided with any information about the variety of trial types (easy, ambiguous, etc.) that distinguish properties of deliberation but was merely given data averaged across four very large groups of trials: left or right choices during slow or fast blocks. Nevertheless, the difference of activity related to left versus right choices led the algorithm to assign a choice-related component that also happens to capture the evolving evidence for that choice. A neural population control method [91] verified that this is not a simple consequence of tuning. Additional analyses show that a unified evidence/choice component is obtained even if the PCA algorithm is only given data restricted to activity after movement onset (Fig Cb in S1 Text). This suggests consistency between cell properties before versus after movement onset [54], once again arguing against a categorical distinction between movement selection versus execution circuits and in favor of an integrated dynamical system, at least among the cells we recorded. Admittedly, the movements performed by our monkeys were quite simple and stereotyped, lacking the variation that would allow us to distinguish within our own data the neural components related to hand movement direction, joint angular velocities, muscle torques, or other execution-related variables. However, decades of previous neurophysiological recording studies have already established that neurons in PMd and M1 strongly reflect such variables [53,119–131] and are directly involved in the online guidance of reaching actions [132–135]. On the basis of such work, we therefore interpret the PMd/M1 activity we observed after movement onset as being related to the control of execution and not simply reflecting the target or goal. The emergence of the other components is also highly robust. 
If the PCA algorithm is only given data from the slow block, PC3 is lost (unsurprisingly), but the other components remain (Fig Cd in S1 Text). Also not surprisingly, if we provide PCA with data from all 28 of our trial classes, the components are even more clearly distinguished, even though the number of cells that possess all of the required trials is reduced by 37% (Fig Ce in S1 Text). In fact, any reasonable subset of data we have tried (e.g., from each animal separately) leads the PCA algorithm to identify the same functionally relevant components, albeit sometimes in a slightly different order depending on how much variance is captured by each. Similar features are also obtained if we leave out the cell duplication step (see Methods), although the resulting decision manifolds become asymmetrically distorted and some of the structure of loading matrices is more difficult to see (Fig E in S1 Text). Finally, it is noteworthy that our findings reliably reproduce many previous observations from very different tasks, including ones in which monkeys did not have to make any decision at all and were simply instructed to reach out to a single target. This includes the general flow of the neural state from target presentation to movement onset and offset [85] (Fig 4), the orderly relationship between reaction times and the neural state in PMd/M1 [84] (Fig 6A and 6B, purple ellipses), the compact subspace in M1 at movement onset [96] (Fig 6A and 6B, green ellipses), and the presence of condition-independent components related to state transitions and elapsing time [87] (Fig 2, PC1, PC4). Although most of our data come from a prolonged period of deliberation that is not present in those other studies, the phenomena related to preparation and execution, which are shared between paradigms, are nevertheless robustly reproduced. This further strengthens the proposal that action selection and sensorimotor control are two modes of a single unified dynamical system.
Indeed, analyses of the loading matrix suggest that among the cells we recorded in PMd and M1, there is no categorical distinction between those involved in selection and those responsible for movement. Further analyses of synthetic populations (e.g., Fig F in S1 Text) demonstrate that some of the higher order PCs we observed (PC5 and PC6; see Fig 2) may result from the heterogeneity of properties across our cell population, which includes purely decision-related and purely movement-related cells. However, analysis of the loading matrix concluded that these properties are not clustered into distinct categories as in synthetic data (Fig Fd in S1 Text) but are instead distributed along a continuum (Fig 8). While some of these findings could have been anticipated from analyses of individual cells, one important observation that would not have been possible concerns the shape of the decision subspace in the cortical populations. In particular, it is highly consistent across all of the different trial types and always resembles a thin manifold. This suggests strong normalization dynamics that are an inherent feature of recurrent attractor models. That is, the state of neural activity is pushed to lie on a surface that conserves some quantity (e.g., total neural activity with respect to some baseline) but is then free to move upon that surface under the influence of evidence, urgency, or simply noise. Furthermore, the particular shape of the manifold reveals consistent differences in the dynamics of different neural populations. In particular, regardless of what data we provide to PCA, the M1 manifold is always almost perfectly flat while the PMd manifold always exhibits a characteristic curvature. Interestingly, a very similar curvature was observed in preliminary analyses of PMd data in a very different decision task [136]. In contrast, the decision subspaces in the globus pallidus are much more compact and nothing like the thin manifolds in cortex. 
Recent work suggests that the shape of the manifold on which neural states lie can reveal important features of the underlying dynamics of the system [70–72,137–142]. We believe the same holds for our data, particularly regarding the difference between the shapes of the decision manifolds in PMd versus M1. Notably, like many of our findings, this difference in manifold shapes is remarkably robust. It is observed whether or not we impose symmetry through cell duplication (Fig E in S1 Text, and S4 Movie), whether or not we use a square-root transform (Fig Cc in S1 Text), and whether we use data from each monkey separately (Fig D in S1 Text). It is significant at p < 0.001 according to a bootstrap test that randomly samples cells from each population (Fig 6D). While the consistency of this difference in manifold shapes suggests that it reveals something important about the distinct neural dynamics of PMd versus M1, identifying the functional meaning of this difference is challenging. Many plausible possibilities exist, and here we take inspiration from previous work on computational models of recurrent neural circuits [23,26,28,143] to discuss interpretations that address two aspects of these shapes: curvature in the PC1-PC2 plane and curvature in the PC1-PC4 plane (Fig 6A and 6B). We suggest that curvature in the PC1-PC2 plane reveals nonlinear interactions between competing cells in a given cortical region. As described in S1 Text, in a simple 2-neuron attractor model, the flow field of activity during deliberation is strongly constrained to lie within a narrow manifold whose shape is influenced by the function that governs mutual inhibition between competing cells (Fig G in S1 Text). If that function is flat for low inputs and then steeply rises for high inputs (approximating winner-take-all dynamics), then the manifold is flat as we observed in M1. 
In contrast, if the function resembles a shallow sigmoid, then the manifold is curved as we observed in PMd, and as others have reported in parietal cortex [137]. This could be related to the proposal, made in earlier modeling work [25], that PMd implements a gradual competition between groups of cells related to different movements, while M1 behaves more like a winner-take-all system. In addition, we observed that the PMd manifold bends in the PC1-PC4 plane, shifting the direction of flow along the component associated with the transition from deciding to acting. Importantly, the timing of the shift foreshadows the moment of commitment, preceding it by approximately 200 ms. We propose that these features could be the signature of a gradually emerging positive feedback in the cortico-striatal-thalamo-cortical circuit, which progressively overcomes inhibitory signals preventing premature selection [144]. This hypothesis would predict a correlation between how directional selectivity begins to emerge in GPi and how the PMd state begins to flow toward commitment. Indeed, as shown in Fig H in S1 Text, such a correlation is very strong in our data. In particular, the speed at which the PMd population moves toward the commitment subspace (derivative of PC1 computed from PMd) is significantly and positively correlated with the emerging directional selectivity in GPi (absolute value of PC2 computed from GPi), consistent with positive feedback between these regions [143,145,146]. Preliminary modeling work [145] suggests that such positive feedback emerges when a critical contrast is reached between opposing groups of cells in PMd. This predicts that microstimulation in cortical regions will disrupt that critical contrast and delay movement onset if delivered just before (but not after) the moment of commitment, as recently confirmed [147].
However, establishing the causal relationships in the full recurrent circuit will require many additional future studies, including simultaneous microstimulation in one region and recording in the other. In conclusion, our analyses support the hypothesis that decisions between actions emerge as a competition within the sensorimotor system [8–10,25,107,109,110], which is governed by recurrent attractor dynamics [23–28], as supported by optogenetic stimulation studies in mice [148,149]. According to this model [145], the competition for reach selection occurs between cortical cell groups associated with different candidate reaching actions within arm-related regions of PMd and M1 and is biased by decision-related information coming at least in part from the prefrontal cortex [9]. As time passes, that competition is invigorated by a context-dependent urgency signal from the basal ganglia [44,116,117], which gradually amplifies the competitive dynamics in PMd/M1. As the contrast develops between the activity of cells voting for the different actions, a congruent bias is gradually induced in the striatum and pallidum, leading to a positive feedback that results in a winner-take-all process [44,143,146], which constitutes the commitment to an action choice. That process then brings the cortical system to a state suitable for initiating the selected action [96,150], setting into motion the “first cog” of a dynamical machine that controls our actions in the world [29–31].
[1] Url: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001861 Published and (C) by PLOS One. Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.