LSD-induced increase of Ising temperature and algorithmic complexity of brain dynamics [1]
Giulio Ruffini (Neuroelectrics Barcelona, Barcelona; Starlab Barcelona; Haskins Laboratories, New Haven, Connecticut, United States of America), Giada Damiani, Diego Lozano-Soldevilla, et al. Date: 2023-03

The study was approved by the National Research Ethics Service committee London-West London and was conducted in accordance with the revised Declaration of Helsinki (2000), the International Committee on Harmonization Good Clinical Practice guidelines, and the National Health Service Research Governance Framework. Imperial College London sponsored the research, which was conducted under a Home Office license for research with Schedule 1 drugs. Written informed consent was given, and data were de-personalized (anonymized) prior to analysis.

Our dataset consists of BOLD time series obtained under LSD effects and in a placebo condition from fifteen participants (N = 15). Details of the data acquisition process can be found in [30, 52]; we provide only a summary here. A group of 15 participants (4 women; mean age, 30.5 ± 8.0) were recruited by word of mouth and provided written consent to participate in the experiment after a briefing and a screening for physical and mental health. The screening included an electrocardiogram (ECG), urine and blood tests, a psychiatric assessment, and a drug history. Participants under 21 years of age, with psychiatric disease, a family history of psychotic disorders, experience with psychedelics, pregnancy, problems with alcohol (>40 units per week), or any other medically significant condition that could affect the study were excluded. Participants attended the laboratory twice (once per condition), with a separation of 2 weeks; the order of the conditions was alternated across participants, who were not told which condition they received in each session. After the screening tests, a medical doctor inserted a cannula into a vein in the antecubital fossa. The dose consisted of 75 μg of LSD delivered in a 10 ml solution infused over two minutes, followed by an infusion of saline. Similarly, the placebo (10 ml saline) was injected intravenously over two minutes. Participants were then asked to lie still and relax with their eyes closed inside the MRI scanner. Participants reported subjective drug effects 5 to 15 minutes after administration; peak effects were reported 60–90 minutes after the dose, and effects remained generally constant for about four hours thereafter. Scanning began approximately 70 minutes after the dose and lasted a full hour. BOLD acquisition comprised three resting-state scans of seven minutes each; during the middle scan, participants listened to two excerpts of music by the ambient artist Robert Rich. We exclude the music scan from the present analysis. The dataset therefore includes four seven-minute scans per participant (two under LSD, two under placebo), for a total of sixty scans across the fifteen subjects. The data from the two scans in each condition are combined in the following analysis. The fMRI BOLD data were preprocessed using FSL tools (FMRIB's Software Library, www.fmrib.ox.ac.uk/fsl) with standard parameters, without discarding any Independent Component Analysis (ICA) components.
This preprocessing included corrections for head motion during the scans, which was within the normal, acceptable range for fMRI sessions. FSL was also used to generate parcellated (region-averaged) BOLD signals for each participant and condition, i.e., the signal was averaged over all voxels within each region defined in the automated anatomical labeling (AAL) atlas. Only the 90 cortical and subcortical non-cerebellar brain regions of interest (ROIs or parcels) of the AAL atlas were considered [53] (see Table A in S1 Text). For each subject and condition (LSD or placebo), we obtained a BOLD time series matrix of dimensions 90 × 217 (AAL regions × time points, with a sampling cadence of 2 s) from each session. The preprocessing pipeline of the BOLD data and the generation of the BOLD time series are described in more detail in [52]. Fig J in S1 Text provides plots of the FC for each condition, and Fig L in S1 Text provides the FC per subject.

To build an Ising model from the data or carry out complexity analysis, we transformed each parcel's BOLD fMRI time series into a binary format. This was done using a threshold: each value of the region-averaged BOLD time series was assigned a value of +1 if it was greater than the threshold and −1 otherwise. The threshold was set to the median of the time series in the parcel of interest. Hence the median of the thresholded time series of each parcel was zero, and the entropy of each binarized time series was maximal. The binarization was carried out independently for each condition, session, subject, and parcel. The binarized data from the two scans in each session were concatenated.

The Block Decomposition Method (BDM) [69] combines the action of two universally used complexity measures at different scales by dividing the data into smaller pieces for which the halting problem involved can be partially circumvented, in exchange for a large calculation based on the concept of algorithmic probability, extending the power of the so-called Coding Theorem Method (CTM). This calculation, however, has been precomputed and hence can be reused in future applications, exchanging time for memory via a precomputed look-up table for small pieces. BDM accounts for both statistical and algorithmic regularities and is sensitive enough to capture small changes in complexity while remaining invariant to different descriptions of the same object. It therefore complements the use of lossless compression algorithms to calculate an upper bound of algorithmic complexity. We carry out the same analysis as for ρ0 (defined below), i.e., on both synthetic and experimental data.

To calculate a global measure of LZ complexity using the experimental data, we first concatenated the data from all subjects, conditions, and sessions and flattened the result along the spatial dimension first and then along time. Second, we compressed the flat binarized data with the LZ algorithm and saved the output "archetype dictionary", which was then used as the initial dictionary to compress the data from all subjects in the two conditions, LSD and placebo (comparison by condition), and also per subject (comparison by condition and subject). The archetype dictionary was built by concatenating the LSD data for all subjects and sessions first and then the placebo data, since a choice of ordering must be made when flattening the array. We checked whether concatenating the placebo data before the LSD data changed the results and found the difference to be negligible.
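To make the thresholding step concrete, the following minimal Python sketch (the array and function names are illustrative, not taken from the authors' pipeline) binarizes a parcellated BOLD matrix at each parcel's median, as described above:

```python
import numpy as np

def binarize_bold(bold):
    """Median-threshold binarization of parcellated BOLD data.

    bold: array of shape (n_parcels, n_timepoints), e.g. 90 x 217 per scan.
    Each parcel's time series is thresholded at its own median: values above
    the median map to +1, the rest to -1, so each binarized series is roughly
    balanced and of near-maximal entropy.
    """
    thresholds = np.median(bold, axis=1, keepdims=True)
    return np.where(bold > thresholds, 1, -1)
```

The binarized scans would then be concatenated along the time axis before compression or model fitting, as described above.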
Lempel-Ziv (LZ) refers to a class of adaptive dictionary compression algorithms that parse a string into words and use increasingly long reappearing words to construct a dictionary. Compression is achieved by pointing to the identifiers of words in the dictionary [54–56]. LZ algorithms are universally optimal: their asymptotic compression rate approaches the entropy rate of the source for any stationary ergodic source [56], i.e., ℓ(n)/n → H as n → ∞, where ℓ(n) is the length of the compressed string, n is the length of the original string, and H is the entropy rate of the source.

LZ has been used across many studies to characterize brain state, from neurodegeneration [57–60] and anesthesia [61] to disorders of consciousness [62, 63] and fetal consciousness [64]. The entropic brain hypothesis [43] proposes that, within a critical zone, the entropy of spontaneous brain activity is associated with various dimensions of subjective experience and that psychedelics acutely increase both [39]. More recently, it has been proposed that algorithmic versions of entropy, which can be estimated by LZ and other methods such as BDM (discussed above), provide a more direct link with the subjective structure of experience [40, 50].

For details and the code used to compute LZ complexity (we use the Lempel-Ziv-Welch, LZW, variant), see [55, 68]. Two associated metrics derived from LZ compression are commonly used: c(n) and ℓ(n); of the two, the latter is more closely related to Kolmogorov complexity or description length. After applying the algorithm to a string of length n, we obtain a set of words in a dictionary of size c(n), which can itself be taken as a first complexity metric [65]. The description length of the sequence encoded by LZ can be approximated by the number of words seen times the number of bits needed to identify a word, so in general ℓ(n) ≈ c(n) log2 c(n) bits. A natural way to normalize this metric is to divide the description length by the original string length n, ρ0 = ℓ(n)/n, with units of bits per character.

We note that LZ provides an upper bound on algorithmic complexity but, given its reduced programming repertoire (LZ is the Kolmogorov complexity computed with a limited set of programs that only allow copy and insertion in strings [66]), it fails, in general, to effectively compress random-looking data generated by simple but highly recursive programs, e.g., an image of the Mandelbrot set (deep programs [40, 67]). As a simple example of these limitations, consider a string concatenated with its bit-flipped, "time-reversed", or dilated version: such simple algorithmic manipulations will not be detected and exploited by LZ. Despite these limitations, LZ can be useful for studying the complexity of data in an entropic sense.

To use LZW, we first transform the numeric data into a list of symbols; here we binarize the data, which also reduces the length of the strings needed for the algorithm to stabilize. A reasonable strategy in the study of algorithmic information content is to preserve as much information as possible in the resulting transformed string, so methods that maximize the entropy of the resulting series are recommended; as discussed above, we use the median of the data as the threshold. We analyze synthetic lattice data generated by the Ising model by computing ρ0 for each lattice realization, which allows us to plot the mean and standard deviation of the measure as a function of temperature for data generated with the Metropolis algorithm; for the experimental data, we also compute a global measure of ρ0 by concatenating the series.
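As an illustration of these definitions, here is a minimal Python sketch of an LZW parser returning c(n), the approximate description length ℓ(n) ≈ c(n) log2 c(n), and the normalized measure ρ0 = ℓ(n)/n; the exact bookkeeping in the authors' code [55, 68] (e.g., dictionary initialization or handling of the final word) may differ:

```python
import math

def lzw_complexity(s):
    """LZW parse of a binary string s (characters '0'/'1').

    Returns (c, l): c is the number of words emitted by the parser, and
    l ~ c * log2(c) approximates the description length in bits.
    """
    dictionary = {"0": 0, "1": 1}
    w = ""
    n_words = 0
    for ch in s:
        wc = w + ch
        if wc in dictionary:
            w = wc
        else:
            n_words += 1                      # emit the code for w
            dictionary[wc] = len(dictionary)  # add the new word
            w = ch
    if w:
        n_words += 1                          # emit the final word
    c = n_words
    l = c * math.log2(c) if c > 1 else 1.0
    return c, l

def rho0(s):
    """Normalized description length rho_0 = l(n) / n, in bits per character."""
    _, l = lzw_complexity(s)
    return l / len(s)
```

As a quick sanity check, a periodic string such as "01" repeated many times yields a much smaller ρ0 than a random binary string of the same length, whose ρ0 approaches 1 bit per character.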
Ising spin model

Abstract frameworks from statistical physics can shed light on emergent phenomena in large networks, such as phase transitions in systems where nodes (neurons or cortical columns here) exchange information under the assumptions of the maximum entropy principle [48, 49, 70]. The description of systems with many degrees of freedom can be summarized by coarse-grained variables (describing macrostates), which introduces statistics into the modeling. At so-called critical points, observable quantities such as the correlation length or the susceptibility to external perturbations display singularities (divergences) in the limit where the number of degrees of freedom goes to infinity. At these order-disorder transitions there is a loss of any sense of scale, with fractal properties in energy and information flow.

In this context, elements such as neurons, columns, or brain regions are modeled by spins (i.e., two-state units, up or down, on or off) with pairwise interactions, and the emerging statistical properties of large networks of these elements are studied under different conditions (e.g., temperature or an external magnetic or electric field). The prototypical, simplest system in this context is the classical 2D Ising model, which features nearest-neighbor interactions and a phase transition. This model has been shown to be universal, i.e., the physics of every classical spin model (with more general types of interactions) can be reproduced in the low-energy sector of certain "universal" models such as the 2D Ising model [71]. This fact reflects the intrinsic computational power of near-neighbor interactions. In fact, Ising models have been shown to be universally complete [72, 73], with a map from any given logic circuit to the ground states of some 2D Ising model Hamiltonian. On the other hand, the fact that the brain exhibits characteristics of criticality that may be modeled by systems such as the Ising model is now well established, with ideas that go back to pioneers such as Turing [74], Bak [1, 75], and Hopfield [76]. There is further evidence that the dynamics of the healthy brain occupy a sub-critical zone ([77]; see also [29] and references therein).

The Ising spin model used here arises naturally from model fitting of the binarized data (discussed in the next section) and is slightly more general than the original one. It is essentially the Sherrington-Kirkpatrick spin-glass model [78] with an external field, allowing arbitrary pairwise interactions of either sign. It is defined by the energy or Hamiltonian of a lattice of N spins,

H(σ) = −∑_{i<j} J_ij σ_i σ_j − ∑_i h_i σ_i, (1)

where σ_i denotes the orientation of each spin in the lattice (±1), J_ij is the coupling matrix or Ising connectivity (with pairs counted once), and h_i is an external magnetic field applied independently at each site. In the context of the analysis of parcellated BOLD data, σ_i represents the binarized state of each brain parcel and is equal to +1 (−1) when the parcel is active (inactive), h_i is a parameter that modulates the mean activity of a single parcel, and J_ij is a parameter that accounts for the interactions between parcels i and j (correlations) and can be positive or negative.
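Synthetic lattice data of the kind analyzed above can be generated by Metropolis sampling of the Hamiltonian in Eq 1. The sketch below is a minimal, illustrative Python implementation (function name, sweep counts, and defaults are assumptions, not the authors' code); it assumes a symmetric coupling matrix J with zero diagonal and a field vector h:

```python
import numpy as np

def metropolis_sample(J, h, beta=1.0, n_sweeps=6000, burn_in=1000, seed=0):
    """Metropolis sampling of the pairwise Ising model of Eq 1,
    H(sigma) = -sum_{i<j} J_ij sigma_i sigma_j - sum_i h_i sigma_i,
    at inverse temperature beta. J is assumed symmetric with zero diagonal.
    Returns sampled +/-1 configurations of shape (n_sweeps - burn_in, N)."""
    rng = np.random.default_rng(seed)
    N = len(h)
    s = rng.choice([-1, 1], size=N)
    samples = []
    for sweep in range(n_sweeps):
        for _ in range(N):                    # one sweep = N single-spin updates
            i = rng.integers(N)
            # energy change for flipping spin i
            dE = 2.0 * s[i] * (J[i] @ s + h[i])
            if dE <= 0 or rng.random() < np.exp(-beta * dE):
                s[i] = -s[i]
        if sweep >= burn_in:
            samples.append(s.copy())
    return np.array(samples)
```

Sweeping the temperature T = 1/β and computing ρ0 (or the magnetization) on the sampled configurations is one way to produce the temperature curves discussed in the text.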
Maximum entropy principle derivation of Ising model.

The (spin-glass) Ising model, which was originally proposed to describe the physics of ferromagnetic materials, can be seen to arise naturally from the Maximum Entropy Principle (MEP) [48]. The MEP is used to find an appropriate probability distribution given some constraints, e.g., derived from data. It states that the probability of finding a given spin configuration σ can be inferred from a simplicity criterion: given some data-derived constraints (here, that the mean spin and spin correlation values of the model should match those of the data), the probability function we should use is the one that satisfies the constraints but otherwise has maximal Shannon entropy. It turns out that the probability distribution function of observing a given spin configuration σ (bold symbols indicate an array) in thermal equilibrium (at unit temperature, without loss of generality) is given by the Boltzmann distribution [5]

P(σ) = exp[−H(σ)] / ∑_{σ′} exp[−H(σ′)], (2)

with H(σ) given in Eq 1, and where ∑_{σ′} indicates a sum over all the 2^N possible spin configurations.

Estimation of archetype parameters using maximum pseudo-likelihood.

To create the global model archetype, we concatenate the binarized data from the fifteen participants and the two conditions (a total of four sessions) to generate a single dataset. We similarly produce condition (LSD and placebo) archetypes by concatenating the data in each condition. We estimate the model parameters J and h using an approximation to maximum likelihood as described in Ezaki et al. [5]. Briefly, we find the parameters that maximize the probability of observing the data given the model,

(J, h) = argmax_{J,h} L(J, h), (3)

with

L(J, h) = ∏_{t=1}^{t_max} P(σ(t) | J, h). (4)

The likelihood maximum can be found by gradient ascent, with

h_i^new = h_i^old + η (⟨σ_i⟩_data − ⟨σ_i⟩_model), (5)

J_ij^new = J_ij^old + η (⟨σ_i σ_j⟩_data − ⟨σ_i σ_j⟩_model), (6)

where the old/new superscripts refer to the values before and after updating, and η is a small positive constant. When the number of nodes in the system is large, the calculation of the likelihood from the model becomes computationally intractable. For this reason, it is approximated by the pseudo-likelihood, a mean-field approximation that approaches the true likelihood as the number of time points goes to infinity [5, 79],

L_PL(J, h) = ∏_{t=1}^{t_max} ∏_{i=1}^{N} P(σ_i(t) | σ_{\i}(t)), (7)

with P(σ_i | σ_{\i}) the modeled probability distribution of a given spin given the states of all the others, a quantity much easier to compute,

P(σ_i | σ_{\i}) = exp(σ_i θ_i) / (2 cosh θ_i), (8)

with

θ_i = h_i + ∑_{j≠i} J_ij σ_j. (9)

Using this approximation, the gradient ascent rule becomes

h_i^new = h_i^old + η (⟨σ_i⟩_data − ⟨σ_i⟩_m), (10)

J_ij^new = J_ij^old + η (⟨σ_i σ_j⟩_data − ⟨σ_i σ_j⟩_m), (11)

where ⟨σ_i⟩_m and ⟨σ_i σ_j⟩_m are the 1-point and 2-point correlation functions with respect to the model distribution in Eq 8 [5],

⟨σ_i⟩_m = (1/t_max) ∑_t tanh θ_i(t), ⟨σ_i σ_j⟩_m = (1/t_max) ∑_t σ_j(t) tanh θ_i(t). (12)

Personalization of model: Individualized temperature.

With h and J fixed in the archetype using the entire dataset, we adapt the archetype model to each subject and condition by changing the model temperature T = 1/β, that is, by writing

P(σ) = exp[−β H(σ)] / ∑_{σ′} exp[−β H(σ′)]. (13)

In this case, the gradient ascent algorithm becomes

β^new = β^old + η (⟨H⟩_m − ⟨H⟩_data), (14)

with a fixed point at ⟨H⟩_m = ⟨H⟩_data. This, as in the prior equations, can be seen by taking the derivative of the approximate log-likelihood with respect to the inverse temperature β, with the partition function Z given by Z(β) = ∑_{σ′} exp[−β H(σ′)]. The first term becomes the empirical average of the Hamiltonian, and the second the (data-dependent) model mean (both up to a constant, since we are not dividing by the number of measurements). The latter can be computed from Eq 12 and from

⟨H⟩_m = −∑_{i<j} J_ij ⟨σ_i σ_j⟩_m − ∑_i h_i ⟨σ_i⟩_m. (15)

[1] https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010811 (Creative Commons Attribution 4.0 license)
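To make the fitting procedure concrete, here is a minimal Python sketch of the pseudo-likelihood gradient ascent of Eqs 10–12 and of the single-parameter temperature fit of Eqs 13–15 (function names, learning rates, and iteration counts are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def fit_pseudolikelihood(S, eta=0.05, n_iter=2000):
    """Estimate pairwise Ising parameters (J, h) from binarized data S
    (shape: time points x parcels, entries +/-1) by gradient ascent on the
    pseudo-likelihood, following the update rules of Eqs 10-12."""
    T, N = S.shape
    h = np.zeros(N)
    J = np.zeros((N, N))
    emp_m = S.mean(axis=0)                 # empirical <sigma_i>
    emp_C = S.T @ S / T                    # empirical <sigma_i sigma_j>
    for _ in range(n_iter):
        theta = S @ J + h                  # theta_i(t) = h_i + sum_j J_ij sigma_j(t)
        tanh_th = np.tanh(theta)
        model_m = tanh_th.mean(axis=0)                        # <sigma_i>_m, Eq 12
        model_C = (tanh_th.T @ S + S.T @ tanh_th) / (2 * T)   # symmetrized <sigma_i sigma_j>_m
        h += eta * (emp_m - model_m)       # Eq 10
        dJ = eta * (emp_C - model_C)       # Eq 11
        np.fill_diagonal(dJ, 0.0)          # no self-coupling
        J += dJ
    return J, h

def fit_temperature(S, J, h, eta=0.01, n_iter=2000):
    """Fit a subject/condition-specific inverse temperature beta with the
    archetype (J, h) held fixed (Eqs 13-15); the fixed point satisfies
    <H>_model = <H>_data."""
    T, N = S.shape
    iu = np.triu_indices(N, k=1)           # count each pair once
    emp_m = S.mean(axis=0)
    emp_C = S.T @ S / T
    H_data = -(J[iu] * emp_C[iu]).sum() - h @ emp_m           # empirical mean energy
    beta = 1.0
    for _ in range(n_iter):
        theta = beta * (S @ J + h)
        tanh_th = np.tanh(theta)
        model_m = tanh_th.mean(axis=0)
        model_C = (tanh_th.T @ S + S.T @ tanh_th) / (2 * T)
        H_model = -(J[iu] * model_C[iu]).sum() - h @ model_m  # Eq 15
        beta += eta * (H_model - H_data)                      # Eq 14
    return beta
```

In this sketch the archetype (J, h) would be estimated once from the concatenated group data, after which fit_temperature would be run separately on each subject's and condition's binarized data, mirroring the personalization step described above.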