(C) PLOS One This story was originally published by PLOS One and is unaltered. . . . . . . . . . . The rearing environment persistently modulates mouse phenotypes from the molecular to the behavioural level [1] ['Ivana Jaric', 'Animal Welfare Division', 'Vetsuisse Faculty', 'University Of Bern', 'Bern', 'Bernhard Voelkl', 'Melanie Clerc', 'Department Of Biology', 'Institute Of Microbiology', 'Swiss Institute Of Bioinformatics'] Date: 2022-11 The phenotype of an organism results from its genotype and the influence of the environment throughout development. Even when using animals of the same genotype, independent studies may test animals of different phenotypes, resulting in poor replicability due to genotype-by-environment interactions. Thus, genetically defined strains of mice may respond differently to experimental treatments depending on their rearing environment. However, the extent of such phenotypic plasticity and its implications for the replicability of research findings have remained unknown. Here, we examined the extent to which common environmental differences between animal facilities modulate the phenotype of genetically homogeneous (inbred) mice. We conducted a comprehensive multicentre study, whereby inbred C57BL/6J mice from a single breeding cohort were allocated to and reared in 5 different animal facilities throughout early life and adolescence, before being transported to a single test laboratory. We found persistent effects of the rearing facility on the composition and heterogeneity of the gut microbial community. These effects were paralleled by persistent differences in body weight and in the behavioural phenotype of the mice. Furthermore, we show that environmental variation among animal facilities is strong enough to influence epigenetic patterns in neurons at the level of chromatin organisation. We detected changes in chromatin organisation in the regulatory regions of genes involved in nucleosome assembly, neuronal differentiation, synaptic plasticity, and regulation of behaviour. Our findings demonstrate that common environmental differences between animal facilities may produce facility-specific phenotypes, from the molecular to the behavioural level. Furthermore, they highlight an important limitation of inferences from single-laboratory studies and thus argue that study designs should take environmental background into account to increase the robustness and replicability of findings. Funding: This study was supported by the Swiss National Science Foundation (grant 310030_179254) to H.W and core funding from ETH Zürich to the Laboratory of Microbiome Research headed by S.S. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Data Availability: The ATAC-seq data are available from the NCBI Gene Expression Omnibus (GEO) database under accession number GSE191125. The 16S rRNA gene sequencing data are available from the European Nucleotide Archive (ENA) under accession number PRJEB49361. All other relevant data supporting the findings of this study are available within the article and its supporting material. The code files for the ATAC-seq analysis are available at: https://github.com/MWSchmid/Jaric-et-al.-2022 . The codes used for analysis of behavioural and physiological responses and randomization are available at the Figshare repository https://doi.org/10.6084/m9.figshare.21088642 . Copyright: © 2022 Jaric et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. We generated a comprehensive data set by exploring the extent and persistence of variation in the gut microbiome, associated variation in physiological and behavioural traits, and changes in neuronal chromatin organisation as a molecular substrate by which phenotypic differences in behaviour might be mediated. Our study demonstrates that the common environmental differences between RFs can produce facility-specific phenotypes, from the molecular to the behavioural level, thereby compromising replicability of research findings. (a) Schematic illustration of the multicentre study design—genetically homogeneous mice originating from a single inbred stock were reared until the age of 8 weeks in 5 different RFs before testing for phenotypic differences induced by the different rearing conditions in a single testing laboratory. ( b) Effects of the RF were evaluated at 2 TPs, first, at the end of the rearing period in each of the 5 RFs (TP1) and during the testing period in the testing laboratory (TP2). Outcome measures assessed at both TP1 and TP2 included gut microbiota, body weight, and chromatin profiles using ATAC-seq, while behavioral tests (open field and light dark box tests) and physiological measures of stress (HPA axis reactivity test (SRT) and relative adrenal weight were limited to TP2). Values for α-diversity metrics for ( c) male mice and ( d) female mice from different RFs (n = 6 mice/sex/RF/TP). ( e-g) Ordination plots visualizing PCoA based on Bray–Curtis dissimilarity between samples of male (e) and female (g) mice from different RFs, split by TPs. ( f-h) Differences between loadings of samples on the first 3 PCoA axes in male (f) and female (h) mice. Box plots show the first and third quartiles; horizontal line represents the median; whiskers represent the mean variability outside the upper and lower quartiles. Individual points represent outliers. TP1: 8 weeks of age (PND 56); TP2: 14.5 weeks of age (PND 104). The raw data underlying this figure are available in the Figshare repository https://doi.org/10.6084/m9.figshare.21082195 . The 16S rRNA gene sequencing data are available from the ENA under accession number PRJEB49361. ATAC-seq, assay for transposase-accessible chromatin using sequencing; ENA, European Nucleotide Archive; HPA, hypothalamus–pituitary–adrenal; PCoA, principal coordinate analysis; PND, postnatal day; RF, rearing facility; SRT, stress reactivity test; TP, time point. Thus, pregnant C57BL/6JRj female mice from a single breeding population (Janvier Labs, Le Genest-Saint-Isle, France) were randomly allocated and transported to 5 independent RFs, where their offspring were born and reared until 8 weeks of age under the facility-specific housing and husbandry conditions ( S1A Table ). In order to assess the effect of the rearing environment independently of genotype and test conditions, both male and female offspring from all 5 RFs were then transported to a single test laboratory that was new for all mice ( S1B Table ). Phenotypic differences were assessed at 2 time points (TPs), shortly before shipping the mice from the RF to the test laboratory (TP1) and after 2 weeks of habituation, over the course of a 4-week testing period in the test laboratory (TP2; Figs 1A and S1 ). Specifically, we examined the extent and persistence of variation in the composition of the gut microbiota associated with the different RFs and measured differences in phenotypic traits such as body weight, adrenal weight, neuroendocrine stress reactivity, and behaviour ( Fig 1B ). In addition, we assessed differences in neural chromatin accessibility to explore the potential biological basis of behavioural differences ( Fig 1B ). The study protocol was preregistered (10.17590/asr.0000201) and is further detailed in the Methods section. In animal research, replicability of experimental findings is complicated by phenotypic plasticity, i.e., the ability of one genotype to exhibit different phenotypes under different environmental conditions [ 12 ]. Therefore, phenotypic plasticity increasingly receives attention as an important factor compromising the external validity and replicability of animal research [ 13 – 21 ]. Whereas genotypic differences can be eliminated by selective breeding [ 22 – 24 ], the environment in which laboratory animals are born and grow up may differ substantially between rearing facilities (RFs) [ 25 – 27 ]. As a result, genotype-by-environment interactions throughout ontogeny can lead to phenotypic differences in morphology, physiology, and behavior between animals reared in different environments. Due to such phenotypic plasticity, researchers may fail to replicate research findings, even when using genetically homogeneous (inbred) animals [ 28 – 30 ]. Therefore, phenotypic plasticity may contribute to replication failure and conflicting findings in the scientific literature [ 13 , 25 , 31 ]. However, the magnitude of this problem is as yet unknown, as existing evidence is generally based on single-laboratory studies [ 32 – 34 ] and experimentally induced environmental interventions. Here, we sought to determine the extent to which common differences in housing and husbandry conditions between RFs modulate the phenotype of the most commonly used inbred strain of laboratory mouse, C57BL/6J. To do so, we used a systematic multicentre approach, whereby C57BL/6J mice from the same breeding stock were allocated to and reared in 5 independent animal facilities, before being transported to a single test laboratory. This approach allowed us to assess the effect of the rearing environment on the phenotype of the mice independent of both genotype (e.g., genetic differences due to genetic drift when mice are obtained from different breeders or different breeding stocks) and test condition (i.e., environmental factors affecting the mice at the time of testing). The ability to replicate an observation by an independent study is a cornerstone of the scientific method to distinguish robust evidence from anecdote [ 1 ]. However, the replicability of original research findings was found to be poor across virtually all disciplines of research [ 2 , 3 ], including preclinical animal research. Thus, the prevalence of irreproducible findings in preclinical research was estimated to be greater than 50% of the published findings [ 4 ]. Poor replicability compromises the credibility of animal research, attenuates scientific advances and medical progress, harms animals for inconclusive research, and puts patients in clinical trials at risk. Irreproducibility in biomedical research has mostly been attributed to violations of good research practice, including poor study conduct (e.g., no blinding, no randomisation), small sample sizes resulting in low statistical power, analytical flexibility (including p-hacking and HARKing), selective reporting of findings, and publication bias [ 2 , 3 , 5 – 11 ]. The KEGG analysis highlighted an overrepresentation of genes belonging to adherens junction, dopaminergic synapses, hippo, apelin, and insulin resistance signaling pathway. These pathways, which have an important role in maintaining hippocampal development, morphology, and plasticity, were significantly affected by the RF at both TPs and likely have functional consequences for behavioural regulation [ 72 – 75 ]. Enrichment of genes relevant to neurotrophin, long-term depression and potentiation, serotonergic and glutamatergic synapse pathways was evident only at TP1 (Figs 4C and S4A ). These differences diminished after the mice had spent 6 weeks in the test laboratory. At TP2, we noticed significant differences for Notch, prolactin, relaxin, and AMPK signaling pathways, indicating their potential role in behavioral adaption to the new environment (Figs 4C and S4A ). GO analysis of genes showing differential chromatin accessibility between the different RF presented for each TP ( a-b ); KEGG pathway analysis of the genes showing differential chromatin accessibility between the different RFs for each TP. Fields marked with an asterisk depict comparisons that were statistically significant between TPs (two-sided Fisher’s exact test, adjusted for multiple testing, FDR < 0.05). FDR, false discovery rate; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; RF, rearing facility; TP, time point. Next, we generated lists of all differentially accessible regions (DARs) in the ventral hippocampus of mice from different RFs and mapped them to their adjacent genes for all comparisons at both TPs separately ( S2 Data ). Among the genes with the highest fold change, we, for instance, found Col19a1 (encoding collagen type XIX α1 chain), where chromatin accessibility differed between RFs and remained consistent across both TPs (Figs 3G and S5E ). This indicates that persistent chromatin regulation occurred during the rearing period in response to the specific environment of the RF in genes necessary for hippocampal synapse formation [ 59 – 61 ]. We also found DARs between RFs that changed between TP1 and TP2 and mapped them to genes such as Dlg 2 (discs large homolog 2, also known as postsynaptic density protein-93 (PSD-93), Fzd9 (encoding Frizzled9, one of the Wnt receptors), and Lrrc4c (encoding Leucine-Rich Repeat-Containing 4C). Changes in these genes, important for postsynaptic plasticity (Figs 3G and S5E ) [ 62 – 65 ], indicate that mice from different RFs were using different chromatin regulation strategies to adapt to the new environment of the test laboratory. In terms of genomic features, the most accessible sites were preferentially located within the promoter regions, further corroborating the potential functional significance of the observed changes, while the less accessible sites were mainly located in the intergenic regions and introns ( Fig 3C and 3E ). In addition, the number of genes associated to the most accessible peaks were common between subjects, whereas less accessible sites and associated genes behaved much more randomly and decreased with the number of selected samples ( Fig 3D ). (a) Sample correlation matrices based on all sites and the 10% most variable ATAC-seq sites (n = 5 mice/RF/TP). ( b) Manhatten distances between samples for all sites and open chromatin sites visualised by t-SNE. ( c) ATAC-seq accessibility signal in response to the different RFs and TPs. Heatmap representation of the most and least accessible sites. The color represents the intensity of chromatin accessibility, from gain (yellow) to loss (dark blue), calculated by using row wise Z-scores (the values are scaled by subtracting the average across samples and by dividing by the standard deviation across samples). ( d) Bar graphs representing the number of genes associated with the most and least accessible peaks. ( e) Spidergraph representing the genomic features mapped by all (black), open (blue), or closed (red) sites. (f) Chromatin accessibility profiles of Col19a1, Dlg 2, Fzd9, and Lrrc4c. The raw data underlying this figure are available from the NCBI GEO database under accession number GSE191125. The analysis script is available at the GitHub repository https://github.com/MWSchmid/Jaric-et-al.-2022 . ATAC-seq, assay for transposase-accessible chromatin using sequencing; GEO, Gene Expression Omnibus; RF, rearing facility; TP, time point. We found that most samples clustered based on RF, indicating pronounced differences in chromatin accessibility. This pattern was observed by looking at both the overall dissimilarity of all ATAC-seq profiles and the 10% most variable peaks ( Fig 3A ). RF explained 55.33% and 36.79% of overall variation at TP1 and TP2, respectively (Figs 3B and S5 and S10 Table ). Remarkably, variation explained by RF was much larger in the open chromatin sites (77.5% at TP1 and 70.9% at TP2) than in the closed sites (48.2% at TP1 and 28.4% at TP2), suggesting that these differences have functional consequences ( Fig 3B and S10 Table ). We next explored at the molecular level whether environmental differences between RFs affect epigenetic mechanisms in brain areas involved in behavioural control. More specifically, we assessed chromatin plasticity since chromatin is considered as an interface between the environment and the genome and plays a key role in the regulation of gene expression, mediating various aspects of plastic behavioural responses to environmental perturbation [ 54 , 55 ]. Since chromatin organisation is tissue and cell type specific, we performed the assay for transposase-accessible chromatin using sequencing (ATAC-seq; [ 56 ]) on neuronal nuclei extracted from the ventral hippocampus, a brain area involved in modulating emotional behaviour and stress responses in mice [ 57 , 58 ]. This analysis was limited to males, as they showed more pronounced phenotypic variation, especially in behavioural traits. Further, we predicted that differences in the environmental conditions between RFs during the late prenatal, early postnatal, and adolescent period may differentially shape the reactivity of neuroendocrine systems, including the reactivity of the hypothalamus–pituitary–adrenal (HPA) axis to stressors later in life [ 47 – 50 ]. We, therefore, examined whether RF altered the animals’ HPA stress reactivity by measuring changes in plasma corticosterone during and after a brief period (20 minutes) of physical restraint in a plastic tube. However, we found no consistent differences in HPA stress reactivity between mice reared in different RFs ( Fig 2C and S8 Table ). In males, there was a strong effect of litter size on basal corticosterone levels, and group size after weaning strongly affected both basal levels and acute response levels of corticosterone ( S8 Table ). As expected, corticosterone levels in females were almost double those in males [ 51 , 52 ] ( Fig 2C ), and the oestrous cycle stage had a strong effect on basal corticosterone levels ( S8 Table and S4 Fig ). However, we found that RF had a strong effect on adrenal weight, at least in males ( Fig 2D and S9 Table ). These results suggest that the chronic stress engendered by standard housing conditions and husbandry procedures induced changes in adrenal gland morphology and function, which may have buffered the neuroendocrine stress response to acute stressors [ 53 ]. Using linear discriminant function analysis (LDA) on the combined behavioural data ( S6 Table ), we were able to correctly classify 58% of all male mice and 53% of all female mice according to their RF, which is substantially more than the 20% predicted by chance (Χ 2 = 55.1, p = 1.14 × 10 −13 and Χ 2 = 41.7, p = 1.08 × 10 −10 , respectively, for males and females; Fig 2B and S7 Table ). In males, the first 2 discriminant functions together explained 79% of the variation between RFs, whereby the coefficients of the discriminant functions indicate that distance travelled in the OF and time in the light compartment in the LDB, 2 main measures of exploration and emotionality, contributed most to the first function, while time in the light compartment and number of entries into the light compartment in the LDB contributed most to the second function. In females, the first 2 functions together explained even 90% of the between-facility variance. Distance travelled in the OF contributed most to the first function, while time in the centre in the OF contributed most to the second function. These findings demonstrate that common environmental differences between animal facilities during the first 8 weeks of postnatal development can substantially alter key aspects of the behavioural phenotype of mice, which persist into adulthood. To analyse phenotypic differences in behaviour, we combined variables derived from 2 standard behavioural tests (open field (OF) and light–dark box (LDB)) in a multivariate analysis of variance (MANOVA). We found that RF had a strong effect on the behavioural phenotype, explaining 14.4% and 12.4% of the total variation in males and females, respectively. After controlling for fixed effects, RF even explained 23% and 25% of the remaining variation in behavioral outcomes (partial η2 estimate). Moreover, whereas variation in litter size, litter sex ratio, and group size after weaning had no effect on behaviour in both males and females, oestrous cycle stage on the test day had a strong effect in females ( S5 Table and S3A–S3C Fig ). (a) Body weight persistently varied by RF in both males and females (n = 6 mice/sex/RF for TP1; n = 24 mice/sex/RF for TP2). ( b) Behavior of the mice varied consistently by RF both in males and females. In the LDA plots, color indicates RF, and the circles represent classification based on discriminant function analysis (n = 12 mice/sex/RF). ( c) RF did not affect plasma corticosterone levels in the SRT both in males and females (n = 12/mice/sex/RF), while relative adrenal gland weights (n = 24/mice/sex/RF) were affected only in males (d) . Box plots include individual data points and show the first and third quartiles; horizontal line is the median; whiskers represent the variability outside the upper and lower quartiles. TP1: 8 weeks of age (PND 56); TP2: 14.5 weeks of age (PND 104). The raw data underlying this figure are available in the Figshare repository https://doi.org/10.6084/m9.figshare.21081949 . LDA, linear discriminant function analysis; PND, postnatal day; RF, rearing facility; SRT, stress reactivity test. To assess phenotypic differences between mice from different RFs, we first measured the body weight of the mice, which often correlates with other physiological variables and can potentially serve as latent variable in studies on mouse energy metabolism and metabolic diseases [ 44 – 46 ]. Although body weight is a highly heritable trait, we found that genetically homogeneous mice reared in different RFs differed markedly in body weight, and these differences persisted at the test laboratory throughout the experiment. RF was the only factor that had a strong and persistent effect on body weight in both males and females, while variation in litter size, litter sex ratio, and group size after weaning had no significant effects ( Fig 2A and S4 Table ). This finding is in line with the recent findings of Corrigan and colleagues [ 44 ], who found that the institution where experiments were conducted was among the main factors influencing metabolic rate and body weight in C57BL/6J mice. Next, we evaluated the differences in the microbiome composition (β-diversity) based on the Bray–Curtis dissimilarity between samples [ 40 ]. Overall, we found pronounced differences between mice reared in different RFs. When assessing the amount of variation in the data explained by RF at each TP, the effect was most pronounced at TP1, accounting for 28.7% of overall variation in males ( Fig 1E and 1F and S3 Table ) and 29% in females ( Fig 1G and 1H and S3 Table ). Importantly, the differences in microbiome composition persisted across TPs, although the amount of variation explained by RF dropped to 20.4% in males and 17% in females. This implies that a large part of the initial differences in the microbiomes of mice from different RFs persisted throughout the 6 weeks in the test laboratory, although they did converge to some extent once they were housed together at the same test facility. When further investigating how overall community composition varied across RFs, we found a clear separation along principal coordinate axis 1 (PCoA1) between RFs 3 and 5 and RFs 1, 2, and 4 in both males (both TPs) and females (TP1). Clustering analysis suggested that the type of mouse diet (specifically diet supplier; S1A Table ) was driving the grouping of the mice into these 2 populations ( S2A Fig and S1 Data ). Interestingly, these 2 populations differed in abundance of Firmicutes and Bacteroidetes ( S2B Fig ), 2 phyla that are associated with numerous phenotypic differences in health and disease in animal and human studies [ 41 – 43 ]. Overall, these findings show that the rearing environment can lead to substantial and temporally persistent compositional differences in the gut microbiome community, which, in turn, may drive phenotypic variation [ 38 ]. We first analysed whether the gut microbiome of mice from different RFs differed in α-diversity measures. In males, effects of the RF were significant for both predicted and observed taxa richness, but there was little effect on overall diversity and evenness ( Fig 1C and S2 Table ). At TP2, we observed an increase in all metrics of α-diversity, except for observed species richness ( Fig 1C and S2 Table ). Similar patterns in terms of differences between RFs for taxa richness were observed in females; however, there was no change in α-diversity metrics across TPs ( Fig 1D and S2 Table ). Taken together, these results suggest that the macroenvironment of the rRF can lead to significant differences in the richness of the gut microbiome community. The gut microbiome has been reported to play an important role in shaping the host phenotype [ 35 – 39 ]. Therefore, we first examined the extent to which the composition of the gut microbiota varied in response to the macroenvironments of the different RFs, and whether these differences persisted after the transfer to the common macroenvironment of the test laboratory. In conclusion, our findings could help to explain replicability issues in animal research [ 2 , 97 ]. Poor replicability has mostly been attributed to publication bias, lack of statistical power, analytical flexibility, and other risks of bias [ 3 , 7 , 98 – 100 ], albeit empirical evidence has remained elusive [ 101 ]. In contrast, the large between-study heterogeneity caused by rigorous within-study standardisation has long been ignored as a cause of poor replicability, despite both theoretical and empirical evidence [ 13 , 14 , 25 , 31 , 83 , 102 – 106 ]. Our findings highlight an important limitation of inferences from single-laboratory studies and suggest that the (early) environmental background of animals—just like their genetic background [ 107 – 109 ]—should be accounted for by study design to produce more robust and replicable research findings. Thus, results from standardised single-laboratory studies might only be considered as preliminary evidence. Whether using animals from multiple breeding or RFs provides an effective solution to systematically heterogenise the environmental background of study populations remains to be tested. However, we hope that our findings will stimulate research to find other, perhaps more practicable ways to produce robust and replicable research results. Our findings raise important questions about the mechanism underlying the observed phenotypic differences that cannot be answered based on the present data but need to be tested independently in follow-up studies. For example, given the observed association between diet and microbiome composition, the question arises whether such differences would also occur if different diets were fed in the same facility, and whether diet might also explain some of the variation in epigenetic and behavioural profiles. Several studies have shown that gut microbiota can affect brain development and behavioral functions [ 92 , 93 ]. These effects might be mediated by gut-microbiota products that can cross the blood brain barrier and modulate the chromatin landscape in neurons, which, in turn, can influence behavior [ 94 – 96 ]. Controlled experiments with different diets, fecal transplants, and treatment with antibiotics would be needed to elucidate these potential causal mechanisms further. Further, our findings suggest that environmental differences between RFs influence neuronal developmental patterns by modulating gene regulatory networks involved in the regulation of hippocampal synaptic plasticity and neurogenesis. Such effects on the chromatin profile of functionally relevant genes may be responsible for persistent changes in behavioural traits [ 89 , 90 ]. However, we also found that some of the pathways affected by the rearing environment maintained plasticity, possibly to facilitate adaptation to environmental change, as shown by chromatin reorganisation in response to the transfer to the test laboratory at 8 weeks of age. This result deserves further investigation as chromatin plasticity is considered to provide a molecular mechanism for adaptive plasticity under different environmental conditions, as shown for example in Drosophila melanogaster [ 91 ]. We also confirmed and extended previous findings demonstrating that variation in microbiome composition is strongly determined by the rearing site [ 37 , 85 – 88 ]. Previous studies found that although vendor-specific differences in the gut microbiome of C57BL/6J mice may decrease over time at a new site, they persisted throughout studies lasting for 12 weeks [ 37 ] and 24 weeks [ 27 ], respectively. However, these mice had been housed in individually ventilated cages (IVCs) or microisolater cages, preventing contamination between cages. Our results extend these findings by showing that such animal facility-dependent differences in microbiome composition persisted in the testing facility, even though our mice were housed together in 1 housing room in conventional open cages. In this study, we found that common differences in standard housing and husbandry practices between animal facilities modulated morphological, physiological, and behavioural traits in a genetically homogeneous (inbred) cohort of C57BL/6J mice from a single breeding colony, thereby producing facility-specific phenotypes. Phenotypic plasticity is ubiquitous in nature, and studies under controlled laboratory conditions have long shown that many aspects of housing and husbandry conditions can alter the phenotype of mice throughout ontogeny [ 76 – 83 ]. In fact, this is the main reason why textbooks of laboratory animal science recommend environmental standardisation to minimise variation in experimental results [ 22 ]. However, previous studies have shown that such standardised experiments characterizing the phenotype of mice may produce results that are idiosyncratic to the laboratory where the study was conducted [ 44 , 84 ]. Our findings corroborate and extend these findings. In previous multicentre studies [ 44 , 84 ], the effect of the rearing environment was confounded with the specific test conditions at the test centre. Thus, in these studies, phenotypic differences between animals tested in different facilities may reflect differences in the state of the animals while being tested. In contrast, in the present study, phenotypic differences between mice from different RFs must reflect differences in the underlying phenotypic traits. This is guaranteed by the fact that all mice were derived from a single breeding colony (minimizing the risk of genetic differences between mice from different RFs), that they were all tested in the same test laboratory under the same test conditions by the same experimenters, and that test order was counterbalanced for RF. Methods Study preregistration and ethical statement Before data acquisition started, the study protocol was preregistered under the DOI number: 10.17590/asr.0000201 at the Animal Study Registry, operated by the German Centre for the Protection of Laboratory Animals (Bf3R) at the German Federal Institute for Risk Assessment (BfR). All animal experiments were conducted in full compliance with the Swiss Animal Welfare Ordinance (TSchV 455.1) and were approved by the Cantonal Veterinary Office in Bern, Switzerland (permit number: BE12/19). RFs located in Hannover and Münster did not require separate approval by governmental authorities for an experimental animal study, because animals were only housed in those laboratories, while all experimental procedures were carried out in Bern under the abovementioned license. Animal subjects and study design Nine-week-old, time-mated, primiparous pregnant C57BL/6JRj females in the last third of pregnancy (gestational day 14 or 15), all derived from the same breeding stock and colony room of a commercial breeder (Janvier Labs, Le Genest-Saint-Isle, France), were randomly allocated to 5 different rearing animal facilities (n = 18 per facility). The RFs were located at the following institutions: (i) Institute of Laboratory Animal Science, Hannover Medical School, Germany (RF 1 and RF2); (ii) Division of Animal Welfare, Vetsuisse Faculty, University of Bern, Switzerland (RF 3); (iii) Department of Behavioural Biology, University of Münster, Germany (RF 4); and (iv) Institute of Anatomy, University of Zürich, Switzerland (RF 5). Pregnant dams were singly housed for approximately 5 days, from arrival at the RF until parturition. Dams were monitored daily for parturition and day of birth was defined as postnatal day 0 (PND 0). Litters were not culled during the lactation period and all healthy pups were weaned at PND 22 according to predefined weaning criteria (S6 Fig). At weaning, in each RF from the 18 possible litters, all litters that contained at least 3 male and/or female pups were used to randomly select 12 groups of 3 littermates for each sex. If fewer than 12 groups with at least 3 littermates of a sex were available, these were complemented by litters with at least 2 pups of that sex. This strategy resulted in a total of 326 weaned mice. Litter size, litter sex ratio, as well as the final number of weaned males and females from each litter are presented in S11 Table. From each litter, 3 (or 2) pups per sex were selected randomly and reared together until the age of 8 weeks (PND 56) according to the specific protocols of housing and husbandry of each of the 5 RFs (e.g., type of cages, handling method, bedding, nesting material, diet, light regime). Detailed housing and husbandry conditions are reported in S1A Table. The first 8 weeks of postnatal life were selected because they cover 2 sensitive developmental periods (i.e., early life and adolescence) in mice [110]. These periods represent a critical stage of brain [111–115], HPA axis [48,116], and gut microbiome development [39,117] when environmental inputs may shape the later-life phenotype of mice at different levels of organisation. The effect of rearing environment was evaluated at 2 TPs (Figs 2B and S1). At TP1, 1 mouse per sex and cage of all cages with 3 mice was weighed and killed within each of the 5 RFs and brain and cecal samples were harvested for epigenomic and microbiome analyses. To ensure that observed differences are attributed only to the differential rearing environments, we standardised the tissue collection procedure. All mice were killed at the beginning of the light cycle (within the first 3 hours after the lights were turned on) by the same person who was not in contact with those animals before and who was using the same equipment in each of the 5 RFs. Blinding with regard to RF was not possible for weighing and organ collection since the experimenter needed to travel to each RF. At PND 58, the remaining pairs of male and female offspring (n = 240; 24 mice per sex per RF) were transferred from the 5 RFs to the testing laboratory, located at the Vetsuisse Faculty of the University of Bern (testing laboratory). The testing lab in Bern was separate from the RF in Bern. To ensure that all animals experienced approximately the same treatment during transport to the testing facility, the driver of the car transporting the animals from the RF in Bern to the adjacent test laboratory in Bern was asked to take a 2-hour detour. Special efforts were made to reduce any possible transport-induced stress. The same-sex cage mates were placed into 4 compartment transport boxes (Type 500005L; Bio Services BV, Uden, the Netherlands), with 2 mice per compartment and shipped by a professional company using an environmentally controlled vehicle. Each compartment contained 1 cm of bedding, nesting material from the home cage, food pellets, and hydrogel. Upon arrival, animals were checked for health and pair housed in freshly bedded Type 3 cages (floor area 820 cm2) and habituated to the new animal facility for 2.5 weeks. Each Type 3 cage contained 3 cm of bedding (OSafe Premium Bedding, SAFE FS 14, Safe-Lab, Rosenberg, Germany), a red mouse house (Tecniplast, Indulab, Gams, Switzerland), a medium-size cardboard tunnel (Play tunnel, #CPTUN00016P, Plexx B.V., the Netherlands), and 10 g of nesting material (Sizzle Nest #SIZNEST00016P, Plexx B.V., the Netherlands). Standard rodent chow (Kliba Nafag #3430, Switzerland) and tap water were available ad libitum. Females and males were housed in separate rooms, and all animals were kept on a 12:12 light/dark cycle, with lights on at 12:00 h. Detailed housing and husbandry conditions in the testing facility are reported in S1B Table. The day after arrival in the testing facility, animals were individually marked by ear tattoo, after which cages were assigned new identification numbers, and positions of cages within and between cage racks were randomly reshuffled as part of the blinding procedure. Further information on blinding is available as a supporting text. Behavioural and physiological phenotyping commenced after an acclimatisation period of 2.5 weeks. We focused on phenotypic traits of exploration, emotionality, and stress reactivity that are known to be sensitive to environmental variation during early ontogeny [48,112,118,119]. Two common tests for exploration and emotionality, the OF test and the LDB test, were conducted in that order, with a break of 7 days in between, followed by an SRT after another break of 7 days. For these tests, 1 mouse per cage, sex, and RF (n = 120) was used. After 1 additional week (around 14.5 weeks of age; PND 102; TP2), all mice (n = 240) were killed for postmortem analyses. Body weights were also recorded for each mouse during cage changes throughout the acclimatisation period and prior to killing. To avoid possible influences of the circadian rhythm on behaviour, corticosterone secretion, and molecular readouts, all procedures were performed during the light phase (from 12:00 h to 16:30 h). Tissue sampling procedure All animals were killed by cervical dislocation. Whole brains were immediately removed and quickly frozen in a hexane bath on dry ice before being stored at −80°C. The brain region of interest, i.e., ventral hippocampus, was dissected later for subsequent molecular analyses. Adrenal glands were removed, dissected from fat, and weighed using a precision scale (Mettler AE160, Mettler-Toledo, Switzerland). The mouse cecum together with its content was isolated, snap frozen in liquid nitrogen, and kept at −80°C for microbial DNA extraction and sequencing. Gut microbiota composition analysis For gut microbiota analysis, samples of mouse caeca were taken at both TPs, at each RF, and at the end of the experiment at the testing laboratory. Six cages per sex and RF were randomly selected from all cages containing 3 littermates after weaning (n = 180). One mouse from each selected cage was killed in the RF at 8 weeks of age (TP1; PND 56; n = 60), while the other 2 cage mates (n = 120; one underwent behavioural testing and one remained naïve) were killed in the testing laboratory at TP2 (14.5 weeks of age; PND 102). Microbial DNA was extracted from cecal content using the Allprep DNA/RNA mini kit (Qiagen, Hilden, Germany) according to the manufacturer’s instruction. DNA was eluted and concentrations were assessed using the QuantiFluor dsDNA system (Promega). The 16S rRNA V4 region was amplified using the following primers 515f-Y: GTGYCAGCMGCCGCGGTA and 806r-N: GGACTACNVGGGTWTCTAAT and The Q5 high-fidelity DNA polymerase kit (New England BioLabs, UK). In a total volume of 25 μL, 2 μl of extracted DNA was added to a PCR reaction mix prepared by mixing a final concentration of 1X Q5 reaction buffer, 200 μM of dNTPs, 0.5 μM of each primer, 0.02 U/μL of Q5 5 High-Fidelity DNA Polymerase, and 1X of Q5 High GC enhancer. The first PCR reaction was carried out using the following conditions: (i) first denaturation: 95°C for 30 s; (ii) denaturation in each PCR cycle: 98°C for 10 s; (iii) annealing: 56°C for 30 s; (iv) extension: 72°C for 30 s; (v) final extension at the end of the reaction: 72°C for 2 min, followed by a hold step at 4°C. The cycles 2 to 4 were repeated 8 times. The PCR products were purified using CleanNA CleanNGS purification beads (CNGS0050; LabGene Scientific SA), resuspended in 15 μl of EB buffer, and served as templates in the second PCR reaction. In the second PCR step, unique dual index barcodes of length 2 × 8 nt were added to each sample, which allowed equimolar pooling of samples after quantification of the target product using the Agilent fragment analyzer (Agilent). In total, the final library pool contained 176 samples, 3 bacterial mock communities, and 20 DNA extraction blanks. The finished library pool was sequenced using the NovaSeq 6000 platform (Illumina, USA) in a single lane of SP flow cell at the Functional Genomics Center Zürich. The raw sequencing data were analysed using the DADA2 pipeline (version 1.14 16), and individual reads were grouped into amplicon sequence variants (ASVs). The final table contained 1,560 ASVs. After removal of the blank and mock samples from the data, the individual library sizes ranged from 47,910 to 1,616,753, with a median of 843,361 (S6A Fig). To mitigate the effect of variation in library size across samples, we performed random down-sampling of reads within each sample to an even library size across samples. Given the minimum read count in the data, counts were rarefied to a depth of 47,000 reads per sample (S6B Fig). We calculated both α diversity (within sample diversity) and β diversity (between-sample diversity) in order to assess the effect of RF on microbial community composition within and between mice. For α diversity, the following metrics were calculated: (i) observed species richness, which represents the total number of species counted within a sample; (ii) Chao1 richness, for estimation of the “true” species diversity, which is calculated using the following formula: , where S obs stands for the observed number of species, and F 1 and F 2 stand for the number of species with 1 or 2 observed reads, respectively; (iii) Shannon diversity index, which illustrates the diversity within a sample, taking both richness and evenness into account. It is calculated using the following formula: , where S represents the total number of species and p represents the proportion (n/N) of individuals of 1 particular species found (n), divided by the total number of individuals found (N); and (iv) Pielou evenness for estimation of how similar in numbers each species in a sample is. It is calculated using the formula , where Ht′ is the Shannon index and S is the total number of species in a sample. ANOVA was used to test the effect of RF and TP on α-diversity measures. To analyse changes in microbiota composition between mice reared at different TPs or RFs, the Bray–Curtis dissimilarity between samples was calculated using the formula , where i and j are the 2 samples, S i is the total number of species counted in sample i, S j is the total number of species counted in sample site j, and C ij is the sum of only the lesser counts for each species found in both samples. Principal coordinate analysis (PCoA) was used to visualise the similarity in microbiome composition between samples and to retrieve sample loadings onto the first 3 PCoA axes. A permutational analysis of variance (PERMANOVA) was used to assess the proportion of variation in microbiome composition explained by RF and age (i.e., TP). Based on our study design, there were 3 groups of cecal samples: (i) samples collected at TP1 (within each of RFs); (ii) samples collected at TP2 from mice that underwent behavioural testing; and (iii) samples collected at TP2 from mice that did not undergo behavioural testing and that were used for chromatin profiling. To balance the study design (there were double number of samples collected at TP2 comparing to the number of samples collected at TP1), we assessed whether the 2 sets of samples (from behaviourally tested BT mice and mice used for molecular analysis, i.e., chromatin profiling MA mice) collected at TP2 differ in terms of species richness, diversity, and overall community composition. Our analysis showed that there were no statistically significant differences in either α diversity (assessed using Wilcoxon signed-rank test) or β diversity (assessed using PERMANOVA) in any of the tested parameters between BT and MA mice (S7 Fig). Therefore, only MA mice were considered for the final analyses. Differential abundance analysis was performed by using an adjusted p-value threshold of 0.05 and a log2-fold change threshold of 1. All analyses related to the gut microbiome were done in R (version 4.1.0) using the libraries vegan (2.5–7) for down-sampling, PERMANOVA, and calculation of observed/chao1 species richness, microbiome (1.14.0) for calculation of Shannon diversity and Pielou evenness, ampvis2 (2.7.2) for PCoA ordination, stats (4.1.0) for ANOVA calculations and hierarchical clustering, fpc (2.2–9) for definition of cluster number, factoextra (1.0.7) and ggdendro (0.1.22) for dendrogram creation, DESeq2 (1.32.0) for differential abundance analysis, and ggplot2 (2.2.1) and patchwork (1.1.1) for visualisation. Testing for behavioural and physiological responses The set of behavioural outcomes belongs to the confirmatory part of the study, and appropriate sample size was determined a priori by a power analysis using simulated sampling for a two-way ANOVA design. The power analysis was done for the main outcome variable, plasma corticosterone levels in the SRT. Based on historical data [120,121], we expected to observe an effect of medium size (i.e., means estimates for 2 randomly chosen RFs are expected to be in the range of 20%, equivalent to a ratio of between-facility: within-facility variation of 1:2). This resulted in a required minimal sample size of 12 mice per RF and sex. Further information on the sample size calculation is available in the supporting information file (S8 Fig). The same sample size (n = 12 per RF and sex) was also used for behavioural testing. The order of behavioural test trials for all mice was randomised using the random number generator of the Mathematica software (version 11; Wolfram Research, Champaign, Illinois, USA). Forty mice (20 males and 20 females) were randomly assigned to each of 3 experimenters. Mice were handled by the same experimenter during the habituation period, cage change, and behavioural testing. Animals were tested in parallel (at the same time, but in separate apparatuses) by 2 experimenters, with 3 different combinations of 2 experimenters each day. Testing was carried out in batches during 4 consecutive days. Males were tested on the first and third day, while females were tested on the second and fourth day. Each day’s testing was done by each experimenter in 3 blocks of 5 animals, each. The randomisation and allocation procedures were restricted so that, in each block for each experimenter, there was exactly 1 mouse from each RF in random order, with the addition that in no case animals tested at the same time were from the same RF. Open field test The OF test was performed in a polycarbonate box (45 × 45 × 45 cm; illumination set to 120 lux) with grey walls and a white base plate. Each mouse was placed in the left (close to the experimenter) corner, facing the wall, and allowed to freely explore the open field for 10 min. Recording started immediately after placing the animal in the box. The behaviour of the first 5 min of the test was analysed. The behaviour was video recorded using an infrared camera system, and mice were automatically tracked from videos using EthoVision XT software (version 11.5; Noldus, Wageningen, the Netherlands). The space was virtually divided into a centre zone (20 × 20 cm) and an outer zone. The total distance traveled, the average velocity, the number of entries into the centre area, the time spent in the centre, and the latency to the first centre entry were scored. Light–dark box test The LDB test was conducted in a box (37.5 × 21.5 × 15 cm) consisting of a small, closed dark compartment (12.5 × 21.5 × 15 cm; illumination set to 5 lux) and a larger light compartment (25 × 21.5 × 15 cm; illumination set to 200 lux) connected by a sliding door. Each mouse was placed in the dark compartment of the apparatus and testing began after the delay of 5 s as the sliding door to the light side of the box was raised, and the duration of the test was 10 min. The behaviour of the first 5 min of the test was analysed. The total distance traveled, the average velocity in the light compartment, the time spent in the light compartment, the number of entries into the light compartment, and the latency to enter the light compartment were measured from video recordings using EthoVision XT software (version 11.5; Noldus, Wageningen, the Netherlands). Stress reactivity test The SRT was performed according to established protocols [122] with slight modifications. In brief, each mouse was taken out of its home cage and a first blood sample was collected by incision of the ventral tail vessel with a scalpel blade (Paragondisposable sterile scalpels No. 10, Paragon Medical, Lausanne, Switzerland). The procedure was limited to 2 min to obtain basal levels of corticosterone unaffected by the sampling procedure. Immediately after blood collection, the mouse was restrained for 20 min in a 50-ml plastic conical tube (11.5 cm × 2.5 cm; Fisherbrand Easy Reader, Fisher Scientific AG, Reinach, Switzerland) with custom-made holes for breathing and for the tail. At the end of the 20-min restraint period, a second blood sample was taken from a fresh incision rostral to the first one, followed by placing the mouse back in its home cage. A third blood sample was taken from a third incision rostral to the second one, 90 min after the onset of restraint. Each blood sample was collected by using a dipotassium-EDTA capillary blood collection system (Microvette CB 300 K2E, Sarstedt, Nümbrecht, Germany). Immediately after sampling, the blood samples were placed on ice. Within 60 min, the samples were centrifuged for 10 min at 4,000g and 4°C. Plasma samples were transferred to new, labeled microcentrifuge tubes and stored at −80°C until assayed. Plasma concentrations of corticosterone were determined by a commercial ELISA kit (EIA 4164, DRG Instruments GmbH, Marburg, Germany) in duplicates according to the manufacturers’ instructions. Intra- and inter-assay coefficients of variation were below 10% and 12%, respectively. Oestrous cycle determination The oestrous cycle stage was assessed by cytological analysis of vaginal smears to estimate the sex hormone status of the female mice. Vaginal smears were taken immediately after testing and postmortem after killing. Briefly, after each test, the female was placed on the cage lid with its hind end towards the experimenter. The rounded tip of a disposable pipette with 50 μl of sterile distilled water was gently placed at the opening of the vaginal canal, and vaginal smear cells were collected by lavage. Smears were placed on microscopic slides, allowed to dry, stained with 0.1% crystal violet solution, washed, and then analysed using light microscopy. The stage of the oestrous cycle was determined based on the relative ratio of nucleated epithelial cells, cornified squamous epithelial cells, and leukocytes. Since there were uneven distributions of oestrous cycle stages across groups on any given testing day [123], the vaginal smears data were combined into high-oestrogen state (proestrus, dioestrus-proestrus transition, and proestrus-oestrus transition) and low-oestrogen state (dioestrus, oestrus, metoestrus, oestrus-metoestrus transition, and metoestrus/dioestrus transition) and were included in the analysis as a linear binary factor. Statistical analysis of behavioural and physiological responses All statistical analyses were performed using the statistical software R (version 3.6.2). A detailed R script is available as a supporting file. Data of male and female animals were analysed separately. Statistical tests and models employed for each analysis together with information on fixed and random factors are reported in S4–S9 Tables. In brief, linear models without interaction terms and with identity link function were run for the body weight data collected at TP1 (i.e., right before killing within each RF; n = 6 mice per rearing lab and sex; sample size was limited by the minimal number of cages with 3 mice per sex and RF) and for plasma corticosterone levels measured in the SRT (n = 12 mice per RF and sex). Linear mixed models without interaction terms and with identity link function were run for body weight data and for relative adrenal weights (n = 24 per RF and sex) collected at TP2. Satterthwaite approximation was used for determination of p-values in the mixed models. A Dunn–Sidak Bonferroni correction method was applied to correct for multiple testing where necessary. For the physiological measures, such as body weight and plasma corticosterone levels, the threshold was set to α′ = 1 − (1 − 0.05)1/3 = 0.0169. The distribution of the observed values for behavioural outcomes was inspected for deviations from normality with Q–Q plots. The data set of the physiological measures (body weights, relative adrenal weights, and corticosterone responses in the SRT) were normally distributed (S10–S12 Figs). Due to skewed distributions, transformations of behavioural data were necessary for 7 variables (S12 Fig). In the male cohort, OF latency, LDB latency, and LDB time in the light of males were square root transformed (S13A Fig). In the female cohort, OF distance was log transformed, OF time in the centre, OF latency, and LDB time in the light were square root transformed, while LDB latency was arcsine transformed (S13B Fig). For the analysis of behavioural outcomes, we used a MANOVA without nesting and interaction terms and with identity link function. To check for correlation between recorded variables, we calculated Pearson’s product moment correlation coefficient. We first excluded variables that were highly correlated (S12 Table), which resulted in the final list of 6 dependent behavioural variables (OF distance, OF time centre, OF latency, LDB time light, LDB entries into light, and LDB latency). Pillai’s Trace was used to evaluate the MANOVA differences, while the robustness of the findings was confirmed by 3 other test statistics: the Wilk’s Lambda, Hotelling–Lawley Trace, and Roy’s Largest Root. Multivariate outliers were identified by using the squared Mahalanobis distance (mvoutlier.CoDA version 2.0.9; [124]). The analysis suggested the existence of 6 outliers in the male cohort and 6 in the female cohort; S14 Fig), which were not removed for the further analysis. To investigate whether the outliers had an impact on the results, we rerun the analysis with the outliers removed and could confirm that results were not markedly different. [END] --- [1] Url: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001837 Published and (C) by PLOS One Content appears here under this condition or license: Creative Commons - Attribution BY 4.0. via Magical.Fish Gopher News Feeds: gopher://magical.fish/1/feeds/news/plosone/