(C) PLOS One This story was originally published by PLOS One and is unaltered.

High sorbic acid resistance of Penicillium roqueforti is mediated by the SORBUS gene cluster [1]

['Maarten Punt', 'Tifn', 'Wageningen', 'The Netherlands', 'Microbiology', 'Department Of Biology', 'Utrecht University', 'Utrecht', 'Sjoerd J. Seekles', 'Department Molecular Microbiology']

Date: 2022-08

Sorbic acid resistance was further analyzed by determining the MICu values of sorbic acid for the 34 P. roqueforti strains. The four strains with the highest sorbic acid resistance on MEA (Fig 1) were also among the six strains (DTO006G1, DTO006G7, DTO012A2, DTO012A8, DTO013E5 and DTO013F2) that showed the highest MICu (Fig 2). Although strains DTO012A2 and DTO012A8 did not exhibit increased colony growth on agar (Fig 1), they were among the resistant strains. This might be explained by the limited growth period (5 days) or by their relatively low overall growth rate compared to the other resistant strains (see controls in Fig 1; DTO006G1, DTO006G7, DTO013E5 and DTO013F2). DTO013E5 and DTO013F2 were the most resistant and even grew at the highest tested undissociated sorbic acid concentration of 21.2 mM, indicating a MICu > 21.2 mM. The other strains showed a distinctly lower MICu, ranging between 4.2 mM and 9.95 mM. Only strain DTO012A9 showed intermediate resistance to sorbic acid, with an average MICu of 13.72 mM undissociated sorbic acid.

Average colony size (cm²) of 34 P. roqueforti strains after five days of growth on MEA (pH 4.0) (grey) or MEA (pH 4.0) supplemented with 5 mM propionic acid (orange), sorbic acid (blue) or benzoic acid (green). Strains DTO006G1, DTO006G7, DTO013E5 and DTO013F2 (indicated in bold) are relatively resistant to sorbic acid. Error bars indicate standard deviation of biologically independent replicates.

Weak-acid sensitivity of 34 P.
roqueforti wild-type strains was assessed on MEA plates supplemented with 5 mM propionic, sorbic or benzoic acid, which corresponds to 4.42, 4.25 and 3.07 mM undissociated acid, respectively. Three strains had been isolated from blue-veined cheeses such as Roquefort, and the other 31 strains had been isolated from non-cheese environments, mostly related to spoiled food (S1 Table). The colony surface area was determined after five days of growth (Fig 1). The inhibitory effect of propionic acid at the tested concentration was limited for most strains: 26 out of 34 strains grew to > 80% of the colony surface area reached under control conditions. Strains DTO012A1 and DTO012A8 even showed an increased colony size (up to 120%) compared to the control. The inhibitory effects of sorbic and benzoic acid were more pronounced. MEA supplemented with potassium sorbate reduced colony area for all 34 P. roqueforti strains. The surface area under sorbic acid stress ranged from 0 to 80% of the surface area reached under control conditions. Strains DTO006G1, DTO006G7, DTO013E5 and DTO013F2 showed the highest sorbic acid resistance, followed by DTO046C5. Benzoic acid was the most inhibitory compound, resulting in colony surface areas between 0 and 20% of the control. The most benzoic acid-resistant strains were DTO013F2 and DTO013E5. As these strains were also among the most sorbic acid-resistant strains, similar resistance mechanisms may be involved in coping with benzoic and sorbic acid stress.

The number of scaffolds, assembly length, GC content and genes of the 34 sequenced P. roqueforti strains are listed. In addition, the number of genes with a PFAM domain, the number of secondary metabolism gene clusters and the BUSCO completeness are listed. The type column indicates if the strain is sorbic acid-resistant (R) or sorbic acid-sensitive (S).

Phylogenetic tree of the 34 P. roqueforti strains used in this study.
Sorbic acid resistance (MICu) is indicated in blue-yellow shading. The tree is based on 6923 single-copy orthologous genes and was constructed using RAxML. P. rubens [44] was used as outgroup. Bootstrap values <100 are indicated. Strains containing the SORBUS cluster are highlighted in blue.

The genomes of the 34 P. roqueforti strains were sequenced. Scaffold count varied between 45 and 1358, assembly length between 26.53 and 31.74 Mb and GC content between 46.85 and 48.44% (Table 1). The number of predicted genes varied between 9633 and 10644, the proportion of genes with PFAM domains between 73.11 and 75.38%, and the number of secondary metabolism gene clusters between 32 and 36. All strains had a BUSCO completeness of >99%, indicating high-quality assemblies and gene predictions, except for DTO012A8 with a completeness of 94.83%. This and the high scaffold count of the DTO012A8 assembly (1358 scaffolds) indicate that its genome assembly is not complete. The strain was kept in the downstream analysis because most other metrics did not differ much from those of the other strains (Table 1). Fig 3 presents a phylogenetic tree of the 34 P. roqueforti strains based on 6923 single-copy orthologous genes. Three of the six sorbic acid-resistant strains (DTO012A2, DTO013F2 and DTO013E5) are similar, with DTO012A8 being closely related to those strains. The two other resistant strains are also similar to each other and more distantly related to the rest.

Sorbic acid resistance correlates with a genomic cluster containing genes regulating sorbic acid decarboxylation

Two complementary genome-wide association study (GWAS) approaches were taken to identify the genetic elements (e.g., genomic regions, genes, SNPs, etc.) associated with sorbic acid resistance. The first approach is based on the presence of large genomic regions that are only found in the sorbic acid-resistant strains, and the second approach is based on SNPs that are significantly correlated with the level of sorbic acid resistance.
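The first, presence-based approach can be pictured as a set comparison between resistant and sensitive strains. The sketch below is only an illustration and not the authors' pipeline: the strain and gene identifiers are hypothetical placeholders, the function name `genes_unique_to_resistant` is invented for this example, and requiring a gene to be present in every resistant strain is an assumption.

```python
from typing import Dict, List, Set


def genes_unique_to_resistant(
    presence: Dict[str, Set[str]],
    resistant: Set[str],
    sensitive: Set[str],
) -> List[str]:
    """Return genes found in every resistant strain and in no sensitive strain.

    `presence` maps a gene ID to the set of strains whose assembly contains it.
    """
    unique = []
    for gene, strains in presence.items():
        in_all_resistant = resistant <= strains          # subset test
        in_no_sensitive = strains.isdisjoint(sensitive)  # no overlap with S strains
        if in_all_resistant and in_no_sensitive:
            unique.append(gene)
    return sorted(unique)


# Hypothetical toy data: two resistant (R) and two sensitive (S) strains.
resistant = {"R1", "R2"}
sensitive = {"S1", "S2"}
presence = {
    "gene_a": {"R1", "R2"},              # only in resistant strains
    "gene_b": {"R1", "R2", "S1"},        # shared with a sensitive strain
    "gene_c": {"R1"},                    # missing from one resistant strain
    "gene_d": {"R1", "R2", "S1", "S2"},  # present everywhere (core gene)
}

unique_genes = genes_unique_to_resistant(presence, resistant, sensitive)
print(unique_genes)
```

In this toy input only `gene_a` passes both conditions. A real analysis would derive the presence table from whole-genome alignments rather than from a hand-written dictionary.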
In the first GWAS approach, the 34 P. roqueforti strains were divided into two groups based on their sorbic acid resistance: a group of six resistant (R-type) strains (‘a’, Fig 2) and a group of 28 sensitive (S-type) strains (‘b-d’, Fig 2). The assemblies of all strains were aligned to the assembly of DTO006G7 using MUMmer, and genomic regions (and the genes encoded on those regions) that were unique to the R-type strains were then identified. With this method 57 genes were identified, of which 51 were present on scaffold 43 of DTO006G7 (Table 2); the remaining six genes lie outside scaffold 43 (including g6241, g8103, g9940, g9945 and g9946). In addition to the 51 unique genes, this scaffold contains 19 genes that are also found completely or in part in some of the S-type strains. The genomic alignment shows the genes on scaffold 43, which is present in the R-type strains (Fig 4). The first 80 kbp of scaffold 43 (protein IDs g12000-g12029) mainly encodes hypothetical proteins without predicted function, while the remaining region from 80 to 180 kbp (g12030-g12069) contains multiple regions homologous to genes previously reported as related to weak-acid resistance in A. niger. Predicted genes orthologous (based on bidirectional best BLAST hit) to padA, cdcA and sdrA of A. niger were found alongside each other (g12064-g12066) with respective identities of 87%, 83% and 53%. Moreover, two padA paralogs with high BLAST similarities to padA of A. niger (63% and 58%) were identified on the R-type-specific cluster and named padB (g12032) and padC (g12057). Similarly, two paralogs of cdcA, named cdcB (g12056) and cdcC (g12040), were identified on the same gene cluster, with identities of 72% and 71%, respectively, when compared to cdcA of A. niger. An additional cdcA paralog, cdcD (g2591), is not located on this cluster and is also present in S-type strains. In contrast, no homologs of A. niger sdrA and padA were found outside of the cluster.
In addition, a transcription factor (g2820 in DTO006G7) orthologous (based on a bidirectional best BLAST hit) to warA of A. niger was identified outside of the cluster in the genomes of all strains. These results indicate that the R-type strains contain a gene cluster similar to the sorbic acid resistance gene cluster described in A. niger, but considerably expanded [9,10]. For further reference, we name this cluster (i.e., scaffold 43 of strain DTO006G7) SORBUS after the tree Sorbus aucuparia, as sorbic acid was first isolated from its berries by August Hoffman [19,20]. While SORBUS as a whole is only present in the R-type strains (DTO006G1, DTO006G7, DTO012A2, DTO012A8, DTO013F2, DTO013E5), some S-type strains share parts of the sequence of up to 5 kbp, especially in the first 80 kbp of the cluster (Fig 4). The conservation of SORBUS in the R-type strains was visualized with Clinker, demonstrating that the genes of the SORBUS cluster are well conserved and highly syntenic in the R-type strains (S1 Fig). It should be noted that the R-type strains were all isolated from non-cheese environments. Based on alignments of the sequencing reads to DTO006G7, we confirmed that out of 35 previously sequenced P. roqueforti strains [18], none of the 17 cheese strains contained the SORBUS cluster, while two out of the 18 non-cheese strains did (S1 Table).

Fig 4. Genome comparison reveals unique gene cluster in R-type strains. Genomic alignment of 33 P. roqueforti strains to the SORBUS cluster (scaffold 43 of DTO006G7). Predicted genes are indicated with arrows, repetitive DNA is indicated in red, the sequence read coverage is indicated in the green tracks, and blue bars indicate (partial) overlap with DTO006G7. The complete SORBUS cluster is only present in the R-type strains (DTO006G1, DTO006G7, DTO012A2, DTO012A8, DTO013F2 and DTO013E5).
https://doi.org/10.1371/journal.pgen.1010086.g004

Table 2. Genes located on the SORBUS cluster in strain DTO006G7. Fold change (log2FC) of the sorbic acid samples compared to the control is given. Underlined genes are significantly differentially expressed (adjusted p-value < 0.05) and the mean expression (FPKM) of three biological replicates is given per condition (control and sorbic acid). Rows highlighted in grey indicate genes that are not unique to the R-type strains.
https://doi.org/10.1371/journal.pgen.1010086.t002

In the second GWAS approach, PLINK [21] was used to identify which SNPs correlate with sorbic acid resistance. Unlike the first approach described above, this method allowed the quantitative log10(MICu) values to be used as input for the analysis. The correlation between the presence of SNPs and sorbic acid resistance is visualized in a Manhattan plot (Fig 5). SNPs located on genes with a -log10(P) > 5 and either a high or moderate impact (SNPeff) were selected (Table 3). This resulted in 338 SNPs in 41 genes. Out of these SNPs, 29 had a ‘high’ impact according to SNPeff and were located in 17 genes. Only six of these 17 genes with high-impact variants (g7017, g8100, g8106, g9942, g9943, g9976) were not located in the SORBUS cluster; the other 11 genes were either among the non-unique genes present on SORBUS, or genes of which less than 90% of the sequence was found in S-type strains. Functional annotation revealed that protein g8100 contains an ankyrin-repeat domain, while protein g9943 is homologous to a zinc finger C3H1-type domain-containing protein. In contrast, the four remaining proteins had no functional annotations. In all cases, these six genes containing high-impact variants showed high similarity (> 99%) to genes in other Penicillium species.
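The SNP selection step described above (a significance cut-off combined with a predicted impact class) can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the authors' code: the record layout, the function name `select_snps` and the toy p-values are hypothetical, and the impact labels are assumed to be upper-case strings such as "HIGH" and "MODERATE".

```python
import math
from typing import Dict, List, Optional, Tuple


def select_snps(
    snps: List[Dict[str, object]],
    min_neg_log10_p: float = 5.0,
    impacts: Tuple[str, ...] = ("HIGH", "MODERATE"),
) -> List[Dict[str, object]]:
    """Keep SNPs that lie in a gene, have a qualifying impact class,
    and whose association p-value satisfies -log10(p) > threshold."""
    selected = []
    for snp in snps:
        gene: Optional[str] = snp["gene"]  # None means intergenic
        if gene is None or snp["impact"] not in impacts:
            continue
        if -math.log10(float(snp["p"])) > min_neg_log10_p:
            selected.append(snp)
    return selected


# Hypothetical toy records (not data from the study).
snps = [
    {"gene": "gene_x", "p": 1e-8, "impact": "HIGH"},      # kept
    {"gene": "gene_x", "p": 1e-6, "impact": "MODERATE"},  # kept
    {"gene": "gene_y", "p": 1e-9, "impact": "LOW"},       # impact too low
    {"gene": "gene_z", "p": 1e-3, "impact": "HIGH"},      # not significant
    {"gene": None, "p": 1e-7, "impact": "HIGH"},          # intergenic
]

selected = select_snps(snps)
genes_hit = {snp["gene"] for snp in selected}
print(len(selected), "SNPs in", len(genes_hit), "genes")
```

Counting the selected records and the distinct genes they fall in gives the same kind of summary as the "SNPs in genes" figures reported in the text.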
SNPs in two additional genes encoding a putative transmembrane transporter (g216) and a cation transporter (g296) also correlated with increased resistance to sorbic acid.

Fig 5. Manhattan plot shows SNPs associated with sorbic acid resistance. Scaffolds are listed on the x-axis, while the y-axis displays the significance of the association (-log10(p-value)). Yellow, orange and red dots indicate ‘low’, ‘moderate’ or ‘high’ impact SNPs as determined by SNPeff, respectively. The GeneIDs associated with the SNPs with a -log10(p-value) > 7.5 are indicated. The SORBUS cluster is located between the dashed lines.
https://doi.org/10.1371/journal.pgen.1010086.g005

To investigate the evolutionary origin of the SORBUS cluster, the presence of five PFAM domains found on the SORBUS cluster (from g12060, g12061 and g12063-g12065) was analysed in 32 Aspergilli and Penicillia as well as the 34 P. roqueforti strains (Table 4). These PFAM domains correspond to a putative GPR1/FUN34/yaaH family protein (g12060), a flavin reductase-like domain (g12061), 3,4-dihydroxy-2-butanone 4-phosphate synthase (g12063), a flavoprotein (g12064, padA) and a UbiD domain (g12065, cdcA). These domains were selected because the corresponding genes are clustered and because of their predicted role. The first domain has been associated with acetic acid sensitivity in S. cerevisiae [22], while the latter four domains are part of the gene cluster described in A. niger [9]. The SORBUS genes g12061, g12064 and g12065 are more similar to genes of several Aspergillus species, as determined by a gene tree approach (Table 4), whereas the PFAM domains from g12060 and g12063 did not cluster with any of the species included. In addition, the PFAMs present in the core genome (present in both S- and R-type strains) showed higher similarities to those of P. digitatum and P. oxalicum.

Table 4.
Number of genes containing PFAM domains corresponding to g12060, g12061 and g12063-g12065 (PF01184, PF01613, PF00926, PF02441, PF01977) based on phylogenetic trees constructed with their respective PFAM domains. The top six strains are R-type P. roqueforti strains containing the SORBUS cluster. C (CORE) indicates domains that aligned closely to PFAMs not unique to the SORBUS cluster or that did not align to P. roqueforti; S (SORBUS) indicates the number of PFAM domains that aligned closely to PFAMs originating from SORBUS.
https://doi.org/10.1371/journal.pgen.1010086.t004

Furthermore, two genes with homology to transposase-like proteins (g12052 and g12055), as well as several reverse-transcriptase domains and other transposon-related domains, were identified in proteins encoded on the SORBUS cluster (Table 2). An analysis of the repetitive content of the genomic DNA revealed that the SORBUS cluster was more repetitive than the genome average (6.4% versus 3.8%). In particular, long interspersed nuclear elements (LINEs) of the Tad1 retrotransposon family were enriched on SORBUS, as well as unknown/unclassified repetitive elements (S2 Table).

[1] Url: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1010086
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.