(C) PLOS One This story was originally published by PLOS One and is unaltered.

Murine and related chapparvoviruses are nephro-tropic and produce novel accessory proteins in infected kidneys [1]

['Quintin Lee', 'Centenary Institute', 'Faculty Of Medicine', 'Health', 'The University Of Sydney', 'Sydney', 'Nsw', 'Matthew P. Padula', 'Proteomics Core Facility', 'University Of Technology Sydney']

Date: 2022-12

Mouse kidney parvovirus (MKPV) is a member of the provisional genus Chapparvovirus that causes renal disease in immune-compromised mice, with a disease course reminiscent of polyomavirus-associated nephropathy in immune-suppressed kidney transplant patients. Here we map four major MKPV transcripts, created by alternative splicing, to a common initiator region, and use mass spectrometry to identify “p10” and “p15” as novel chapparvovirus accessory proteins produced in MKPV-infected kidneys. p15 and the splicing-dependent putative accessory protein NS2 are conserved in all near-complete amniote chapparvovirus genomes currently available (from mammals, birds and a reptile). In contrast, p10 may be encoded only by viruses with >60% amino acid identity to MKPV. We show that MKPV is kidney-tropic and that the bat chapparvovirus DrPV-1 and a non-human primate chapparvovirus, CKPV, are also found in the kidneys of their hosts. We propose, therefore, that many mammal chapparvoviruses are likely to be nephrotropic.

Parvoviruses are small, genetically simple single-strand DNA viruses that remain viable outside their hosts for very long periods of time. They cause disease in several domesticated species and in humans. Mouse kidney parvovirus (MKPV) is a causative agent of kidney failure in immune-compromised mice and is the only member of the provisional Chapparvovirus genus for which the complete genome including telomeres is known. 
Here, we show that MKPV propagates almost exclusively in the kidneys of mice infected naturally, wherein it produces novel accessory proteins whose coding regions are conserved in amniote-associated chapparvovirus sequences. We assemble a closely related complete viral genome present in DNA extracted from the kidney of a wild Cebus imitator monkey, and show that another related chapparvovirus is preferentially found in kidneys of the vampire bat Desmodus rotundus. We conclude that many mammal-hosted chapparvoviruses are adapted to the kidney niche and may therefore cause disease following kidney stress in multiple species.

Funding: Supported by the Australian National Health and Medical Research Council (W.W., B.R., P.B. & J.J.-L.W), the Cancer Institute NSW (B.R. & J.J.-L.W), the Hillcrest Foundation (C.J.J.), the Alfred P. Sloan Foundation (S.H.W.), the National Institutes of Health (U19AI109761 Center for Research in Diagnostics and Discovery, S.H.W.), the National Cancer Institute Cancer Center Support Grant P30 CA008748 (S.M.), the Fundação de Amparo à Pesquisa do Estado de São Paulo, Brazil (No. 17/13981-0 and 18/09383-3, W.M.S, & M.J.F.), the Natural Sciences and Engineering Research Council of Canada (A.D.M), the Canada Research Chairs program (A.D.M.), the Alberta Children’s Hospital Research Institute (A.D.M. & J.D.O.) and the Beatriu de Pinós postdoctoral programme of the Government of Catalonia's Secretariat for Universities and Research of the Ministry of Economy and Knowledge (J.D.O.). IDEXX BioAnalytics funded the portion of the reported work performed in their laboratories. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. 
MKPV and MuCPV are only distantly related to other known murine parvoviruses and are members of the provisional genus Chapparvovirus; so-called because the earliest examples, discovered by metagenomic analyses, were found in chiropteran, avian and porcine hosts (i.e. bats, birds and pigs) [ 11 – 13 ]. Recently, additional amniote chapparvovirus (ChPV) sequences were discovered by screening of draft genome assemblies and presumed to reflect parvoviral infection of the source animal rather than viral genome integration [ 14 ]. The growing list of potential hosts now includes marsupials and fish; furthermore, ChPV-derived endogenous viral elements (EVEs) were discovered in some invertebrate genomes [ 15 – 17 ]. This discovery demonstrated that ChPVs are an ancient lineage within the family Parvoviridae, and phylogenetic analysis suggested that ChPV form a parvoviral subfamily distinct from the two currently established parvoviral subfamilies Parvovirinae and Densovirinae [ 16 ]. Curiously, extant fish-associated ChPV are more related to ancient invertebrate ChPV-derived EVEs than to extant amniote-associated ChPVs [ 16 ]. Vertical transmission of parvoviruses across the placenta can kill developing embryos or newborns in domesticated species such as dogs and pigs [ 5 , 6 ], but many parvoviruses are highly adapted to infecting specific cell types. For instance, Erythroparvovirus B19 infects red blood cell precursors in humans, potentially inducing anaemia [ 7 ], and even though AAV2 can transduce many cell types, it is naturally liver-adapted and targets the liver if intravenously injected [ 8 ]. Horizontal transmission of the newly-identified mouse kidney parvovirus (MKPV) induces adult renal failure in severely immune-deficient laboratory mice, without obvious pathology in other tissues [ 9 ]. 
Co-incidentally, a virus very similar to MKPV was identified in mice living wild in New York City (NYC), with greater incidence in adults than juveniles, and dubbed murine chapparvovirus (MuCPV), but the state of kidney disease was not assessed in that study [ 10 ]. The rep plus cap sequence of MuCPV, lacking TRs, was originally assembled from the faecal virome of house mice living wild in New York City (NYC; accession MF175078) [ 10 ]. Independently, a full-length 4,442 nt sequence of MKPV, including TRs, was assembled from the kidney transcriptomes of two renal disease-affected immune-deficient Rag1 -/- mice in the colony of the Centenary Institute, Sydney, Australia (CI; accession MH670587), and a 3.5 kb fragment of MKPV encompassing NS1 and VP was then amplified by PCR from the kidneys of immune-deficient mice necropsied at Memorial Sloan Kettering Cancer Center, NYC (MSKCC; accession MH670588) [ 9 ]. The MuCPV and MKPV genomes are 98% identical to one another at the nucleotide level; thus, they belong to the same species according to ICTV guidelines. Parvoviruses are small, non-enveloped, polyhedral, single-strand DNA viruses with genomes 4–6kb in length which bear short (120–600 base) terminal repeats (TRs) that form hairpin telomeres. All parvoviral genomes comprise two major genes encoding a non-structural replication protein NS1 (gene rep) and a capsid protein VP (gene cap). Alternative splicing or alternative translation initiation sites can allow the production of truncated forms of VP; all sharing the same C-terminal region [ 1 , 2 ]. Open reading frames (ORFs) usually overlapping the NS1 or VP reading frames encode smaller genus-specific accessory proteins. Parvoviruses can only replicate when the host cell itself replicates. Furthermore, many members of the Dependoparvovirus genus (e.g. 
adeno-associated virus, AAV) can only replicate if a helper virus is also present [ 1 , 3 ], but this is not a universal feature of Dependoparvovirus: close avian relatives of AAV that cause Derzsy’s disease in geese and Muscovy ducks replicate autonomously [ 4 ]. Encouraged by ORF conservation between MKPV and CKPV, we searched all near-complete amniote-associated chapparvovirus genomes for conserved ORFs using “Genie” software [ 21 ] to identify likely splice donor and acceptor sites for these ORFs. In all cases, we found a p15-like ORF (recently identified independently as ORF-1 [ 16 ]) and a 2-exon NS2-like ORF ( Fig 6C and S5 Fig ). Furthermore, consensus U2-dependent splice donor and acceptor sites were predicted to produce transcripts similar to MKPV transcripts 1–4 in nearly all cases ( Fig 6C and S5 Fig ). The exceptions (annotated as “NF” for “not found” in Fig 6C ) were that the software did not predict splice acceptor sites upstream of the P. mucrosquamatus or M. unicolor ChPV VP regions, nor for M. unicolor ChPV NS2 exon 2. Nonetheless, manual alignment of reading frames implies a functional acceptor site for the M. unicolor virus NS2 exon 2 and “acceptor-like” sites a short distance upstream of VP in the P. mucrosquamatus and M. unicolor ChPV (see S5 Fig ). p15 was significantly less conserved across amniote chapparvoviruses than the VP and NS2 polypeptides ( Fig 6B ). On the other hand, VP and NS2 were significantly more conserved than NS1 ( Fig 6B ). The CKPV genome is strikingly similar to MKPV ( Fig 6 ) and encodes proteins homologous to MKPV p15 and p10; the VP, NS1, p15 and p10 proteins of CKPV are 77%, 71%, 76% and 55% identical to their MKPV counterparts, respectively ( Fig 6B and S5A Fig ). Furthermore, the U2-dependent splice donor and acceptor sites used in MKPV for expression of VP and to encode NS2 [ 9 ] are conserved in CKPV ( Fig 6C and S5B Fig ). 
We predict an NS2 protein in CKPV that is a remarkable 84% identical to MKPV’s NS2 protein ( Fig 6B ). In contrast, the non-spliced variant of NS2 (i.e. NP) is less conserved because CKPV’s NP ORF is much shorter than MKPV’s ( Fig 6C ). Finally, the hairpin structures with the lowest Gibbs free energy predicted for minus strand CKPV TRs are strikingly similar to the structures predicted for minus strand MKPV TRs ( Fig 6A ). (A) The lowest Gibbs energy structures predicted for the minus-strand TR regions of MKPV (MH670587) and CKPV (MN265364), with the left TR shown above the right TR. Sp1-binding sequences are boxed in blue. Grey shading in the CKPV TRs indicates sequence recovered from a solitary read, with the 3’-end of the solitary read indicated by “*”. (B) The percentage identity at the amino acid level between MKPV ORFs and the corresponding ORFs from nine other near-complete amniote chapparvoviral genomes currently available (MUSCLE alignment [ 35 ]). Tukey’s box and whiskers are used. Significant differences in relative ORF conservation (non-parametric Friedman test) are indicated as in the Fig 2 legend. Colours indicate tissue source(s) of the virus sequences: blue–kidney or urine, orange–faeces, yellow–lungs or respiratory tract, green–liver, pink–muscle. (C) Maps (same colours and symbols as in Fig 1B ) of the near-complete amniote ChPV genomes analysed. Splice sites (or putative splice sites) are detailed in S5B Fig . Ragged ends indicate incomplete ORFs that continue beyond currently available sequence. Genome sources are: [ 11 , 14 , 15 , 36 – 39 ]. Partial ChPV genomes lacking 5’- or 3’-coding sequences have been assembled from numerous vertebrate species, including from the draft genome of the capuchin monkey (Cebus capucinus imitator) [ 14 ]. The draft capuchin genome was assembled using DNA extracted from the kidney [ 19 ]. 
We used the genome of MKPV as a scaffold to re-arrange two sequence fragments present in Cebus imitator scaffold NW_016109986 into a near-complete ChPV genome, but lacking TRs and with a probable gap in NS1 ( S4 Fig ). Using SAMtools [ 20 ], we mapped high quality reads in the complete capuchin kidney NGS dataset [ 19 ] to this draft viral genome, which in-filled a 5 nt gap in the NS1 sequence compared to scaffold NW_016109986.1. By recovering sequences from “soft”-clipped reads at the 5’- and 3’-ends of this new alignment ( S4 Fig ), we produced a complete ChPV genome with TR lengths longer than MKPV. It should be noted that the first 33 bases and the last 29 bases of the CKPV genome were recovered from a single read each ( Fig 6A ), but because these reads progressed through unique hairpin regions in the telomeres (the ends of these single reads are indicated by “*” in Fig 6A ) they were aligned to the genome extremities non-erroneously. Due to its high level of identity to MKPV (see next paragraph), we dubbed this complete viral genome “capuchin kidney parvovirus” (CKPV, Accession MN265364). The kidney DNA sample used to assemble the CKPV genome (and the draft Cebus imitator genome) was extracted in a biological safety cabinet inside a BSL-2 laboratory facility with CL3 protocols to minimise contamination with foreign DNA, but it is no longer available, so we cannot strictly rule out the possibility that CKPV is an extraneous contaminant, nor can we confirm the CKPV genome by PCR or Sanger sequencing. To estimate the prevalence of MKPV in research mice, mouse faecal samples that were submitted to IDEXX BioAnalytics over a seven-month period and representing 78 biomedical research institutions were tested for MKPV by qPCR. Overall prevalence was 5.1%, with 178 positive samples out of 3,517 samples tested. Immune status is unknown for most of the samples. 
Of those samples designated as representing immunodeficient mice, 16 were positive out of 171 tested (9.4%), and for samples designated as representing immunocompetent mice, 56 out of 513 (10.9%) tested positive for MKPV. It should be noted that many of the faecal samples are likely to represent soiled-bedding sentinel mice, and MKPV prevalence among sentinel mice may differ from colony animals (which may be immunocompetent or immunodeficient) based on a variety of factors including efficiency of transmission by soiled bedding. Data are insufficient to construct a phylogeny. The provenance of each sequence is indicated by text at left and by flags at right, with red text indicating accessions, from top to bottom, MH670587, MH670588, MF175078 and MG679365; for IDEXX BioAnalytics pathology samples, donor institutions are identified by an anonymizing code unique to each institution and by a geographical region where known–each in brackets. Shading over the text indicates infection of a genetically immune-competent strain (blue = laboratory mouse, orange = wild-caught mouse). The coloured bar at the top indicates the consensus sequence (yellow = A, green = G, blue = C, red = T). SNVs varying from the consensus are presented as in Fig 1A , with colour-coding to indicate the non-consensus base. Amino acid changes (five in total) are indicated by “XnnnX”. Primer pairs were designed to amplify four regions of concentrated polymorphisms in the NS1 and VP ORFs. Only one of these four primer pairs– 934 plus 935 ( Fig 1A )–was able to amplify MKPV sequences from historic FFPE samples reliably and this pair was therefore selected for sequencing in a larger sample set. 
The 934–935 region was amplified from kidney FFPE-specimens or from randomly-selected faecal samples sent to Idexx Laboratories (Columbia, Missouri) from multiple laboratory facilities (in the USA, Canada, Europe and Israel) or from previously-described wild mouse samples from NYC basements [ 10 ], then Sanger sequenced from both ends. SNVs were collated in a 267 bp window (the blue box in Fig 1A ). Clustering analysis, which incorporated a partial MKPV sequence from Mus musculus living wild in Xinjiang, China (accession MG679365), identified 22 MKPV sub-strains, varying by 3–22 SNVs from the consensus sequence; the Xinjiang sequence being the most divergent ( Fig 5 ). The MSK-WCM colonies provided the largest set of time-shifted samples for the same location, from 2007 to present. There was no clear evidence of one strain replacing another over time in the MSK-WCM colonies. Instead, more than one strain was present in the MSK-WCM colonies at most timepoints, and two sub-strains present in MSK-WCM in 2008–2009 and 2015–2017 were identical (within our SNV window) to sub-strains from the University of North Carolina (2018) and Johns Hopkins University (2006), respectively. All of the wild NYC samples shared some SNVs with laboratory strains, mostly located in the same continental region. It was notable that the wild NYC sample Q-055 shared four SNVs with Australian laboratory mice ( Fig 5 ), and the wild NYC sample M-118 shared three SNVs with lab specimens from Europe and Israel ( Fig 5 ). This is consistent with MKPV being carried within immune-deficient lab mice when they were live-exported from the USA to labs outside the Americas, but does not prove it. Virtually all SNVs in the 934–935 window were synonymous, with just four exceptions ( Fig 5 ): Glu187Asp, Glu187Gln, Thr118Ile and Ala123Ser. None of these mutations are likely to affect NS1’s tripartite helicase domain. 
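The distinction drawn above between synonymous and non-synonymous SNVs can be illustrated with a short sketch. It is not the authors' pipeline: the function name `classify_snv` and the example codons are hypothetical, chosen only because they reproduce the kinds of amino acid changes listed (Glu to Asp, Glu to Gln, Thr to Ile, Ala to Ser). The actual MKPV codons at these NS1 positions are not given in the text. The sketch assumes the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (last base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snv(codon: str, pos: int, alt: str) -> str:
    """Classify a single-base change within one codon.

    codon: reference codon (hypothetical example input, 3 DNA bases)
    pos:   0-based position of the changed base within the codon
    alt:   the variant base
    Returns "synonymous" or a one-letter change such as "E->D".
    """
    codon, alt = codon.upper(), alt.upper()
    mutated = codon[:pos] + alt + codon[pos + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    return "synonymous" if ref_aa == alt_aa else f"{ref_aa}->{alt_aa}"


if __name__ == "__main__":
    # Hypothetical codons that would yield the changes named in the text.
    for codon, pos, alt in [("GAA", 2, "G"), ("GAA", 2, "T"), ("GAA", 0, "C"),
                            ("ACT", 1, "T"), ("GCT", 0, "T")]:
        print(codon, pos, alt, classify_snv(codon, pos, alt))
```

Applied across every SNV in a sequenced window, a classifier of this kind separates silent third-position changes from the few that alter the protein.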
Alignment of the original CI-MKPV, MSKCC-MKPV and wild NYC MuCPV sequences revealed numerous single nucleotide variations (SNVs) and a two-base insertion in the small intron of CI-MKPV; Sanger and Illumina sequencing data also revealed a few SNVs within each virus strain ( Fig 1A ). Some of these SNVs were non-synonymous and the resulting changes in amino acid sequences are shown in S2A Fig . Another notable SNV was the insertion of an extra “C” in the right TR of a sub-strain present in one CI mouse, converting the sequence C 4 G 4 to C 5 G 4 in the interior repeat (“▲” in Fig 1A ), without the insertion of a complementary base in the exterior inverted repeat. This SNV creates an extra 1 nt bubble in the structure predicted to be formed by the right TR ( S2B Fig ), but whether it results in viable virus remains to be determined. MKPV/MuCPV was reported in five sites previously: in the wild in NYC, USA and in laboratory mice housed in NYC and Baltimore in the USA, and in laboratory mice from Sydney plus another Australian city [ 9 , 10 ]. We screened two additional sets of necropsy specimens from laboratory mice with histologically diagnosed IBN by PCR and detected MKPV DNA in laboratory mice housed at University of North Carolina (Chapel Hill, USA) and in Israel ( Fig 4 ). The specimen from Israel was also probed using ISH and abundant MKPV nucleic acids localised to tubular epithelial cells were detected ( Fig 4 ). This increases the number of sites in which MKPV is associated with mouse kidney disease to six sites in three continents. To examine a greater range of tissues, we deployed an MKPV-specific in situ hybridization (ISH; RNAscope) probe [ 9 ] ( Fig 1A ) in tissue sections from necropsies of two MKPV +ve NOD-scid IL2Rgamma null (NSG) mice. These two mice had histopathologic evidence of chronic inclusion body nephropathy (IBN) and ISH had detected abundant MKPV nucleic acids in tubular epithelial cells [ 9 ], as reproduced here ( Fig 2Bi ). 
No pathologic change attributed to the virus was observed outside the kidneys on H&E-stained sections. Mild multifocal ISH staining was also observed in the caecum mucosal epithelium and lamina propria of one mouse, but not the other, and in the urinary bladder urothelium (mostly umbrella cells) of both mice ( Fig 2Bii-iii , arrows). In addition, there were strongly positive cells in the urinary bladder lumen of both mice, which were presumably casts of necrotic tubular cells sloughed from the kidney ( Fig 2Biii , asterisks). No ISH signal for MKPV was detected in the liver or any of the 20 other tissues screened ( Fig 2Biv-vi ; S5 Table ). Another thirteen tissues sampled during necropsy were not probed because the decalcification process used in their preparation for H&E-staining was incompatible with ISH (see S5 Table ). (A) Relative abundance by qPCR of MKPV genomes (left) or MKPV cap mRNA (right) in organs of naturally-infected Rag1 –/– mice, using primers 869–870 or 947–948, respectively (Tukey’s box and whisker plots; n = 8). MKPV DNA is presented as viral genome copies. cap mRNA abundance is indicated by Ct relative to RT-qPCR for mouse Hprt mRNA. ND = not detected. Significance is indicated by asterisks (*, P<0.05; **, P<0.01, ***, P<0.001, ****, P<0.0001; ns, P>0.05; 1-way paired ANOVA with Tukey’s multiple comparisons test). (B) ISH for MKPV nucleic acids in necropsy specimens from NSG mouse 16–1653 housed in MSK-WCM in 2016. Scale bar = 25 μm. Arrows in panels (ii-iii) indicate mild multi-focal staining in caecum and urinary bladder; asterisks in panel (iii) indicate casts of necrotic tubular cells sloughed from the kidney into the urinary bladder lumen. Full details of ISH outcomes are listed in S5 Table . During the natural course of infection, MKPV DNA was detected first in the kidney of young adult mice, then appeared in liver, spleen and blood as infection progressed [ 9 ]. 
The mapping of MKPV RNA splicing ( Fig 1 ) enabled quantitation of MKPV infection via qPCR for spliced MKPV RNA in different tissue sites. We extracted DNA and DNA-free RNA from liver, spleen (a proxy for blood) and kidneys of naturally MKPV-infected Rag1 –/– mice, then performed qPCR using DNA or cDNA templates. For DNA, we used NS1 primers 869 and 870, as previously reported [ 9 ] ( Fig 1A ). For cDNA, we used primers 947 and 948 ( Fig 1A ) and a short extension time, which ensured that product formed from spliced transcript 4 and not from MKPV DNA (see Fig 1B ). Consistent with our previous study [ 9 ], MKPV DNA was much more abundant in kidney than in liver or spleen ( Fig 2A ). Notably, spliced MKPV RNA was below the detection threshold in liver and spleen, but readily detectable in kidneys ( Fig 2A ). Because MKPV mRNA was undetectable outside the kidney, we report the difference between Ct for Hprt versus MKPV transcripts in Fig 2A , rather than use a spliced cDNA standard curve (unprocessed Ct values are plotted in S3 Fig ). In theory, p10 is encoded in all major transcripts that start from TSS1, but not from transcripts starting at TSS2. NS1 could be translated from transcripts 1, 2a or 2b ( Fig 1B ). The ATG start codon of the p15 ORF abuts the splice acceptor site of transcripts 2a and 2b in Fig 1A and 1B . Therefore, p15 could also be produced from transcripts 1, 2a or 2b. In addition, these transcripts potentially encode NP, a hypothetical ORF in Type 2 ChPV, but sometimes lacking a conventional start ATG codon in Type 1 ChPV [ 16 , 18 ]. Transcript 3 encodes a two-exon variant of NP that we previously dubbed NS2 [ 9 ]; however, we have not detected any peptides by LC-MS/MS that confirm production of NS2 or NP in vivo. Transcript 4 shares the splice donor of transcript 3 and encodes the capsid protein VP ( Fig 1B and 1C , S4 Table ). 
Minor variants of transcripts 3 and 4 that used the immediately upstream splice donor at nt 433 were detected at about 9- to 15-fold lower frequencies ( Fig 1C , S4 Table ). Major and minor VP-encoding transcripts summed to account for 44–56% of all MKPV transcripts in infected kidneys ( Fig 1F , S4 Table ), which is consistent with capsid protein comprising the bulk of infective parvoviral particles. PolyA signal B is necessarily used by copies of transcript 4 that produce VP protein, and we presume that polyadenylation signal A is used by most copies of transcripts 1–3, but it is possible that all transcripts use a mix of both A and B polyA signals ( Fig 1B ). Other minor transcription start or polyadenylation sites were indicated by capillary electrophoresis of RACE products, but their yields were too low to be Sanger sequenced ( Fig 1F ). For instance, a faint 5’ RACE product was detected that might correspond to transcript 1 (“1” in Fig 1F ), because it was about 80 bp larger than the 5’ RACE product corresponding to transcript 2a. To precisely map the 5’-ends of the major MKPV transcripts, we deployed rapid amplification of cDNA ends (RACE) following SMARTer full-length cDNA synthesis ( S1A Fig ). Sanger sequencing of the major 5’ RACE products ( Fig 1F and S1B Fig ) confirmed that they corresponded to transcripts 2–4, as labelled in Fig 1F . Two transcription start sites, TSS1 and TSS2, were mapped for transcript 2 ( Fig 1A and 1F ). TSS1 corresponds to nt 147 with “smearing” to nt 144–146 (transcript 2a), while TSS2 corresponds to nt 267 with “smearing” to nt 266 (transcript 2b). The yield of 5’ RACE products ( Fig 1F ) suggested that transcripts 2a and 2b accumulate to roughly equal proportions. The transcription starts for transcripts 3 and 4 mapped to precisely the same nucleotides as transcript 2a, i.e. TSS1 ( Fig 1B ) but not TSS2. All of these results were consistent with the RT-PCR reactions in Fig 1E . 
Since the interior repeat of MKPV’s left TR immediately abuts TSS1 ( Fig 1A ), transcription predominantly initiates from very near the 3’-end of the left TR. Previous qualitative comparison of Illumina reads with the confirmed MKPV genome indicated the presence of three major MKPV introns (see accession MH670587). We quantified MKPV splicing in the RNAseq data (GSE117710) from two independent MKPV-infected kidneys ( S4 Table ). This confirmed that the two donor sites and three acceptor sites used by the above-mentioned introns accounted for 96% of all detectable MKPV splicing events, with an additional five donor and two acceptor sites accounting for >99% of the remaining splice events ( S4 Table , Fig 1C ). To directly confirm the major splicing events, we extracted DNA or RNA from kidneys of MKPV +ve Rag1 –/– mice and exposed the RNA to DNase I plus Exo I to destroy MKPV DNA. After reverse transcription (“+RT”) or mock reverse transcription (“-RT”), we amplified MKPV cDNA using antisense primers 902, 905 or 933 paired with primers 890, 955, 904 or 947, which are mapped in Fig 1A and 1B . The sense primers “walked” from the hairpin of the left TR (primer 890) to just upstream of the most 5’ major splice donor site (primer 947). Agarose gel analysis of PCR products ( S1 Table and Fig 1D and 1E ) confirmed that the RNA template was free of MKPV DNA and detected an MKPV transcript that did not use any of the major splice sites (transcript 1) plus the spliced MKPV transcripts 2–4, illustrated in Fig 1B , as the dominant transcripts. The splicing indicated in Fig 1B for transcripts 2–4 was confirmed by Sanger sequencing; similarly, Sanger sequencing confirmed that transcript 1 contained the 88 nt intron intact. Counting of spliced reads or reads mapped to intron sequence uniquely retained in transcript 1 in the RNAseq data (GSE117710) indicated that transcript 1 accounted for 7–13% of MKPV transcripts ( S4 Table ). 
This is likely to be a slight over-estimate because MKPV DNA was a trace contaminant in the source RNA [ 9 ]. Furthermore, our analysis does not exclude the possibility that a small minority of transcript 1 mRNAs might splice using splice sites 3’ to primer 902 (i.e. minor donors at nt 667, 2189 and 2588 spliced to the acceptor at 2775 –see S4 Table ). All viable parvoviruses encode NS1 and VP, and production of these proteins in MKPV-infected tissue was confirmed previously by liquid chromatography-tandem mass spectrometry (LC-MS/MS) [ 9 ]. However, both MKPV and the extended MuCPV sequence have potential to produce several other polypeptides from ORFs >25 aa in length. We performed a new independent LC-MS/MS analysis of an MKPV-infected kidney and an uninfected kidney, focusing on novel MKPV accessory proteins (dataset PXD014938). In addition, we mined our previous LC-MS/MS datasets (PXD010540) [ 9 ] for trypsin-derived peptides predicted by these ORFs. These independent analyses re-identified NS1 and VP, as expected. They also identified twelve peptides (with E-values <0.001) covering 65% of a 14.7 kDa polypeptide “p15” ( Fig 1B and S2 and S3 Tables). p15-derived peptides were more abundant in the infected LC-MS/MS data than peptides derived from NS1 or VP ( S2 Table ). Two peptides covering the C-terminal 16% of a 9.8 kDa polypeptide, “p10”, situated immediately downstream of the left TR were also detected (see Fig 1B and S2 and S3 Tables). No other MKPV-derived peptides were detected in infected kidneys and no MKPV-derived peptides were detectable in extracts from uninfected control kidneys. (A-B) Maps of the MKPV/MuCPV strains from Centenary Institute (CI, accession MH670587), Memorial Sloan Kettering Cancer Center (MSKCC, accession MH670588) and New York City basements (wild-NY, MF175078). “Bowties” indicate terminal repeats (TR). (A) Single nucleotide variations (SNV) between the CI, MSKCC and wild-NY accessions. 
Vertical lines—differences between accessions. Half height vertical lines—polymorphisms within an accession. ▼; 2 bp insertion in the CI strain. ▲; 1 bp insertion in a CI sub-strain. Dashed lines—missing extremities in MSKCC and wild-NY accessions, which consist of the exterior inverted repeats in the full-length CI sequence. (B-C) Alternative splicing allows production of the polypeptides p10, p15, NS1, NS2, NP and VP. Black, brown or blue shading indicate the relative reading frames of ORFs. p15, p10 and NP could theoretically be produced from multiple transcripts. Orange or red indicate peptides present in LC-MS/MS datasets PXD014938 (this paper) or PXD010540 [ 9 ], respectively. Exon or intron sequences flanking splice sites are shown in black or red text, respectively. (C) Quantitation of spliced MKPV reads in RNAseq data pooled from two MKPV-infected kidneys. Columns indicate splice site usage (left y-axis); heights of arcs (right y-axis) indicate the abundance of specific splice combinations. See S4 Table for more information. (D-E) Detection of spliced transcripts by RT-dependent PCR, using primers mapped in A-B. Input templates were MKPV-infected (D) kidney DNA or (E) DNAse/ExoI-treated kidney RNA, converted (+RT) or mock-converted (-RT) to cDNA. RT-PCR products corresponding to transcripts 1 to 4 are indicated by white numbers. (F) Mapping of transcription start and stop sites by RACE. See S1 Fig for RACE details. Major 5’ and 3’ RACE products, indicated by black arrows and corresponding to transcripts 2 to 4 or polyadenylation signals A and B, were gel-purified and Sanger sequenced. Other RACE products mentioned in the text are indicated by white arrows. Discussion Our data add to the association between MKPV and chronic kidney disease in immune-deficient laboratory mice and demonstrate that MKPV is distributed worldwide. 
MKPV can be detected in immune-sufficient laboratory mice [9] as well as in wild-living mice in the USA [10] and China, which indicates that MKPV does not require immune-deficiency to propagate and that mice are a natural MKPV host. The relatively low level of SNV diversity in MKPV samples from Australian laboratory mice spanning a decade (Fig 5) suggests that a single laboratory MKPV strain was imported into Australia and has been transmitted horizontally in laboratory mice since, with little or no re-infection from wild mouse sources. In contrast, infection of mouse colonies in the USA by virus from wild mouse pools appears to have occurred repeatedly, but these infections may have occurred prior to the establishment of modern barrier facilities. The MKPV-infected Australian colonies we first reported [9] were all descended from the Rag1tm1Bal strain [22]–imported into Australia in about 1994. Fig 5 indicates that this strain is the original source of MKPV in Australian laboratory mice, because all MKPV+ve Australian lab mice seem to carry the same virus strain. Shared SNVs between the Australian MKPV strain and wild mouse sample Q-055 (Fig 5) suggest, but do not prove, that founder infection occurred in the USA. Australian MKPV-free Rag1–/–mice descend from the Rag1tm1Mom strain [23] supplied by The Jackson Laboratory (Bar Harbor, Maine, USA). Mus musculus is not native to Australia and is thought to have arrived onboard European ships 230 to 420 years ago [24]. Given that other parvoviruses are prevalent in feral house mice in Australia [25] and a virus closely related to MKPV was found in the faeces of a wild scavenger (i.e., a Tasmanian devil, Sarcophilus harrisii) in Tasmania [15], it appears likely that MKPV or a related ChPV will be found in mice living wild in Australia. In contrast to Mus musculus, Pseudomys and other native mice have lived in Australia for four million years or more [26]. 
Since ChPV2 is only distantly related to the other ChPVs found in S. harrisii faeces [15], we speculate that S. harrisii ChPV2 infects rodents that form part of the diet of S. harrisii, while the other S. harrisii-associated ChPV are marsupial-adapted, because they cluster phylogenetically between avian- and mammalian-associated ChPV [15]. Our initial proteomic analysis of proteins produced by MKPV [9] was limited to polypeptides related to previously predicted ChPV ORFs. Our detection of abundant “p15”-derived peptides and C-terminal “p10”-derived peptides provides the first direct evidence that ChPVs produce accessory proteins, and do so in vivo. p10 has not been predicted by other studies, but p15 corresponds to ChPV ORF-1 which was independently predicted in silico during the course of our analyses [16]. We performed the 5’-RACE experiments to test whether the transcription initiation site might influence the production of MKPV polypeptides, but with the exception of transcript 2b, all the major MKPV transcripts initiated from the same TSS at or near nt 147 (Fig 1), so the transcription initiation site is not a major determinant of MKPV polypeptide production. The abundance of p15-derived peptides exceeded that of all other MKPV-derived peptides in infected kidneys (S2 Table), surprisingly suggesting that p15 is a major product of MKPV infection. In contrast, we failed to detect any peptides derived from NS2 or NP. Nonetheless, the conservation of splice sites required for NS2 expression in all near-complete amniote ChPV genomes (S5 Fig) combined with detection of major NS2-encoding transcripts in MKPV-infected cells (Fig 1) suggests that NS2 is likely to be a functional ChPV ORF, although it remains plausible that NP is also expressed. We produced consensus protein sequences for p15, NS2 and p10 using Muscle or T-Coffee [27]. 
As previously noted [16], p15/ORF-1 universally carries clusters of basic amino acids in the C-terminal region suggestive of nuclear localisation signals (see S5A Fig). However, searches of Pfam and SwissProt on the HMMER server (see Methods) for proteins carrying motifs similar to p15, p10 or NS2/NP did not find significant matches that could provide any other clues to function. Nonetheless, the so-far universal presence of p15 and NS2 ORFs (or an NS2-like ORF in Sus scrofa PPV7 and Sarcophilus harrisii ChPV-6 (S5 Fig)) in amniote-associated ChPVs indicates that these are likely to be accessory proteins important in the propagation of many or all ChPVs. NP and p10 seem to be less conserved. In the case of NP, this is due to variation in the position (or absence) of a start ATG codon, although NP might be translated from a non-conventional start codon in some or all instances. In the case of p10, it is due to the absence of any p10-like ORF in most ChPV sequences discovered so far (Fig 6 and S5 Fig). Comparison of complete primate CKPV and MKPV genomes (Fig 6) demonstrates that chapparvovirus ORFs and TR structures are highly conserved and distinct from the genomes of Protoparvovirus, Ambidensovirus, Erythroparvovirus, Bocaparvovirus and Dependoparvovirus (reviewed in [1]). Interestingly, the other primate ChPV sequences known so far (the simian parvo-like viruses and Mafachaviruses from macaques) cluster with Type 1 porcine ChPVs and not with Type 2 ChPVs such as CKPV. Type 1 ChPVs nonetheless encode p15 and an NS2-like ORF (this paper and [18, 28]; at the time of writing, Mafachavirus sequences were not accessible for inclusion in this paper’s analyses). Precise mapping of the major MKPV transcripts has confirmed the splice donor, splice acceptor and polyadenylation sites originally suggested by RNAseq, quantified usage of major and minor splice sites, and mapped the dominant transcription start sites to a region immediately 3’ to the left TR.
The 5’-ends of MKPV and CKPV are devoid of a consensus TATA-box (i.e. TATAWAW), but each repeat element of the left TR in both MKPV and CKPV encodes an Sp1-binding site. Furthermore, MKPV TSS1 and TSS2 align with the “BBCA+1BW” consensus initiator element (S1D Fig; “A+1” indicates the dominant TSS, highlighted in S1D Fig), for which a single mismatch outside the core CA and smearing from the dominant start nucleotide are well tolerated. Similar initiator elements associated with Sp1-binding sites are found in ~30% of human TATA-less promoters [29, 30]. The dominant five MKPV splice sites we have detected (Fig 1C) can account for all MKPV-derived proteins in infected kidneys and we think it unlikely that the remaining splice sites are required for MKPV propagation. Their presence perhaps provides a means for MKPV to rapidly evolve variant proteins, or they may be vestiges of transcriptional profiles of MKPV’s evolutionary precursors (e.g. usage of the 2588:2775 intron alters the C-terminus of NS1, NS2 and NP; see S4 Table). Production of p10 by MKPV might reflect similar adaptive processes. Detection of MKPV transcripts by qPCR and ISH now demonstrates conclusively that the MKPV strains present in laboratory colonies in Australia and in the USA propagate in kidneys in preference to the liver, intestinal tract and all other soft tissues examined (Fig 2). Furthermore, the absence of any obvious co-occurring virus in MKPV-infected kidneys [9] strongly implies, but does not prove, that MKPV propagates autonomously. We noted previously that MKPV DNA appeared to be more abundant in liver than in other non-kidney sites [9]; furthermore, MuCPV DNA was originally detected in wild mouse livers and anal swabs [10]. Those findings suggested that MKPV/MuCPV might infect liver or the intestinal tract in addition to kidneys.
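The promoter comparison earlier in this section rests on matching degenerate consensus motifs written in IUPAC nucleotide codes (TATAWAW for the TATA-box, BBCA+1BW for the initiator). The sketch below shows one way such a scan could be written in Python, allowing a limited number of mismatches but none in the core CA dinucleotide. It is an illustrative assumption, not the analysis performed for S1D Fig: the function names, the encoding of “A+1” as a plain A at index 3, and the toy sequence are all hypothetical.

```python
# Standard IUPAC nucleotide codes mapped to the bases they permit.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch_positions(window, motif):
    """Return motif indices at which the window violates the IUPAC code."""
    return [i for i, (base, code) in enumerate(zip(window, motif))
            if base not in IUPAC[code]]


def find_motif(sequence, motif, core=(), max_mismatches=0):
    """List (offset, window, n_mismatches) for every acceptable window.

    Positions listed in `core` must match exactly; up to `max_mismatches`
    mismatches are tolerated elsewhere. Hypothetical helper for illustration.
    """
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        bad = mismatch_positions(window, motif)
        if len(bad) <= max_mismatches and not any(i in core for i in bad):
            hits.append((start, window, len(bad)))
    return hits


INITIATOR = "BBCABW"  # "BBCA+1BW" with the +1 A written as a plain A (index 3)
CORE = (2, 3)         # indices of the core CA dinucleotide
TATA = "TATAWAW"

# Made-up sequence for demonstration (NOT MKPV or CKPV sequence).
toy = "GAGTCACTTGGATCAGTCC"

print(find_motif(toy, TATA))                                  # []
print(find_motif(toy, INITIATOR, core=CORE))                  # [(2, 'GTCACT', 0)]
print(find_motif(toy, INITIATOR, core=CORE, max_mismatches=1))
# [(2, 'GTCACT', 0), (11, 'ATCAGT', 1)]
```

Treating the core positions separately from the mismatch budget mirrors the statement that a single mismatch outside the core CA is tolerated; the sketch does not model smearing of the start site around the dominant nucleotide.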
Our current analysis confirmed significantly higher MKPV DNA levels in liver compared to spleen (Fig 2A; adjusted P = 0.0009), but an absence of detectable MKPV RNA in either tissue. This indicates that the liver may act as an MKPV/MuCPV sink or filter during viremia [31], perhaps as a consequence of latent infection, but it is not a site of active MKPV propagation (see Fig 2Biv). The most common source of ChPV sequences to date has been faeces [11–13, 15–18]. Our work shows that MKPV infection of mouse kidneys leads to the presence of MKPV in faeces via shedding in the urine (Fig 2B) and via ingestion [9]. Thus, in the absence of data from other tissues, detection of a ChPV sequence in faeces (the most convenient and common specimen used for metagenomics) is not evidence that a ChPV propagates in the intestinal tract. The marked kidney-tropism of MKPV, combined with an indication of kidney-tropism for DrPV-1, suggests that viruses closely related to MKPV are adapted to kidney niches in distantly-related mammalian hosts; this may include non-human primate hosts because the CKPV genome was extracted from a capuchin kidney. p10, which is encoded by both MKPV and CKPV, does not of itself confer kidney-tropism, because bat DrPV-1 virus lacks a p10 ORF (see Fig 6). Based on studies of AAV [8, 32], it is likely that ChPV VP and/or ChPV promoters determine tropism, and we noted that VP is the most conserved of all amniote ChPV proteins (Fig 5). It is therefore conceivable that VP polymorphisms present in MuCPV compared to MKPV might increase tropism for liver (see [16]). While this paper cannot dismiss that possibility, our data can explain the presence of MuCPV DNA in liver specimens without a requirement for liver tropism. MKPV infection in immune-deficient Rag1–/– mice shares clinico-pathological features with polyomavirus-associated nephropathy (PVAN), which is a significant complication in immune-suppressed kidney transplant recipients [9, 33].
Our assembly of the complete CKPV genome from the kidney DNA of a capuchin monkey increases the possibility that a pathogenic ChPV might infect human kidneys. Reasoning that urine from immune-suppressed kidney transplant patients is the most likely material in which human ChPV infection might be detected, we mined the fastq files produced by deep-sequencing the urinary DNA of 27 kidney transplant patients ([34], NCBI accession PRJEB28510), searching for ChPV sequences. However, we found none within the datasets. Indeed, we found no parvoviral sequences of any sort within the datasets, but abundant polyomavirus sequences, as originally reported [34]. This limited sample suggests that human kidney ChPV infection might not be widespread, at least not in the USA, but it might nonetheless be worthwhile to determine the occurrence of antibodies against ChPV antigens in human populations. If anti-ChPV antibodies are uncommon in humans, then recombinant parvoviral vectors packaged into ChPV capsid might be better able to evade pre-existing antibody-mediated immunity than the AAV vectors presently used in the clinic.
[1] Url: https://journals.plos.org/plospathogens/article?id=10.1371/journal.ppat.1008262 Published and (C) by PLOS One. Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.