(C) PLOS One This story was originally published by PLOS One and is unaltered.

Emerging biology of noncoding RNAs in malaria parasites [1]

['Karina Simantov', 'Department Of Microbiology', 'Molecular Genetics', 'The Kuvin Center For The Study Of Infectious', 'Tropical Diseases', 'Imric', 'The Hebrew University-Hadassah Medical School', 'Jerusalem', 'Manish Goyal', 'Ron Dzikowski']

Date: 2022-08

lncRNAs (>200 nt) are usually transcribed by RNA polymerase II and have been shown to be involved in various cellular processes such as transcription, chromosome remodeling, and protein trafficking [54,55]. lncRNAs are classified according to their position into 5 main groups: (I) sense lncRNAs; (II) intronic lncRNAs; (III) antisense lncRNAs; (IV) bidirectional lncRNAs; and (V) long intergenic ncRNAs (lincRNAs) (Fig 1A–1E) [56,57]. Since their discovery, extensive studies on the functions of lncRNAs have uncovered their crucial regulatory roles in many organisms. Renowned examples include regulation of X chromosome inactivation in female mammalian cells through the lincRNA Xist [58]; recruitment of histone-modifying enzymes to gene promoters, such as by the promoter-associated antisense lncRNA Airn in mice that regulates Igf2r gene silencing [59]; and functioning as a scaffold that interacts with multiple proteins or complexes, such as the HOTAIR lncRNA transcribed from the HOXC gene cluster [60].

Generally, ncRNAs can be divided into 2 main groups: small noncoding RNAs (sncRNAs) and lncRNAs. sncRNAs comprise a wide variety of small RNAs (approximately 18 to 200 nt), including microRNAs (miRNAs), small interfering RNAs (siRNAs), and PIWI (P-element induced wimpy testis)-interacting RNAs (piRNAs) [46]. siRNAs and miRNAs are part of the RNA interference (RNAi) process, in which they regulate the degradation of transcripts mediated by the RNA-induced silencing complex (RISC).
They target various transcripts for silencing by guiding RISC to the target mRNA through complementary base-pairing. The target mRNAs are subsequently degraded by the RISC-associated argonaute (AGO) RNase. siRNAs differ from miRNAs in their origin: siRNAs are derived from long double-stranded RNA (dsRNA), whereas miRNAs are generated from hairpin-shaped precursors [47,48]. As a result, the degree of complementarity and the subsequent targeting of mRNAs for degradation differ between the 2 sncRNAs. miRNAs have a 5′ seed region that can target multiple mRNAs through partial complementarity, whereas siRNAs regulate specific mRNAs that are fully complementary to their sequence [46–48]. Since its discovery, the RNAi machinery has been widely used to achieve inducible and reversible knockdown of specific genes in eukaryotic cells. miRNAs and components of RISC have been identified in apicomplexan parasites such as T. gondii [49,50]; however, it appears that their RNAi pathway differs from that of other eukaryotes [51]. In marked contrast, Plasmodium parasites lack most of the components of the RNAi pathway, rendering them RNAi-deficient organisms, which could explain the lack of endogenous miRNAs in the parasite [52,53].

Interestingly, the expression of numerous antisense transcripts from multiple sites across the genome was found to vary depending on the parasite stage, the gene locus, the culture conditions, and the parasite line being investigated [31,32,61,62,64,65,70,71]. Moreover, the expression timing of NATs and their neighboring mRNAs could be either inversely or positively correlated [62,65]. It would be interesting to determine whether these variable expression profiles are linked with different cellular functions and regulatory mechanisms. Altogether, these studies indicate that P. falciparum harbors a wide variety of ncRNAs; however, the function of most of its ncRNAs is yet to be discovered.
With the advance in our understanding of the regulatory nature of ncRNAs in eukaryotes, efforts to identify ncRNAs in Plasmodium spp. have greatly accelerated. However, most studies have focused on P. falciparum, leaving much to be discovered in other species. The completion of P. falciparum’s genome sequencing and assembly enabled the annotation and further characterization of coding and noncoding elements. Early transcriptomic studies using northern blots, microarrays, and serial analysis of gene expression (SAGE) revealed the widespread prevalence of stage-specific antisense transcripts throughout the Plasmodium genome [21,28,31,33,61–65]. Additionally, computational predictions and structural analyses of the noncoding genome assisted in the identification of several candidate ncRNAs and novel structured RNAs [30,31,63,66–68]. In recent years, with advances in sequencing technologies that facilitated accurate strand-specific, direct single-molecule pore-sequencing [69] as well as single-molecule real-time (SMRT) full-length sequencing, it became apparent that P. falciparum transcribes over 2,500 full-length lncRNAs, including lincRNAs, natural antisense transcripts (NATs), and sense-overlapping and sense-intronic ncRNA transcripts, as well as over 1,300 circRNAs [27,29,32,35,69] (Fig 1F and Table 1). It is important to note that the observed large number of ncRNAs in Plasmodium, an organism with a compact genome and high gene density, could be due to overlapping/fused mRNA transcripts and transcriptional byproducts. Therefore, it will be important to authenticate these transcriptomic results with functional assays.

Functionally characterized lncRNAs in Plasmodium.

Nonetheless, over the last decade, several noncoding transcripts have been characterized and implicated as regulators of key biological processes in the parasite. Prominent examples are the lncRNAs involved in the regulation of expression of PfEMP1 (P.
falciparum erythrocyte membrane protein 1) [72–74]. These variable surface proteins are the major ligands responsible for P. falciparum’s pathogenicity and its ability to evade human immunity. Immune evasion is achieved in part by cytoadherence of the infected red blood cells (iRBCs) through attachment of PfEMP1 to several endothelial receptors. Consequently, the iRBC is removed from the circulation and avoids clearance by the spleen. Thus, cytoadherence and sequestration of iRBCs are the main cause of tissue damage and of the severe pathogenicity during P. falciparum malaria [73,74]. In addition, the parasites have evolved a unique antigenic switching mechanism to avoid the antibody-mediated response against PfEMP1. This is achieved by switches in expression among a multicopy gene family named var, where each var gene encodes a different PfEMP1 variant and is expressed in a mutually exclusive manner [72–76]. Immune evasion through antigenic switching between different PfEMP1 variants depends on the parasite’s ability to tightly regulate var expression, ensuring that only a single var gene is expressed at a time, and to switch expression to a different var gene as the antibody-mediated response develops. The var gene structure includes a variable exon 1, a conserved intron with a bidirectional promoter, and a conserved exon 2 [77,78]. The bidirectional intron promoter transcribes 2 lncRNAs, which were implicated in the regulation of antigenic switching [79–83]. The first is a sense lncRNA that extends into the conserved second exon, and the second is an antisense lncRNA complementary to the 3′ end of the first exon (Fig 2A) [73]. These lncRNAs are transcribed by RNA Pol II and undergo capping but are not polyadenylated [71]. They localize to distinct perinuclear foci in the nucleus and are incorporated into the chromatin [78].
While the sense var lncRNA appears to be transcribed from all the var genes and accumulates during the late stages of the parasite’s development [77,84], the antisense lncRNA is expressed only from the single active var gene at the early stage of the parasite’s intraerythrocytic developmental cycle (IDC), when var mRNA is transcribed (Fig 2A) [73]. To date, no functional role has been assigned to the var sense lncRNA; however, its accumulation during late stages, when var genes are poised for transcription, and the fact that it is transcribed from all the var genes may imply that these transcripts could play a role in silencing, in imprinting of the var gene family as a genetic unit for coordinated regulation of mutually exclusive expression, and potentially in regulating epigenetic memory. On the other hand, the antisense var lncRNAs appear to play a role in the activation of the single var gene transcribed, though the exact mechanism of action is still not clearly understood [72,78,79,85]. It was found that the expression of specific antisense lncRNAs in trans can activate a silent var gene in a sequence- and dose-dependent manner. In addition, interfering with the antisense lncRNA of an active chromosomal var gene leads to down-regulation of its expression and alters its epigenetic imprint, which results in switching of expression to different var genes [72,86]. Interestingly, the antisense lncRNAs were also associated with the active var gene when an exonuclease termed PfRNAse II was down-regulated [87]. It has also been reported that the disruption of PfRNAse II’s function by fusing it with a destabilization domain led to the dual expression of 2 different var genes, both expressing their respective antisense lncRNAs. This observation led to the hypothesis that PfRNAse II could potentially be involved in the targeted degradation of antisense lncRNAs of a specific var subtype, and thus contribute to their regulation [87].
Fig 2. Functionally characterized lncRNAs in P. falciparum. (A) lncRNAs involved in mutually exclusive expression of var genes. Schematic representation of a var gene locus composed of a variable exon 1, a bidirectional promoter within the intron, and a conserved exon 2. The upstream var promoter engages in the transcription of the mRNA, while the intronic bidirectional promoter is involved in the production of sense and antisense lncRNA transcripts. The silent var gene transcribes sense ncRNA from the promoter within the intron (left). In the active state of the var gene, RNA polymerase II transcribes both an mRNA in the sense direction and an lncRNA in the antisense direction (right). (B) Intergenic GC-rich ncRNAs transcribed from internal var chromosomal clusters in P. falciparum. These GC-rich ncRNAs are predicted to be transcribed by RNA polymerase III. (C) Schematic representation of P. falciparum telomere and TAREs and subtelomeric gene families. (D) lncRNA involvement in sexual commitment. Sexual commitment in Plasmodium is regulated by GDV1, whose expression is antagonistically regulated by its antisense lncRNA. Once expressed, GDV1 reverses HP1-dependent silencing of AP2-G and promotes sexual commitment. GDV1, gametocyte development 1; HP1, heterochromatin protein 1; lncRNA, long ncRNA; ncRNA, noncoding RNA; TARE, telomere-associated repeat. https://doi.org/10.1371/journal.ppat.1010600.g002

In addition to these var lncRNAs transcribed by each var gene, additional ncRNAs were discovered within intergenic regions near var genes located within internal chromosomal clusters. Interestingly, these ncRNAs have a high GC content, which is uncommon in the extremely AT-rich (approximately 80%) P. falciparum genome, particularly in the intergenic regions (Fig 2B) [13,34,66,88,89].
The sequence of these GC-rich ncRNAs is relatively conserved and contains elements corresponding to an RNA polymerase III promoter, which may imply that the transcription of the GC-rich ncRNAs is mediated by RNA Pol III, unlike other var lncRNAs transcribed by RNA Pol II [34]. Furthermore, it appears that their expression is clonally variable, with different clonal parasite populations expressing different transcripts [89]. However, while 15 GC-rich ncRNAs were found in the 3D7 genome, this parasite line encodes approximately 60 var genes. Nonetheless, these GC-rich ncRNAs were found to colocalize with the expression sites of subtelomeric and internal var genes, and their overexpression was shown to cause de-repression of a subset of var genes [89]. A recent study used a CRISPR interference (CRISPRi) strategy to target the entire GC-rich repertoire by guiding dCas9 to the conserved DNA sequence found in all the transcripts. Down-regulation of the GC-rich transcripts led to a corresponding down-regulation of several multicopy gene families including var, rifin, stevor, and Pfmc-2TM [88], suggesting that the expression of the GC-rich ncRNAs is involved in the transcriptional activity of these gene families. It is still unclear whether these conserved transcripts are involved in the choice for activation of the single var gene expressed, particularly since their sequence, as well as the effect of their down-regulation, appears not to be var specific. It is possible that they play a role in maintaining the chromosomal configuration that positions active genes in subnuclear foci that enable transcriptional activity. Intriguingly, these GC-rich transcripts were also shown to act as repressors in cis, possibly by acting as insulator elements that influence the spread of heterochromatin [34]. It will be important to understand the mechanism that orchestrates these dual, and potentially conflicting, activities of transcriptional silencing and activation by these regulators.
Another subclass of lncRNAs identified in P. falciparum comprises the telomere- and subtelomere-associated lncRNAs, transcribed from the telomere-associated repetitive elements (TAREs) during the late stages of IDC (Fig 2C) [27,28,90,91]. The TAREs consist of 6 repetitive blocks (TAREs 1 to 6) that vary in their length and DNA sequence. These repetitive elements are located between the telomere and the coding region of the first subtelomeric genes, where TARE-1 is closest to the telomere end, and TARE-6 is adjacent to the first coding region [90]. The TARE-lncRNAs can be subclassified into 2 main groups. The first is an lncRNA approximately 4 kb long, derived from the region spanning TARE-3 to the telomere (TARE-3-lncRNA). The second is a longer transcript, over 6 kb long, derived from TARE-6 (TARE-6-lncRNA) (Fig 2C) [28,91]. During the early stages of IDC, these TARE-lncRNAs localize to a single perinuclear compartment (with unknown function), whereas during the late stages they localize to several foci on the nuclear periphery [91]. The sequences of the TARE-lncRNAs appear to be enriched with binding sites for various transcription factors, supporting their possible involvement in the regulation of neighboring genes. They were also postulated to be involved in the maintenance of the structural integrity of the telomere and of chromosome ends [27,28,91]. These hypotheses are supported by the structure of TARE-6-lncRNA, which comprises 21 bp repeats that form hairpin secondary structures capable of binding histones and other nuclear proteins. In addition, TARE-6-lncRNA was implicated in regulating chromosomal conformation and heterochromatin structure during the late stages of IDC [91]. A strand-specific RNA-seq study linked the expression of TARE-lncRNAs with the timing of var sterile lncRNAs, suggesting a possible joint mechanism between these 2 types of lncRNAs [27]. An additional lncRNA transcript assigned a biological function in P.
falciparum is the gdv1 lncRNA, implicated in the regulation of sexual development. Sexual commitment in P. falciparum is triggered by the master transcriptional regulator PfAP2-G (Fig 2D) [92]. In asexual blood-stage parasites, Pfap2-g is silenced by heterochromatin protein 1 (PfHP1) [93,94]. However, during sexual commitment, P. falciparum gametocyte development 1 protein (PfGDV1) displaces PfHP1 from the Pfap2-g locus [95]. Removal of PfHP1 from the locus changes the chromatin conformation to open euchromatin, which facilitates Pfap2-g transcription and sexual conversion. Interestingly, this study found that the Pfgdv1 gene transcribes an antisense lncRNA that negatively regulates the expression of Pfgdv1. Thus, the antisense Pfgdv1 lncRNA functions as an inhibitor of sexual differentiation (Fig 2D) [95].

circRNAs are additional unique forms of lncRNAs that were identified in P. falciparum. Strikingly, hundreds of previously unknown circRNAs were discovered using advanced strand-specific RNA-seq [27]. Out of the putative circRNAs identified, only a small subset was longer than 200 bp. These predictions were validated by divergent PCR across the predicted splice junction on 6 of the long circRNAs [27]. circRNAs are thought to function as competitive inhibitors by binding miRNAs like a sponge, leading to a reduction in the miRNA pool that is available to bind target mRNAs (Figs 1E and 3G) [96,97]. However, since P. falciparum does not encode miRNAs or the RNAi machinery [52,53], it seems unlikely that the circRNAs encoded by P. falciparum function as sponges for endogenous miRNAs in the parasite. Intriguingly, some of the long circRNAs contain predicted binding sites for human miRNAs, and one can speculate that these circRNAs could be involved in regulatory mechanisms of host–parasite interactions.
The presence of human erythrocytic miRNAs was reported in parasites during the IDC, and some of these miRNAs were shown to cause a reduction in parasite proliferation in culture, while others were implicated in interfering with var gene expression [98–100]. These findings, along with changes in the levels of human miRNAs found in iRBCs [101–103], suggest that some P. falciparum ncRNAs have evolved to interact with human miRNAs to manipulate their function and maintain infection. Taken together, the discovery of such a large repertoire of lncRNAs in P. falciparum highlights the shortcomings in our knowledge regarding the function of these noncoding transcripts in the molecular mechanisms that regulate gene expression and enable the parasite to thrive with such a complex cell cycle.

[1] Url: https://journals.plos.org/plospathogens/article?id=10.1371/journal.ppat.1010600

Published and (C) by PLOS One. Content appears here under this condition or license: Creative Commons - Attribution BY 4.0. via Magical.Fish Gopher News Feeds: gopher://magical.fish/1/feeds/news/plosone/