The perennial grass Imperata cylindrica (L.) P. Beauv., a common and persistent weed in many food crops such as cassava, maize, sorghum and rice, is considered a traditional and important medicinal plant in several African countries such as Uganda, Ghana and Cameroon. In Uganda, the roots of I. cylindrica have been described as a snakebite treatment. In Ghana, I. cylindrica leaf extract has been reported to help manage hypertension, while in Cameroon I. cylindrica has been reported to help manage typhoid fever.
To date, only one viral disease has been reported to affect I. cylindrica. Its causal agent was named Imperata yellow mottle virus (IYMV) because of the typical yellow mottling that appears on the I. cylindrica leaf surface. IYMV was first characterized in I. cylindrica in West Africa in 2008 and classified as a new member of the genus Sobemovirus. Like all sobemoviruses, IYMV is readily transmitted mechanically. Up to now, natural infection with IYMV has been conclusively demonstrated in Zea mays and I. cylindrica. Experimentally, the host range of the virus includes two cereals (Sorghum bicolor, Pennisetum glaucum) and three wild grasses (Rottboellia exaltata, Setaria verticillata, Brachiaria xantholeuca). In contrast to other sobemoviruses, it remains unknown whether insects such as beetles can act as vectors of IYMV, or whether the virus can be transmitted through I. cylindrica seeds.
IYMV is a positive-sense single-stranded RNA virus with particles 32 nm in diameter. Its genome is 4447 nucleotides long and comprises five ORFs. ORF1 (45 - 686 nt), located at the 5’ end of the genome, encodes a P1-like protein. P1 is involved in the cell-to-cell and systemic movement of the virus. ORF2, which consists of two overlapping ORFs, encodes the putative central polyproteins. ORF2a (713 - 2509 nt) encodes a serine protease and a viral genome-linked protein (VPg), and ORF2b (2176 - 3768 nt) encodes an RNA-dependent RNA polymerase (RdRp). ORF4 (3560 - 4381 nt) is translated from a subgenomic RNA at the 3’ end of the genome and encodes the coat protein (CP). Recently, a fifth ORF (ORFx), conserved in all sobemoviruses, was reported. This putative fifth ORF is also present in the IYMV genome, where it overlaps the 5’ end of ORF2a in the +2 reading frame (Figure 1).
Until now, only one complete IYMV genomic sequence, from the western region of Burkina Faso (West Africa), had been published. The molecular diversity of the virus is therefore not documented, and several important aspects of IYMV epidemiology, such as alternative hosts in the field, are still poorly understood. Knowledge of IYMV genetic diversity is nevertheless essential for a better description of its aetiology, pathogenicity and ecology, and for developing appropriate strategies to counteract the spread of IYMV and the disease it causes. The most common molecular marker for investigating the genetic diversity of sobemoviruses and other plant viruses is the coat protein gene. In addition, various viruses have been grouped on the basis of their coat protein gene sequences. The aim of this study was therefore to investigate the genetic variability of IYMV through molecular analysis of CP gene sequences from isolates obtained in different locations of Burkina Faso.
Figure 1. A schematic representation of the genome organization of IYMV; see text for details.
2. Materials and Methods
2.1. Survey and Sample Collection
Imperata cylindrica leaves showing symptoms of Imperata yellow mottle virus infection were collected in 10 different locations in the Hauts-Bassins region (Bama, Banzon, N’Dorola, Koloko, Tondogosso, Karangasso Sambla) and the Cascades region (Banfora, Karfiguela, Lomouroudougou and Niangoloko) of Burkina Faso (West Africa), as indicated in Figure 2. One virus sample isolated from an individual I. cylindrica plant was considered as one isolate. Infected I. cylindrica leaves were either used directly for extraction of total RNA or stored at −80 ˚C for future use.
2.2. RNA Extraction, RT-PCR Amplifications and Sequencing
Total RNA was extracted from frozen infected Imperata cylindrica leaves using the RNeasy Plant Mini Kit (Qiagen), according to the manufacturer’s instructions. Slight modifications were made to the protocol to optimize the quality and quantity of the total RNA. The quality of each RNA extraction was assessed by measuring the RNA concentration.
Reverse transcription (RT) was performed using the primer IYMV-R4438-4454, while the polymerase chain reaction (PCR) was performed using the primers IYMV-F3483-3502 and IYMV-R4385-4394 described by Koala et al. (2017). All steps and conditions, including RT and PCR, followed the protocol of Koala et al. (2017). All PCR products of the correct size were purified from 1% agarose gels using GENECLEAN Turbo columns before being sent to Genewiz (Essex, UK) for sequencing.
2.3. Recombination and Genetic Diversity Analysis
The sequence contigs obtained in this study were assembled using the SeqMan II program in DNASTAR 10.0 (DNAStar Inc., Madison, USA). The 38 sequences were then compared with the reference sequence available in GenBank under accession NC-011536 (Table 1). Multiple nucleotide sequence alignments were performed using CLUSTAL W with default parameters.
Alignments were also adjusted manually to guarantee correct reading frames. Noncoding sequences were removed before alignment.
Because frequent recombination can produce false-positive signals of positive selection in codon-specific analytical methods, recombinant sequences had to be identified and removed before the selection pressure acting on the CP gene was analyzed.
Figure 2. Geographical location of IYMV sample collection sites and symptoms on infected Imperata cylindrica in its natural habitat. (a) Map of Burkina Faso showing the South-Western area (in yellow) where sampling was done; (b) Rural provinces within the two South-Western regions where IYMV was detected and collected; (c) Typical yellow mottling of Imperata cylindrica leaves, which guided plant sampling.
Thus, possible recombination events were analyzed using the RDP, GENECONV, Bootscan, MaxChi, Chimaera, SiScan and 3Seq methods implemented in the Recombination Detection Program software package (RDP, version 4.85). The default detection thresholds were used. Only events supported by at least three different methods were retained.
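The retention rule can be illustrated with a short sketch. It assumes that "supported by three kinds of methods" means "detected by at least three distinct methods"; the event records and the helper name `retain_events` are hypothetical and are not part of RDP4 or of the data of this study.

```python
from typing import Dict, List, Set


def retain_events(events: Dict[str, Set[str]], min_methods: int = 3) -> List[str]:
    """Return the names of events detected by at least `min_methods` distinct methods."""
    return sorted(
        name for name, methods in events.items() if len(methods) >= min_methods
    )


# Hypothetical example records (illustration only, not results of this study).
example_events = {
    "event_A": {"MaxChi"},
    "event_B": {"MaxChi", "Chimaera", "SiScan"},
    "event_C": {"RDP", "GENECONV"},
}

print(retain_events(example_events))
```

Under this reading, an event flagged by only one or two methods is discarded regardless of its P-value.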
Pairwise genetic distances among nucleotide and amino acid sequences were calculated using Kimura’s two-parameter model and the Jones-Taylor-Thornton (JTT) model, respectively, as implemented in MEGA v.6.0. To evaluate variation in selection pressure during CP evolution, the direction and degree of the selective constraints operating on the coding region were assessed by the ratio between the nucleotide diversities at nonsynonymous and synonymous positions (dN/dS).
The extent of IYMV variation among these sequences was evaluated using the index π in DnaSP version 5.0, with a sliding window of 100 nt and a step size of 25 nt. The parameter π, a measure of nucleotide diversity, is the mean number of nucleotide differences per site between two sequences. Each π value was assigned to the nucleotide at the midpoint of its window.
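The sliding-window calculation can be sketched as follows. This is an illustrative approximation rather than DnaSP's implementation: it assumes a gap-free alignment of equal-length sequences, computes π as the uncorrected mean pairwise proportion of differing sites, and reports a 1-based window midpoint; the function names are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple


def nucleotide_diversity(seqs: List[str]) -> float:
    """Mean proportion of differing sites over all pairs of sequences (pi)."""
    pairs = list(combinations(seqs, 2))
    length = len(seqs[0])
    if not pairs or length == 0:
        return 0.0
    total = sum(
        sum(a != b for a, b in zip(s1, s2)) / length for s1, s2 in pairs
    )
    return total / len(pairs)


def sliding_pi(
    seqs: List[str], window: int = 100, step: int = 25
) -> List[Tuple[float, float]]:
    """Return (window midpoint, pi) for each full window along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        midpoint = start + (window + 1) / 2  # 1-based midpoint of the window
        results.append((midpoint, nucleotide_diversity(chunk)))
    return results


# Toy alignment (illustration only), scanned with a 2-nt window and 1-nt step.
toy = ["AAAA", "AAAT", "AATT"]
for midpoint, pi in sliding_pi(toy, window=2, step=1):
    print(midpoint, round(pi, 3))
```

In this sketch only full windows are evaluated, so an 822-nt alignment scanned with a 100-nt window and a 25-nt step gives 29 windows.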
2.4. Construction of Phylogenetic Trees
Table 1. IYMV isolates identified in different sub-regions of South-Western Burkina Faso (BF). The only IYMV sequence identified prior to this analysis is given as the reference accession NC-011536.
Phylogenetic relationships between isolates were inferred by the maximum-likelihood (ML) method. The best-fitting nucleotide substitution model, i.e. the one with the lowest BIC score, was determined using MEGA v.6.0. ML analyses were performed under the T92 + G + I model. Isolate CP-BF1 of Rice yellow mottle virus (RYMV; GenBank accession number AJ279901.1) was used as the outgroup for phylogenetic analysis. The robustness of the phylogenetic relationships was assessed by 1000 bootstrap replications.
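Bootstrap support of the kind used in Section 2.4 is obtained by resampling the alignment columns with replacement, rebuilding a tree from each pseudo-replicate and counting how often each clade recurs. The sketch below shows only the resampling step. The function name, the toy alignment and the random seed are hypothetical, and the tree inference performed in MEGA is not reproduced.

```python
import random
from typing import List


def bootstrap_alignment(seqs: List[str], rng: random.Random) -> List[str]:
    """Resample alignment columns with replacement, keeping each column intact."""
    length = len(seqs[0])
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in columns) for seq in seqs]


rng = random.Random(42)  # fixed seed so the sketch is reproducible
alignment = ["ACGTAC", "ACGTTC", "ACCTAC"]  # toy data, not IYMV sequences
replicates = [bootstrap_alignment(alignment, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0][0]))
```

Because whole columns are resampled, the homology between sequences is preserved in every replicate. A bootstrap percentage is then the share of replicate trees in which a given clade appears.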
3. Results
3.1. Imperata cylindrica Harvest Campaigns Identify Up to 38 IYMV Isolates in South-Western Burkina Faso
As part of a 3-year harvest campaign in distinct areas of South-Western Burkina Faso (Figure 2), a total of 38 samples of I. cylindrica leaves were analyzed for the detection of Imperata yellow mottle virus.
As expected, RT-PCR on total RNA from infected plant material amplified DNA fragments of about 1000 bp for all samples listed in Table 1. Representative PCR amplifications from different plants are shown in Figure 3.
3.2. Recombination Analysis
In total, four potential recombination events (PREs), named PRE_iymv34-BF, PRE_NC-011536, PRE_iymv2-BF and PRE_iymv9-BF, were each detected by at least one of the methods (Figure 4). PRE_iymv34-BF appeared to result from recombination between NC-011536 as the major parent and iymv38-BF as the minor parent. PRE_NC-011536 appeared to result from recombination between iymv29-BF as the major parent and iymv2-BF as the minor parent. PRE_iymv2-BF showed recombination between iymv29-BF as the major parent and Unknown (iymv25-BF) as the minor parent. These three PREs (iymv34-BF, NC-011536 and iymv2-BF) were detected by the MaxChi method with average P-values of 2.760 × 10⁻⁵, 1.068 × 10⁻² and 4.298 × 10⁻², respectively. PRE_iymv34-BF was also detected as recombination between iymv37-BF as the major parent and iymv38-BF as the minor parent by the Chimaera and SiScan methods, with average P-values of 1.393 × 10⁻² and 1.393 × 10⁻³, respectively. Finally, PRE_iymv9-BF showed recombination between iymv38-BF as the major parent and Unknown (NC-011536) as the minor parent. This recombination event was detected by the Chimaera method with an average P-value > 1.0.
Figure 3. RT-PCR-based molecular diagnosis of IYMV in I. cylindrica in Burkina Faso. Lane M: 1 kb DNA size standard. Lanes 1 to 5: IYMV-infected leaves from five individual plants. Lane +: PCR product from the reference accession sample NC-011536 of Imperata yellow mottle virus (IYMV). Lane −: RT-PCR control performed without plant RNA.
Figure 4. Description of potential recombination events. PRE_iymv34-BF, PRE_NC-011536, PRE_iymv2-BF and PRE_iymv9-BF are the Potential Recombinant Events (PREs). See text for details of major and minor parents.
These four potential recombinants were each detected by only one or two methods of the RDP program, with a low degree of confidence. In addition, one of the parental isolates was often unknown. Based on the retention criteria, these potential recombination events were not accepted. No evidence of recombination was found among the other isolates using RDP4.
3.3. Sequence Analysis
The average genetic diversity among the 39 sequences listed in Table 1 was 4.6% at the nucleotide level, with a peak (7.6%) of nucleotide substitutions per site in the 5’ half of the gene, which encodes the N-terminal region of the protein (Figure 5). The average number of nucleotide substitutions per synonymous site was high (πs = 0.164) and about 18 times higher than the nonsynonymous diversity (πa = 0.009), i.e. an ω ratio (πa/πs) of 0.07. The maximum nonsynonymous and synonymous diversity between any two sequences was 2.1% and 25%, respectively. As ω < 1, the CP sequences appear to be under strong purifying selective constraints. The Z test was highly significant (P < 0.001), confirming that the CP gene of the BF isolates has diversified under strong purifying selection. Fisher’s codon-based exact test, as implemented in MEGA v.6.0, gave no evidence of positive selection (data not shown).
The 39 IYMV sequences were 822 nt long, encoding 273 amino acids. The 273 aa residues were dominated by hydrophobic amino acids.
Figure 5. Distribution of IYMV genetic variation estimated by nucleotide diversity (π). The sliding window was 100 sites wide, with a step of 25 sites.
Analysis of the polymorphic sites among the sequences of the Burkina Faso isolates revealed 136 variable sites at the nucleotide level and 24 at the amino acid level. Indeed, 13% and 10% of amino acid changes resulted from mutations at the 1st and 2nd nucleotide positions of codons, respectively. We also noted that the conserved amino acid sequence of the IYMV CP exhibits several features common to sobemoviruses. The N-terminal region is rich in basic amino acids and contains an arginine-rich region predicted to encode a nuclear localization signal and to be essential for encapsidation. Consistent with the two common features of a bipartite signal, the first two basic amino acids, an arginine and a lysine, were detected in the majority of isolates, and the consensus bipartite targeting motif RKSKKMT13QAAAVKNQQL23APSRR was detected at position 721.
In addition, the amino acids (arginine, lysine, proline and glutamine) located in the N-terminal region (16) and responsible for coat protein contacts with the RNA were observed in clade 1. These amino acids were also observed in clades 2 to 6, except proline, which was replaced by threonine and lysine at positions 13 and 23, respectively. The amino acids predicted to be involved in Ca²⁺ binding (two aspartic acid residues [D139, D142], one valine [V197] and one asparagine [N252]) were conserved in all isolates.
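The polymorphic-site counts reported in this section can be obtained by scanning the alignment column by column. The sketch below is a minimal illustration on toy data. The function names are hypothetical, gap characters are treated like any other symbol, and the alignment is assumed to start in reading frame at the first codon position.

```python
from typing import Dict, List


def variable_sites(seqs: List[str]) -> List[int]:
    """Return 1-based positions of alignment columns with more than one state."""
    return [
        i + 1 for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1
    ]


def count_by_codon_position(positions: List[int]) -> Dict[int, int]:
    """Count variable nucleotide sites falling at codon positions 1, 2 and 3."""
    counts = {1: 0, 2: 0, 3: 0}
    for pos in positions:
        counts[(pos - 1) % 3 + 1] += 1
    return counts


# Toy in-frame alignment (illustration only, not IYMV sequences).
toy_cds = ["ATGAAA", "ATGAAG", "ATGCAA"]
sites = variable_sites(toy_cds)
print(sites, count_by_codon_position(sites))
```

Applying `variable_sites` to the translated alignment in the same way gives the number of variable amino acid positions.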
3.4. Phylogenetic Analysis
A total of 39 CP gene sequences were analyzed. The phylogenetic relationships among the sequences were inferred using the maximum-likelihood method (Figure 6). The 39 CP nt sequences segregated into six clades.
Clade I comprised five isolates from Tondogosso and one each from Bama, Banfora and Banzon. Clade II included one isolate each from Karangasso Sambla, Koloko, N’Dorola and Tondogosso. Clade III included three isolates from Banfora and one each from N’Dorola and Koloko. Clade IV included two isolates each from Bama, Karfiguela and Lomouroudougou, and one from Niangoloko. Clade V included two isolates each from N’Dorola, Banzon and Lomouroudougou, and one each from Banfora and Karfiguela. Clade VI included three isolates from Banfora and one each from Banzon, N’Dorola and Karfiguela. However, Clade I and Clade II are only poorly supported (bootstrap values of 33% and 26%, respectively; Figure 6).
Figure 6. Phylogenetic analysis of 39 IYMV isolates from the western region of Burkina Faso based on the CP gene. Isolate CP_BF1 of RYMV from Burkina Faso was used as the outgroup. Numbers below the branches are bootstrap percentages. The scale bar indicates a genetic distance of 0.1.
4. Discussion
To assess the genetic diversity of IYMV, we compared the CP sequences of 39 isolates from different areas of Burkina Faso. Analysis of the CP nucleotide sequences revealed that the overall genetic diversity (4.6%) was low, in line with the low CP sequence diversity (3% - 10%) reported for other RNA viruses. Despite this low genetic diversity, phylogenetic analysis showed that the Burkina Faso isolates diverged into six clades (Clade I to Clade VI). However, Clades I and II are poorly supported (bootstrap values of 33% and 26%, respectively; Figure 6). The distribution of IYMV isolates according to geographic origin indicates that isolates collected in the same locality can belong to different clades, whereas isolates from distant areas can cluster in the same clade. Similarly, at the amino acid level, isolates collected from distant areas fell in the same clade. These results suggest a lack of correlation between genetic diversity and the geographic distribution of IYMV isolates. Similar results were reported for Tobacco mild green mosaic virus (TMGMV) and Citrus tristeza virus (CTV), which infect the perennial hosts Nicotiana glauca and citrus species, respectively. Because Imperata cylindrica is a perennial, non-food grass designated as a noxious weed in agricultural and non-agricultural land in West Africa, its propagation material is not exchanged. Spread via rhizomes is the main mechanism of dispersal of Imperata cylindrica, although some studies indicate spread by seed dispersal. Therefore, Imperata cylindrica cannot spread over very long distances. The sequence heterogeneity among IYMV isolates could be explained by the great potential for genetic variation recently reported in Imperata cylindrica. Indeed, perennial grasses survive for a long time in nature and adapt to different environmental conditions, which sometimes leads to the development of new ecotypes.
This is also true for viruses that infect perennial plants, which must maintain themselves and adapt to new environmental conditions. Indeed, during adaptation to new conditions, virus multiplication is accompanied by various mutations owing to the lack of a repair process associated with their RNA-dependent RNA polymerase. In addition, it has been reported that purifying selection often results in amino acid changes with functional or structural consequences for genome protection, cell-to-cell movement, transmission between plants, and interactions with the host and/or vector.
Although IYMV CP sequences are under strong purifying selective constraints, our analysis revealed that the six-clade phylogenetic structure was associated with amino acid changes, particularly in the R-domain region of the coat protein (residues 1 - 66) (Table 2). These amino acid substitutions are substantial, as shown by the marked changes in physicochemical properties: P13T and P28L (P is hydrophilic; T and L are hydrophobic, but T is polar); G65A (G and A are hydrophobic, but G is polar and uncharged); A66S (A is hydrophobic and S is polar); and N268D (N and D are hydrophilic, but N is polar).
Among these amino acid substitutions in the R-domain region, we noted in particular two changes in which proline was replaced by threonine at position 13 and by leucine at position 18. It is well established that the exceptional conformational rigidity of proline affects protein secondary structure, which suggests that these substitutions strongly alter the properties of the coat protein N-terminus.
Interestingly, these changes occurred in the bipartite sequence (RKSKKMTQAAAVKNQQLAPSRR) of the IYMV CP and made it possible to distinguish clade I from the other clades.
Table 2. Multiple alignment of CP amino acid sequences of IYMV Burkina Faso isolates from different localities. The consensus (majority) sequence obtained with the CLUSTAL W algorithm is shown above the alignment. Amino acids identical to the consensus are indicated by dots within the alignment.
C I: Clade I; C II: Clade II; C III: Clade III; C IV: Clade IV; C V: Clade V; C VI: Clade VI.
Several authors have shown that the bipartite targeting sequence plays an essential role in addressing the CP to the nucleus. However, although the basic residues R and K have similar properties, we cannot say whether the substitutions at different positions in the bipartite targeting sequence have structural consequences for RNA encapsidation, the stability of viral particles or other unknown properties of the CP during the biological cycle of IYMV. It remains to be determined whether these amino acid substitutions define biologically distinct strains.
This is the first study of the genetic diversity of IYMV in Burkina Faso, and we believe it will contribute to a better understanding of IYMV evolution and epidemiology in the country. In addition, diagnosis using the specific primers will be a useful tool for studying the population structure of IYMV in Burkina Faso. Although these results are a prerequisite for the future management of Imperata yellow mottle disease, it would be interesting to study the genetic diversity of IYMV in neighboring countries such as Mali and Benin, where the presence of IYMV has been suspected (data not shown). As the overall diversity of IYMV is low, it would also be interesting to obtain the complete sequences of other viral proteins from isolates representative of the six clades. In this context, it will be particularly interesting to sequence the P1 protein, as P1 has been shown to display the highest diversity in the Rice yellow mottle virus (RYMV) genome, and the VPg protein, which is the major determinant of resistance breaking in RYMV.
Acknowledgements
Financial support for this study was provided in part by the Mixed International Laboratory LMI Patho-Bios (www.pathobios.com), by the PROVEG program (Proveg: a network on plant protection through the Program to support network-based research in Africa, PARRAF, http://proveg.org/accueil) and by the International Foundation for Science (IFS) through fellowship N˚ C/5358-1 to Moustapha KOALA. M.K. also received financial support from project n˚1102004, funded by the International Agropolis Foundation, for his education and training in molecular cloning at the IRD Institute, France. We thank Nils Poulicard for helpful discussions.
References
 Mak-Mensah, E.E., Komlaga, G. and Terlabi, E.O. (2010) Antihypertensive Action of Ethanolic Extract of Imperata Cylindrica Leaves in Animal Models. Journal of Medicinal Plants Research, 4, 1486-1491.
 Sérémé, D., Lacombe, S., Konaté, M., Pinel-Galzi, A., Traoré, V.S.E., Hébrard, E., Traoré, O., Brugidou, C., Fargette, D. and Konaté, G. (2008) Biological and Molecular Characterization of a Putative New Sobemovirus Infecting Imperata Cylindrica and Maize in Africa. Archives of Virology, 153, 1813-1820.
 Koala, M., Traoré, V.S.E., Sérémé, D., Neya, B.J., Brugidou, C., Barro, N. and Traoré, O. (2017) Imperata Yellow Mottle Virus an Emerging Threat to Maize, Sorghum and Pearl Millet in Burkina Faso. Agricultural Sciences, 8, 397-408.
 Bonneau, C., Brugidou, C., Chen, L., Beachy, R.N. and Fauquet, C. (1998) Expression of the Rice Yellow Mottle Virus P1 Protein in Vitro and in Vivo and Its Involvement in Virus Spread. Virology, 244, 79-86.
 Vigne, E., Bergdoll, M., Guyader, S. and Fuchs, M. (2004) Population Structure and Genetic Variability within Isolates of Grapevine Fanleaf Virus from a Naturally Infected Vineyard in France: Evidence for Mixed Infection and Recombination. Journal of General Virology, 85, 2435-2445.
 Cuevas, J.M., Delaunay, A., Rupar, M., Jacquot, E. and Elena, S.F. (2012) Molecular Evolution and Phylogeography of Potato Virus Y Based on the CP Gene. Journal of General Virology, 93, 2496-2501.
 Verma, N., Mahinghara, B.K., Ram, R. and Zaidi, A.A. (2006) Coat Protein Sequence Shows That Cucumber Mosaic Virus Isolate from Geraniums (Pelargonium spp.) Belongs to Subgroup II. Journal of Biosciences, 31, 47-54.
 Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: Improving the Sensitivity of Progressive Multiple Sequence Alignment through Sequence Weighting, Position-Specific Gap Penalties and Weight Matrix Choice. Nucleic Acids Research, 22, 4673-4680.
 Martin, D.P., Posada, D., Crandall, K.A. and Williamson, C. (2005) A Modified Bootscan Algorithm for Automated Identification of Recombinant Sequences and Recombination Breakpoints. AIDS Research and Human Retroviruses, 21, 98-102.
 Posada, D. and Crandall, K.A. (2001) Evaluation of Methods for Detecting Recombination from DNA Sequences: Computer Simulations. Proceedings of the National Academy of Sciences, 98, 13757-13762.
 Gibbs, M.J., Armstrong, J.S. and Gibbs, A.J. (2000) Sister-Scanning: A Monte Carlo Procedure for Assessing Signals in Recombinant Sequences. Bioinformatics, 16, 573-582.
 Kimura, M. (1980) A Simple Method for Estimating Evolutionary Rates of Base Substitutions through Comparative Studies of Nucleotide Sequences. Journal of Molecular Evolution, 16, 111-120.
 Tamura, K., Stecher, G., Peterson, D., Filipski, A. and Kumar, S. (2013) MEGA6: Molecular Evolutionary Genetics Analysis Version 6.0. Molecular Biology and Evolution, 30, 2725-2729.
 Garcia-Arenal, F., Fraile, A. and Malpica, J.M. (2001) Variability and Genetic Structure of Plant Virus Populations. Annual Review of Phytopathology, 39, 157-186.
 Rubio, L., Ayllon, M.A., Kong, P., Fernandez, A., Polek, M., Guerri, J., Moreno, P. and Falk, B.W. (2001) Genetic Variation of Citrus Tristeza Virus Isolates from California and Spain: Evidence for Mixed Infections and Recombination. Journal of Virology, 75, 8054-8062.
 Fraile, A., Malpica, J.M., Aranda, M.A., Rodriguez-Cerezo, E. and Garcia-Arenal, F. (1996) Genetic Diversity in Tobacco Mild Green Mosaic Tobamovirus Infecting the Wild Plant Nicotiana glauca. Virology, 223, 148-155.
 Yager, L.Y., Miller, D.L. and Jones, J. (2011) Woody Shrubs as a Barrier to Invasion by Cogongrass (Imperata cylindrica). Invasive Plant Science and Management, 4, 207-211.
 Chiang, Y.C., Tsai, C.C., Hsu, T.W. and Chou, C.H. (2012) Characterization of 21 Microsatellite Markers from Cogongrass, Imperata cylindrica (Poaceae), a Weed Species Distributed Worldwide. American Journal of Botany, 99, e428-e430.
 Sérémé, D., Lacombe, S., Konaté, M., Bangratz, M., Pinel-Galzi, A., Fargette, D., Traoré, A.S., Konaté, G. and Brugidou, C. (2014) Sites under Positive Selection Modulate the RNA Silencing Suppressor Activity of Rice Yellow Mottle Virus Movement Protein P1. Journal of General Virology, 95, 213-218.
 Traoré, O., Pinel-Galzi, A., Issaka, S., Poulicard, N., Aribi, J., Aké, S., Ghesquière, A., Séré, Y., Konaté, G., Hébrard, E. and Fargette, D. (2010) The Adaptation of Rice Yellow Mottle Virus to the eIF(iso)4G-Mediated Rice Resistance. Virology, 408, 103-108.