Bambara groundnut is one of the most economically important legume crops in Burkina Faso. In addition to being a source of income for farmers, it is an important source of energy (387 kcal/100 g) that can contribute to the fight against malnutrition and undernourishment. Its high protein content, over 19%, makes it a food that can replace animal protein in poor populations. However, its production faces enormous constraints, among which viral diseases are some of the most important. The main viruses reported globally on the plant include Cowpea mild mottle virus (CPMMV, carlavirus), Voandzeia mosaic necrosis virus (VMNV, tymovirus), Cowpea mottle virus (CPMoV, carmovirus), Southern bean mosaic virus (SBMV, sobemovirus), Cucumber mosaic virus (CuMV, cucumovirus) as well as a number of potyviruses, namely Cowpea aphid-borne mosaic virus (CABMV), Bean common mosaic virus blackeye strain (BCMV-BlCM), Voandzeia distortion mosaic virus (VDMV) and Peanut mottle potyvirus (PnMV). Viruses of the genus Potyvirus are major factors in the reduction of Bambara groundnut production; yield losses ranging between 13% and 100% have been reported in cowpea. Potyviruses are mostly transmitted by aphid vectors and through virus-contaminated seeds, which makes their control difficult. In Burkina Faso, some studies have characterized CABMV and BCMV-BlCM in Bambara groundnut. However, very few virus isolates were included in these studies, which limited their value for assessing potyvirus diversity. The objective of this study was to identify the potyviruses infecting Bambara groundnut in Burkina Faso and to characterize their molecular diversity based on the analysis of complete coat protein (cp) sequences.
2. Material and Methods
2.1. Sample Collection
During the August to October growing seasons of 2016 to 2018, 135 symptomatic Bambara groundnut leaf samples were collected from farmers’ fields across the three agroclimatic zones of Burkina Faso. The agroclimatic zones were the Sudan zone, with annual rainfall of 900 - 1100 mm, the Sudan-Sahel zone (600 - 900 mm) and the Sahel zone (less than 600 mm). In order to cover virus diversity, sampling sites were separated by at least 3 km. One hundred samples were collected in the Sudan zone and the Sudan-Sahel zone (50 samples per zone), whereas 35 samples were from the Sahel zone. Leaf samples were kept on ice during transfer to the laboratory, where they were stored at −80˚C until molecular analyses.
2.2. Total RNA Extraction
Total RNA was extracted individually from the 135 samples with Trizol (Invitrogen) according to a previously described protocol. Before the extraction, the mortars were sterilized for 2 hours at 180˚C and then kept at −80˚C until use. Briefly, RNA extraction was done according to the following steps: first, crushed leaves were transferred into 2 ml Eppendorf tubes and 1 ml of Trizol was added to release the cellular content; secondly, 200 µl of chloroform were added. The tubes were kept on ice for 5 min before centrifugation at 14,000 rpm for 15 min at 4˚C. RNA was precipitated by adding 550 µl of cold isopropanol and keeping the tubes at −20˚C for 30 min. Finally, the tubes were centrifuged at 12,000 rpm for 10 min and the pellets were washed with 75% ethanol. Total RNA was suspended in 30 µl of sterile water. RNA quality was checked and concentration measured with a NanoDrop spectrophotometer before storage at −20˚C.
2.3. Potyvirus Detection by RT-PCR
Detection of potyviruses was carried out in an RT-PCR assay using the universal potyvirus primers P077-F: ATGGTHTGGTGYATHGARAAYGG and P078-R: CARATGAARGCMGCAGCA. Viral cDNAs were obtained from the extracted RNAs using a reverse oligo(dT) primer and the enzyme M-MLV-RT (200 U) (Promega, USA) according to the supplier’s recommendations. Reverse transcription was done in a total volume of 25 µl containing 1 µl of total RNA, 1 µl of 10 mM oligo(dT), 5 µl of M-MLV RT 5X Buffer, 1 µl of 10 mM dNTP, 200 U of M-MLV-RT and sterile water.
PCR amplification was done using the Solis BioDyne PCR kit according to the supplier’s instructions. A reaction volume of 25 µl containing 2 µl of cDNA, 2.5 µl of 10x PCR buffer, 0.5 µl of 10 mM dNTP, 2.5 µl of 25 mM MgCl2, 0.5 µl of each primer P077-F and P078-R (10 µM) and 0.5 U of FIREPol DNA polymerase was subjected to the following cycling conditions: 95˚C for 5 min, followed by 35 cycles of 94˚C for 1 min, 55˚C for 30 sec and 72˚C for 1 min, and a final elongation step at 72˚C for 10 min. The amplified fragments were separated by electrophoresis on 1% (w/v) agarose gels containing ethidium bromide and visualized under UV light to verify their size. Fragments of the expected size (327 bp) were sequenced and the sequences were compared with the GenBank (NCBI) database using BLAST.
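The universal primers above contain IUPAC ambiguity codes (H, Y, R, M), so each name actually denotes a pool of oligonucleotides. The following minimal Python sketch counts and enumerates the variants in each pool. Only the two primer sequences come from the text; the function names `degeneracy` and `expand` are illustrative.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n

def expand(primer):
    """List every non-degenerate sequence encoded by the primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences as given in the text.
P077_F = "ATGGTHTGGTGYATHGARAAYGG"
P078_R = "CARATGAARGCMGCAGCA"

for name, seq in (("P077-F", P077_F), ("P078-R", P078_R)):
    print(name, "length:", len(seq), "degeneracy:", degeneracy(seq))
# P077-F length: 23 degeneracy: 72
# P078-R length: 18 degeneracy: 8
```

The forward primer pool is therefore nine times more complex than the reverse pool. This is worth keeping in mind when interpreting primer concentrations, since each individual variant is present at only a fraction of the nominal concentration.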
2.4. Potyvirus Complete Coat Protein Gene Sequencing
Samples positive for potyvirus detection were used for amplification of the complete cp gene. For this purpose, independent PCRs were performed using the primer combinations P105-F/P078-R and P077-F/P106-R (Table 1) as previously described. Fragments of the expected size were sequenced, assembled and compared with the GenBank (NCBI) database using BLAST. New primers were designed (Table 1) based on the sequences obtained, taking into account closely related potyvirus sequences in the GenBank database. Independent PCR amplifications were performed as described above using the newly designed primers and the conditions indicated in Table 1. Amplicons of the expected sizes were sequenced directly using the respective primer pairs.
Table 1. List of primers used for the RT-PCR amplification and sequencing of the capsid protein gene.
2.5. Sequence Analysis
All sequence contigs were edited and assembled using DNAMAN software to generate complete CP sequences. The cp sequences were compared to GenBank database sequences using NCBI BLASTn (BLAST, http://www.ncbi.nlm.nih.gov/blast). Nucleotide and protein sequence identities between query and retrieved sequences were determined using SDT v1 software. Relative synonymous codon usage (RSCU) was determined using Seqinr.
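As an illustration of what a pairwise identity score represents, the sketch below computes percent identity for two sequences that have already been aligned. It is not the SDT implementation. Skipping gapped columns and dividing by the number of compared positions is an assumed convention, and the two toy sequences are invented for the example.

```python
def pairwise_identity(aln_a, aln_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Assumed convention: columns where either sequence has a gap are
    skipped, and identity = 100 * matches / compared columns.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aln_a.upper(), aln_b.upper()):
        if x == gap or y == gap:
            continue  # gapped column: not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Hypothetical toy alignment (not real CP data).
seq_a = "ATGGCT-AAC"
seq_b = "ATGACTGAAT"
print(round(pairwise_identity(seq_a, seq_b), 1))  # 7 matches over 9 columns
```

Applying such a function to every pair of aligned sequences gives the kind of identity matrix from which within-group and between-group ranges are read.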
Phylogenetic trees were reconstructed from complete CP sequences using the MEGA 6 software. After Clustal alignment, the Maximum Likelihood (ML) method was applied for tree reconstruction using the TN93 + G + I nucleotide substitution model and the JTT + G + I amino acid substitution model, with 1000 bootstrap replications. The amino acid composition and the nucleotide diversity of the different groups of potyviruses identified in this study were determined with the MEGA 6 software.
3. Results
3.1. Potyvirus Detection in Bambara Groundnut
As illustrated in Figure 1, amplicons of the expected size (327 bp) were obtained in RT-PCR tests for potyvirus detection in Bambara groundnut using potyvirus universal primers. The universal potyvirus primers, combined with BLAST analyses of the amplicon sequences, detected potyviruses in 36 of the 135 samples analyzed, indicating a virus prevalence of about 26.7%. Twenty of the positive samples (20/50) originated from the Sudan zone (40% prevalence) and sixteen (16/50) from the Sudan-Sahel zone (32% prevalence), while no positive samples were detected in the Sahel zone. This distribution makes the Sudan zone the most affected by Bambara groundnut potyviruses in the country.
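The prevalence figures follow directly from the sample counts. The short Python sketch below recomputes them per zone and overall; the counts are those given in the text, and the variable and function names are illustrative.

```python
# (positive samples, samples collected) per agroclimatic zone, from the text.
counts = {
    "Sudan": (20, 50),
    "Sudan-Sahel": (16, 50),
    "Sahel": (0, 35),
}

def prevalence(positive, total):
    """Prevalence as a percentage of samples testing positive."""
    return 100.0 * positive / total

for zone, (positive, total) in counts.items():
    print(f"{zone}: {positive}/{total} = {prevalence(positive, total):.1f}%")

all_positive = sum(p for p, _ in counts.values())
all_samples = sum(t for _, t in counts.values())
print(f"Overall: {all_positive}/{all_samples} = "
      f"{prevalence(all_positive, all_samples):.1f}%")
# Overall: 36/135 = 26.7%
```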
3.2. Nucleotide Diversity in Potyviruses Coat Protein Sequences
Complete CP sequences were obtained from 24 of the 36 samples positive for potyvirus detection. Full CP sequences could not be obtained from the remaining isolates due to the insufficient quality of the corresponding sequencing data. The 24 full CP sequences were submitted to the GenBank database (Table 2).
Figure 1. RT-PCR-amplified potyvirus products fractionated using 1% agarose gel electrophoresis and ethidium bromide staining. EB: Sample Code; M: 100 bp DNA size marker; T+ and T−: positive and negative controls, respectively.
Table 2. Samples of Bambara groundnut potyviruses with the corresponding accession numbers in GenBank.
Nucleotide and amino acid analysis of the coat protein sequences revealed high virus diversity. As shown in Table 3, three groups of potyvirus sequences, designated group 1 (7 isolates), group 2 (4 isolates) and group 3 (13 isolates), were differentiated. Within-group nucleotide identity was 94.6% - 100%, 93.7% - 100% and 94.2% - 100% in group 1, group 2 and group 3, respectively. Amino acid identity within each group was higher, reaching 97.8% - 100% in group 1, 97.7% - 100% in group 2 and 97.8% - 100% in group 3. Percentages of nucleotide and amino acid identity were markedly lower between groups. At the nucleotide level, isolates in group 1 shared roughly 72% - 74% identity with isolates of group 2 or group 3. Nucleotide identity between group 2 and group 3 ranged between 75.8% and 77.7%. Between-group comparisons at the amino acid level showed similar trends.
Isolates in the current study were also compared to the closest virus isolates from the GenBank database. As presented in Table 3, isolates in group 1 shared high sequence identity only with CABMV isolates at both nucleotide (94.6% - 100%) and amino acid (97.8% - 100%) levels. Their nucleotide or amino acid identity with any other closest virus was less than 77%. Isolates in group 2 showed only 70.6% to 77.7% nt identity and less than 80% aa identity with any closest virus isolate outside the group. The highest nucleotide identity (76.9% - 77.1%) and amino acid identity (78.4% - 79.9%) were found with Bean common mosaic necrosis virus (BCMNV). Moreover, the CP nucleotide sequence of group 2 isolates was 780 bp long, similar to that of BCMNV (786 bp). In the two other groups it was 828 bp long, a difference of 48 bp. In group 3 isolates, the highest nucleotide identity (77.3% - 78.3%) and amino acid identity (80.7% - 81.5%) were found with Ugandan Passiflora virus [J896003].
Table 3. Capsid protein nucleotide and amino-acid identity between Bambara groundnut potyvirus groups.
*CABMV: Cowpea aphid-borne mosaic virus; BCMNV: Bean common mosaic necrosis virus; BCMV: Bean common mosaic virus; UPV: Ugandan passiflora virus; PaChV: Passiflora chlorosis virus. Group 1 (MK987189; MK987190; MK987191; MK987192; MK987193; MK987194; MK987195); Group 2 (MK987196; MK987197; MK987198; MK987199); Group 3 (MK987200; MK987201; MK987202; MK987203; MK987204; MK987205; MK987206; MK987207; MK987208; MK987209; MK987210; MK987211; MK987212).
3.3. Coat Protein Features and Relative Synonymous Codon Usage
All twenty amino acids were found in the three groups, but CP sequences of group 2 isolates consisted of 259 aa, which was 16 amino acids fewer than in the other two groups (275 aa). All CP amino acid sequences displayed the N-terminal DAG motif, which is conserved in most potyviruses.
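The reported nucleotide and amino acid lengths can be cross-checked with simple arithmetic. The sketch below assumes that the nucleotide lengths (780 and 828) include the terminal stop codon. The text does not state this, but it is the reading under which both sets of numbers agree.

```python
def codon_count(nt_length):
    """Number of codons in a coding sequence of the given length."""
    if nt_length % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return nt_length // 3

# CP nucleotide lengths reported in the text for each group.
cp_nt_length = {"group 1": 828, "group 2": 780, "group 3": 828}

for group, nt in cp_nt_length.items():
    codons = codon_count(nt)
    # Assumption: the last codon is the stop codon and is not translated.
    print(f"{group}: {nt} nt = {codons} codons -> {codons - 1} aa")

# The 48-nt difference corresponds to 16 codons, i.e. 16 amino acids.
print((828 - 780) // 3)  # 16
```

Under this assumption the 780-nt sequences encode 259 aa and the 828-nt sequences encode 275 aa, matching the protein lengths given above.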
Table 4 summarizes the codon usage bias in the three potyvirus groups. Codon usage bias among the potyvirus groups was apparent in all amino acids encoded by more than one codon. The three groups shared the same preferred codons for some amino acids. However, in some cases, additional preferred codons discriminated the three groups. For example, for leucine the three groups have different codon preferences: group 1 prefers CUU, group 2 UUG and group 3 CUG. However, for most amino acids groups 2 and 3 have the same preferences. Group 1 also differs from group 2 and group 3 in preferentially using GGG for glycine and UCU for serine. Another distinction between the potyvirus groups was evident in the preferred codon usage for asparagine (AAC in group 1 versus AAU in groups 2 and 3) and for histidine (CAC in groups 1 and 2 versus CAU in group 3). The clearest divergences between the potyvirus groups were observed where a group did not use specific codons for certain amino acids at all. This was exemplified by codon UGU (Cys) in group 1, codons UGC (Cys) and CGC (Arg) in group 2 and codons CGA (Arg) and UCC (Ser) in group 3.
Table 4. Relative synonymous codon usage in the Bambara groundnut potyvirus groups.
aRSCU: relative synonymous codon usage in Bambara groundnut potyvirus group 1 (G1), group 2 (G2) and group 3 (G3). The most preferred codons are in boldface.
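For readers unfamiliar with the statistic, RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. A value above 1 indicates a preferred codon and a value of 0 an unused one. The following minimal Python sketch implements this definition with the standard genetic code. It is an illustration rather than the Seqinr routine used in the study, and the toy sequence is invented. Codons are keyed in DNA notation (T instead of U).

```python
from collections import Counter, defaultdict

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def rscu(cds):
    """RSCU per codon: count * (family size) / (total count in family).

    Stop codons, single-codon amino acids (Met, Trp) and amino acids
    absent from the sequence are left out of the result.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if total == 0 or len(family) == 1:
            continue
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values

# Hypothetical toy sequence: four Leu codons (CTT x2, CTG, TTG), AAC, AAT.
toy = "CTTCTTCTGTTGAACAAT"
result = rscu(toy)
print(result["CTT"], result["CTG"], result["CTA"])  # 3.0 1.5 0.0
```

In practice the codon counts of all isolates in a group would be pooled before computing RSCU, so that each group yields one value per codon as in Table 4.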
3.4. Phylogenetic Diversity of Bambara Groundnut Potyviruses
Phylogenetic analysis performed on the nucleotide and protein sequences of the 24 isolates, along with homologous sequences retrieved from GenBank, clearly distinguished clades that corresponded to the three groups of potyviruses described previously (Figure 2). In both nucleotide- and protein-based phylogenetic trees, group 1 isolates clustered with CABMV isolates only. Notably, this CABMV group also included two isolates (MF277031 and MF277034) recently characterized from Bambara groundnut in Burkina Faso. Group 2 and group 3 formed two new clusters which did not include any already known virus isolate. Therefore, group 2 and group 3 were referred to as Bambara groundnut potyvirus 1 (BGPV1) and Bambara groundnut potyvirus 2 (BGPV2), respectively.
Figure 2. Phylogenetic trees reconstructed using nucleotide and amino acid sequences of the coat protein. The Maximum Likelihood method was applied according to the Tamura-Nei model for the nucleotide tree (A) and the Jones-Taylor-Thornton model for the amino acid tree (B). The sequences characterized in this study are shown in color and the corresponding GenBank accession numbers are indicated by the letters MK followed by six digits. PFWV: Passion fruit woodiness virus; BCMNV: Bean common mosaic necrosis virus; CABMV: Cowpea aphid-borne mosaic virus; PnMV: Peanut mottle virus; BlCMV: Blackeye cowpea mosaic virus; BCMV: Bean common mosaic virus; UPV: Ugandan passiflora virus; PaChV: Passiflora chlorosis virus.
3.5. Geographical Distribution of Bambara Groundnut Potyviruses
Figure 3 presents the geographical distribution of the three groups of potyviruses characterized on Bambara groundnut in this study. BGPV1 isolates were found in the Sudan zone only, whereas both CABMV and BGPV2 isolates were found in the Sudan and Sudan-Sahel zones. However, differences were observed between the two agroclimatic zones. CABMV isolates were more frequent in the Sudan-Sahel zone (25% of the isolates) than in the Sudan zone (8.3%). By contrast, most BGPV2 isolates (29.2%) were found in the Sudan zone, while only 4.2% of the isolates originated from the Sudan-Sahel zone. Altogether, all three virus groups were found in the Sudan zone, which held 70% of virus isolates.
4. Discussion
Viruses of the genus Potyvirus were clearly detected by RT-PCR in field samples of Bambara groundnut using widely used potyvirus universal primers. In addition, sequencing of the CP gene and sequence analysis confirmed that the virus isolates belonged to the genus Potyvirus. Moreover, all protein sequences displayed the DAG motif at their N-terminal ends, which is known to be conserved in aphid-transmissible potyviruses.
Figure 3. Geographical distribution of Bambara groundnut potyviruses in Burkina Faso. CABMV, Cowpea aphid-borne mosaic virus; BGPV, Bambara groundnut potyvirus.
Analysis of full CP sequences assigned the virus isolates to three groups, which was confirmed by phylogenetic and relative synonymous codon usage (RSCU) analyses. RSCU is reported to be a species-specific statistic. Therefore, it may be useful for species description. According to the potyvirus demarcation criteria of the International Committee on Taxonomy of Viruses, viruses belong to the same species if their nucleotide identity in the cp gene is more than 76% - 77% and their CP amino acid identity is greater than 82%. Based on these criteria, each of the three potyvirus groups represented a distinct virus species. Virus isolates from group 1 shared more than 94% identity with CABMV at both nucleotide and amino acid levels. Consequently, they were considered CABMV isolates. Phylogenetic analysis of CABMV isolates revealed some intraspecies diversity, with isolates from this study splitting into two groups. This is consistent with earlier studies on Bambara groundnut CABMV isolates.
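The demarcation reasoning can be written out as a small decision rule. The Python sketch below applies the thresholds quoted above to the best-hit identities reported in the Results. Two choices are assumptions made for illustration: the nucleotide threshold is set at the lower bound of the 76% - 77% range, and both criteria are required to hold at once, which is one reading of how they are combined.

```python
# Thresholds quoted in the text; using the lower bound of 76% - 77%
# for nucleotides is an assumption made for this illustration.
NT_THRESHOLD = 76.0
AA_THRESHOLD = 82.0

def same_species(nt_identity, aa_identity,
                 nt_threshold=NT_THRESHOLD, aa_threshold=AA_THRESHOLD):
    """True if both CP identities exceed the demarcation thresholds.

    Requiring both criteria together is an assumption about how the
    two criteria are combined.
    """
    return nt_identity > nt_threshold and aa_identity > aa_threshold

# Identities (%) with the closest known virus, taken from the Results:
# lowest values for group 1 vs CABMV, highest values for groups 2 and 3.
best_hits = {
    "group 1 vs CABMV": (94.6, 97.8),
    "group 2 vs BCMNV": (77.1, 79.9),
    "group 3 vs Ugandan Passiflora virus": (78.3, 81.5),
}

for label, (nt, aa) in best_hits.items():
    verdict = "same species" if same_species(nt, aa) else "distinct species"
    print(f"{label}: nt {nt}%, aa {aa}% -> {verdict}")
```

With these inputs only group 1 falls within a known species. Groups 2 and 3 pass the nucleotide threshold at their upper bound but remain below the 82% amino acid threshold.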
Unlike group 1, isolates in group 2 and group 3 were not closely related to any particular virus species. On the one hand, isolates from group 2 shared about 77% nucleotide identity with BCMNV or Passiflora chlorosis virus, but their amino acid identity with these two viruses did not exceed 79%, which was clearly below the 82% demarcation limit. On the other hand, isolates from group 3 shared no more than 78.3% nt identity and 81.5% aa identity with their closest virus species (Ugandan Passiflora virus [J896003]). Altogether, Bambara groundnut potyviruses in groups 2 and 3 appeared to be distinct potyviruses that are characterized here for the first time. Thus, the names Bambara groundnut potyvirus 1 (BGPV1) and Bambara groundnut potyvirus 2 (BGPV2), respectively, have been proposed.
The distribution of Bambara groundnut potyviruses across agroclimatic zones was characterized by more viruses occurring in the Sudan zone. This is in agreement with results reported for cowpea viruses. In this study, no potyvirus was found in Bambara groundnut in the Sahel zone, as previously reported. This is likely due to the rare occurrence of potyvirus infections in the Sahel zone, where climatic conditions are less favorable to the buildup of aphid vector populations. In addition, Bambara groundnut cultivation is less common in the Sahel zone, which is more prone to drought.
5. Conclusion
Potyviruses infecting Bambara groundnut in Burkina Faso were characterized at the molecular level. In addition to CABMV, two putative new potyvirus species, referred to as BGPV1 and BGPV2, were reported for the first time. All three viruses occurred in the more humid Sudan zone, whereas only two were found in the Sudan-Sahel zone and none in the Sahel zone. Further molecular characterization of BGPV1 and BGPV2 isolates, through the determination of full genomic sequences together with key biological properties, is needed for a better description of Bambara groundnut potyviruses.
Acknowledgements
This study was supported financially by the International Foundation for Science (IFS) through fellowship No. C/5884-1 to A.E. Zongo. We gratefully acknowledge the Laboratoire Mixte International Patho Bios, Burkina Faso, for providing the technical platform.