Rosa rugosa is a famous traditional Chinese flower of the genus Rosa. It is not only an important flowering plant; its fruits also have considerable ornamental, edible and medicinal value. Ornamental-fruit rugosa is an emerging ornamental plant that has become a new favorite in landscape greening because of its attractive fruit color and shape, high fruit set and long fruiting period. However, Rosa rugosa exhibits gametophytic self-incompatibility (GSI), and self-pollination does not produce fruit. Therefore, pollinizer plants must be deliberately selected when Rosa rugosa is used in gardening. Special attention must be given to the choice of varieties and to plant spacing in order to avoid fruitlessness and low fruit set, which would otherwise impair the ornamental and application value of Rosa rugosa. It is important to overcome the self-incompatibility of Rosa rugosa and to breed new self-compatible varieties of ornamental-fruit rugosa. So far, however, the mechanism of self-incompatibility in Rosa rugosa has not been reported.
Previous studies have shown that, like other species of the family Rosaceae such as apricot and pear, Rosa rugosa displays S-RNase-mediated gametophytic self-incompatibility, which is regulated by the S-RNase gene in the style and the SFB/SLF gene in the pollen. Ushijima et al. first cloned the pollen-specific SFB/SLF gene from Prunus dulcis in 2003. Pollen-specific SFB/SLF genes were later cloned from Prunus mume, Cerasus avium, European apricot, Prunus salicina and Prunus armeniaca. Many researchers believe that the pollen-specific SFB/SLF gene encodes F-box proteins that specifically recognize and ubiquitinate heterologous S-RNase, resulting in a compatible reaction between pistil and pollen. The discovery of the pollen-specific SFB/SLF gene facilitates investigation of the mechanism of the GSI reaction, but further experiments are needed to prove that the F-box proteins are involved in the interaction between the SCF complex and the substrate protein. Most plant species identified as self-incompatible show only partial self-incompatibility, to varying degrees. Rosa rugosa, however, is completely self-incompatible and therefore serves as an ideal test material for understanding the mechanism of gametophytic self-incompatibility.
In this study, we cloned the SFB gene from Rosa rugosa pollen and performed a bioinformatics analysis of it. The purpose was to provide clues for understanding the mechanism of GSI at the molecular level, not only in Rosa rugosa but also in other plant species.
2. Material and Methods
2.1. Plant Material
The plant material, R. rugosa “Zilong Wochi”, was obtained from the rose germplasm resource garden at Shandong Agricultural College. “Zilong Wochi” is the most representative traditional rose cultivar in China.
2.2.1. Pollen Preservation
Between May and June 2016, anthers of robust “Zilong Wochi” plants were collected at 5:00-6:00 pm on the day before blooming. The anthers were taken to the lab and dried to release the pollen, which was then collected. The styles were collected, flash frozen in liquid nitrogen, and stored in a −80˚C freezer.
2.2.2. Total RNA Extraction and cDNA Synthesis
An EASYspin plant RNA Rapid Extraction Kit from Adlai Biotechnology Co., Ltd. was used to extract the total RNA from the R. rugosa pollen tissue. Agarose gel electrophoresis and spectrophotometry were used to determine the quality and concentration of the RNA. An EasyScript First-Strand cDNA Synthesis SuperMix Kit from Beijing TransGen Biotech Co., Ltd. was used to synthesize the first-strand cDNA.
2.2.3. Cloning of the Middle Fragment
According to the reported SLF sequences of Rosaceae, the degenerate primers F1 (5’-CATCTACTCTGCCTCCACCA-3’) and R1 (5’-GAAAGAAAGACCATTGAAGAGC-3’) were designed with Primer Premier 5.0. PCR amplification was conducted using the cDNA synthesized in Section 2.2.2 as the template and F1 and R1 as the primers. The reaction system included 1 µL cDNA, 1 µL F1 primer (10 µmol/L), 1 µL R1 primer (10 µmol/L), and 12.5 µL PCR MIX, with ddH2O added to a total volume of 25 µL. The reaction conditions were: 94˚C for 3 min; 36 cycles of 94˚C for 30 s, 55˚C for 30 s, and 72˚C for 30 s; and a final extension at 72˚C for 10 min. Next, 1% agarose gel electrophoresis was used to detect the PCR products. The target PCR fragment was recovered with the MiniBEST Agarose Gel DNA Extraction Kit Ver. 3.0 (TaKaRa). The recovered fragment was ligated into the pMD18-T vector and then transformed into E. coli DH5α. The positive clones were selected and sent to BGI for sequencing.
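The reaction set-up and cycling program described above can be checked with a little arithmetic. The following is a minimal sketch, not part of the original protocol: the function names are illustrative, and the program time ignores ramping between temperature steps.

```python
# Reaction components as listed in the text (volumes in microlitres).
TOTAL_VOLUME_UL = 25.0
components_ul = {
    "cDNA template": 1.0,
    "F1 primer (10 umol/L)": 1.0,
    "R1 primer (10 umol/L)": 1.0,
    "PCR MIX": 12.5,
}


def water_volume(components, total):
    """Volume of ddH2O needed to bring the reaction to the total volume."""
    used = sum(components.values())
    if used > total:
        raise ValueError("components exceed the total reaction volume")
    return total - used


def final_primer_conc(stock_umol_per_l, primer_ul, total_ul):
    """Final primer concentration after dilution (C1 * V1 = C2 * V2)."""
    return stock_umol_per_l * primer_ul / total_ul


def program_seconds(initial_denat_s, cycle_steps_s, n_cycles, final_ext_s):
    """Nominal hold time of the program, ignoring ramping between steps."""
    return initial_denat_s + n_cycles * sum(cycle_steps_s) + final_ext_s


ddh2o_ul = water_volume(components_ul, TOTAL_VOLUME_UL)
primer_final = final_primer_conc(10.0, 1.0, TOTAL_VOLUME_UL)
hold_s = program_seconds(3 * 60, [30, 30, 30], 36, 10 * 60)

print("ddH2O (uL):", ddh2o_ul)
print("final primer concentration (umol/L):", primer_final)
print("nominal program time (min):", hold_s / 60)
```

Under these assumptions the sketch gives the water volume needed per reaction, the working concentration of each primer, and a lower bound on the run time of the thermocycler.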
2.2.4. 3’ RACE and 5’ RACE
The 3’ RACE specific primers MG1 (5’-GGACGAAGTTTTGAATAGCAGGAGT-3’) and MG2 (5’-AATTTAAGACGCTTCCATCGACCAC-3’) and the 5’ RACE specific primers GSP1 (5’-CCTCCAAATCGACCAC-3’), GSP2 (5’-GGTGGTGGAGGCAGAGTA-3’), and GSP3 (5’-CATATTTCTGCGTTTTGTGA-3’) were all designed with Primer Premier 5.0. Nested PCR was conducted using MG1, MG2, and the SMARTer™ RACE cDNA Amplification Kit (Clontech) in order to obtain the 3’-terminal sequence of the target gene. Nested PCR was also conducted using GSP1, GSP2, GSP3, and the 5’ RACE System for Rapid Amplification of cDNA Ends (Version 2.0, Invitrogen) in order to obtain the 5’-terminal sequence of the target gene.
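Basic properties of the RACE primers listed above (length, GC content, reverse complement) can be tabulated with a few lines of code. This is a minimal sketch for illustration only; it does not reproduce the design criteria applied by Primer Premier 5.0, and the helper names are hypothetical.

```python
# Primer sequences copied from the text (5' to 3').
primers = {
    "MG1": "GGACGAAGTTTTGAATAGCAGGAGT",
    "MG2": "AATTTAAGACGCTTCCATCGACCAC",
    "GSP1": "CCTCCAAATCGACCAC",
    "GSP2": "GGTGGTGGAGGCAGAGTA",
    "GSP3": "CATATTTCTGCGTTTTGTGA",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in primers.items():
    gc_percent = round(100 * gc_fraction(seq), 1)
    print(name, len(seq), gc_percent, reverse_complement(seq))
```

The reverse complement is useful when locating an antisense primer, such as the 5’ RACE primers, on the sense strand of a sequenced fragment.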
2.2.5. Full-Length Gene Sequence Splicing and Verification
DNAstar software was used to splice the middle fragment, the 5’-terminal sequence, and the 3’-terminal sequence in order to obtain the full-length cDNA sequence of the gene. The 5’- and 3’-primers for the spliced sequence were designed with Primer Premier 5 as follows: F2 (5’-ATGACGTCCACAATTTGTAAGAA-3’) and R2 (5’-TTAATTCGGTAATACCAAACTTTC-3’). The spliced sequence was amplified using the reverse-transcribed cDNA as the template, and the product was then further verified.
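The idea behind splicing overlapping fragments can be sketched as joining each pair of sequences on their shared end. The code below is a simplified illustration, not the algorithm used by DNAstar: it requires exact overlaps, and the three short sequences are hypothetical toy data rather than the cloned fragments.

```python
def merge_by_overlap(left, right, min_overlap=4):
    """Join two sequences on the longest suffix of `left` that is a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("no overlap of the required length was found")


# Hypothetical toy fragments standing in for the 5' RACE product,
# the middle fragment, and the 3' RACE product.
five_prime = "ATGACGTCCACAATT"
middle = "CACAATTTGTAAGAACGG"
three_prime = "GAACGGTATTACCGAATTAA"

full_length = merge_by_overlap(merge_by_overlap(five_prime, middle), three_prime)
print(full_length, len(full_length))
```

Real assemblies must also tolerate sequencing errors and remove adapter or vector sequence before merging, which this sketch does not attempt.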
2.2.6. Bioinformatics Analysis of Gene
BLASTX (NCBI) was used to study the homology of the nucleotide sequence and the deduced amino acid sequence. The ORF finder (NCBI) was used to search for an open reading frame, and the Conserved Domains database (NCBI) was used to analyze the conserved domains. The ProtParam Tool was used to analyze protein physical and chemical properties. Post Prediction, WOLF PSORT, and SubLocv were used to predict protein subcellular localization. Furthermore, ProtScale was used to predict hydrophilic or hydrophobic protein properties. The SignalP 4.0 Server was used to predict the protein signal peptide. The TMHMM Server v2.0 was used to predict the protein transmembrane domain. The NetPhos 2.0 Server was used to predict potential protein phosphorylation sites, and the NetNGlyc 1.0 Server and NetOGlyc 3.1 Server were used to predict potential protein glycosylation sites. ExPASy-SOPMA was used to predict protein secondary structure. DNAMAN 5.2.2 was used to conduct multiple sequence alignment. The Neighbor-Joining method in MEGA5 was used to create the phylogenetic tree.
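The open reading frame search mentioned above can be illustrated with a small forward-strand scan. This is a minimal sketch and not the NCBI ORF finder itself, which also examines the reverse strand and offers further options; the input sequence is a hypothetical toy example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_longest_orf(seq):
    """Return (start, end) of the longest ATG...stop ORF on the forward strand.

    Coordinates are 0-based and end-exclusive; None is returned if no ORF exists.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon of a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3  # include the stop codon
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
                start = None
    return best


# Hypothetical toy transcript: 2 nt 5' UTR, 15 nt ORF, 2 nt 3' UTR.
toy = "CCATGAAATTTGGGTAACC"
orf = find_longest_orf(toy)
five_utr_len = orf[0]
orf_len = orf[1] - orf[0]
three_utr_len = len(toy) - orf[1]
print(orf, five_utr_len, orf_len, three_utr_len)
```

Once the ORF coordinates are known, the lengths of the 5’ UTR, the coding region and the 3’ UTR follow directly and must sum to the full transcript length.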
3. Results and Analysis
3.1. Cloning of the Rosa rugosa SLF Gene
The cloned middle fragment is 401 bp (Figure 1(a)), the cloned 3’-terminal fragment is 751 bp (Figure 1(b)), and the cloned 5’-terminal fragment is 267 bp (Figure 1(c)). These three fragments were spliced together with DNAstar in order to obtain a 1236 bp cDNA sequence. The spliced sequence was then validated by PCR amplification (Figure 1(d)). In addition, BLAST analysis confirmed that all of its homologous genes are SFB/SLF genes, and the gene was therefore named RrSLF (GenBank accession number: KY446808).
3.2. Bioinformatics Analysis of the RrSLF Gene
The RrSLF gene has a full length of 1236 bp, comprising an open reading frame of 1122 bp, a 5’ UTR of 61 bp, and a 3’ UTR of 53 bp, and encodes 373 amino acids. The derived protein (the RrSLF1 protein) has a molecular weight of 43.7 kD, an isoelectric point of 6.24, and an F-box conserved domain at position 343 - 741. Thus, the RrSLF protein belongs to the F-box family. Furthermore, the subcellular localization prediction indicated that the protein is probably located in the cytoplasm. The hydrophilicity analysis showed that the overall average hydrophobic index is 0.716, indicating a hydrophobic protein. The signal peptide prediction found no signal peptide cleavage site, indicating a non-secretory protein. The transmembrane domain analysis showed that no transmembrane domain exists. The phosphorylation site prediction identified twenty-one Ser phosphorylation sites, seven Thr phosphorylation sites, and seven Tyr phosphorylation sites, thereby providing a reference for the future study of the regulation of gene expression and protein modification. The glycosylation site prediction showed that there are two N-glycosylation sites and no O-glycosylation sites. The secondary structure prediction gave 22.25% α-helix, 31.37% random coil, 32.17% extended strand, and 14.21% β-turn. The BLAST results showed that the protein shares 59% - 61% homology with SFB/SLF amino acid sequences from Prunus species of the Rosaceae, including Prunus speciosa (ADZ76515.1), Prunus armeniaca (AAT69249.1), Prunus pseudocerasus (ADZ74124.1), and Prunus salicina (BAF91849.1), and 22% - 30% homology with SFB/SLF amino acid sequences from non-Rosaceae plants, including Petunia x hybrida (ADD21613.1), Solanum lycopersicum (NP_001316390.1), Populus trichocarpa (RP65220.1), and Antirrhinum hispanicum (CAD56853.1). The multiple sequence alignment demonstrated that the RrSLF protein and the above plant SFB/SLF amino acid sequences all have an F-box conserved domain, two hypervariable regions (HVa, HVb), and two variable regions (V1, V2) (Figure 2). Furthermore, the constructed phylogenetic tree revealed that RrSLF is closely related to SFB/SLF from members of the same family, Prunus pseudocerasus, Prunus avium, and Prunus speciosa, whereas it is relatively distant from Petunia x hybrida, Solanum lycopersicum and Populus trichocarpa, which are from different families, consistent with the traditional classification (Figure 3).
Figure 1. PCR amplification of S Locus F-box cDNA. (a) Intermediate fragment; (b) 3’-RACE; (c) 5’-RACE; (d) Full-length fragment.
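The average hydrophobic index reported in Section 3.2 can be understood through the following sketch, which assumes that the index is the grand average of hydropathicity (GRAVY) on the Kyte-Doolittle scale: the sum of the per-residue hydropathy values divided by the sequence length. The peptide used here is a hypothetical toy example, not the RrSLF protein.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[residue] for residue in protein) / len(protein)


# Hypothetical toy peptide used only to exercise the function.
toy_peptide = "MTST"
score = gravy(toy_peptide)
print(round(score, 3), "hydrophobic" if score > 0 else "hydrophilic")
```

Under this convention a positive average indicates an overall hydrophobic protein and a negative average an overall hydrophilic one.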
4.1. Relationship between the S-Locus Gene and GSI
GSI is controlled by the S-locus, which has multiple allelic variants. The S-locus consists of at least two genes: one is specifically expressed in the style and is termed the style-specific S-gene; the other is specifically expressed in the pollen grains and is termed the pollen-specific S-gene. Many studies on GSI are concerned with the style-specific S-gene (S-RNase gene).
Figure 2. Multiple alignment of the RrSLF with other SLF. Notes: The color represents the homology of the gene sequence. The deeper the color, the stronger the homology.
Figure 3. The phylogenetic tree derived from the alignment of amino acid sequences of RrSLF and other SLF.
4.2. Bioinformatics Analysis of the RrSLF Gene
There has been a major breakthrough in the study of the pollen-specific S-gene in the families Rosaceae, Scrophulariaceae and Solanaceae. The SFB/SLF gene has been identified as the most likely candidate for the pollen-specific S-gene. In these families, the pollen-specific SFB/SLF gene is located downstream of the style-specific S-RNase gene and is transcribed in the reverse direction. The pollen-specific SFB/SLF protein consists of one F-box domain, two hypervariable regions and two variable regions. The F-box domain and one variable region are located at the N-terminus of the amino acid sequence; the two hypervariable regions and the other variable region are located at the C-terminus. We found that in RrSLF, the N-terminal amino acid sequence consists of one F-box domain and one variable region (V1), and the C-terminal amino acid sequence consists of two hypervariable regions (HVa, HVb) and one variable region (V2), which agrees with the previous findings. Ushijima et al. proposed that, as in the S-RNase gene, the hypervariable regions of the SFB/SLF gene are the sites that determine self-incompatibility specificity. The mutual recognition between SFB/SLF and S-RNase can be weakened by site-directed mutation. The RrSLF gene obtained in this study contained two hypervariable regions at the C-terminus, with HVa exhibiting higher polymorphism than HVb. Self-compatible mutants could be generated by interfering with the expression of the hypervariable regions and thereby altering the specific recognition of the S-gene. This approach may serve as a new strategy for breeding self-compatible Rosa rugosa varieties by genetic transformation.
4.3. Homology Analysis of the RrSLF Gene
The majority of the studies on the pollen-specific S-gene have been conducted in the families Rosaceae, Scrophulariaceae and Solanaceae, especially in the genus Prunus. BLAST alignment indicated that the amino acid sequence homology of the SFB/SLF gene among species of the genus Prunus (e.g., Prunus speciosa) is about 80%; the homology between Rosa rugosa and species of the genus Prunus is only about 60%; and the homology between either Rosa rugosa or Prunus species and Petunia x hybrida, which belongs to another family, is less than 30%. These results indicate high phylogenetic variability in the amino acid sequence of the SFB/SLF gene. We further constructed a phylogenetic tree based on the SFB/SLF gene and found that the RrSLF gene had the smallest phylogenetic distance from the SFB/SLF genes of species in the same family and the largest phylogenetic distance from the SFB/SLF genes of species in different families. This agrees with the conventional plant classification. It is inferred that the evolution of the SFB/SLF gene corresponds with the phylogenetic relationships among the plant species from which the gene is derived.
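Percentages of this kind can be derived from a pairwise alignment by counting identical residues. The sketch below uses one of several possible conventions (identical positions divided by the number of aligned, ungapped columns), which is an assumption and may differ from the denominator used by BLAST; the two aligned sequences are hypothetical toy data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of an alignment that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment of two short peptides.
seq_a = "MTST-CKK"
seq_b = "MSSTICRK"
identity = percent_identity(seq_a, seq_b)
print(round(identity, 1))
```

Because the result depends on how gaps and alignment length are treated, values from different tools should only be compared when the same convention is applied.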
4.4. Limitation of Our Study
Whether the differences between the Rosa rugosa SLF and the SLF of other species reflect evolutionary origin or different ecological groups remains unclear, and the specific mechanism needs further study. The pollen S-gene also still needs to be isolated and its relationship to the self-incompatibility mechanism of Rosa rugosa characterized, which would provide valuable experience for further study of the mechanism of self-incompatibility.
This work was funded by the National Natural Science Foundation of China (NSFC) (31200524) and the Postdoctoral Science Foundation of China (2013M531640).
*These authors contributed equally.
 Matsumoto, D. and Tao, R. (2016) Recognition of a Wide-Range of S-RNases by S locus F-Box like 2, a General-Inhibitor Candidate in the Prunus-Specific S-RNase-Based Self-Incompatibility System. Plant Molecular Biology, 91, 459-469.
 Williams, J.S., Natale, C.A., Wang, N., et al. (2014) Four Previously Identified Petunia inflata S-Locus F-Box Genes Are Involved in Pollen Specificity in Self-Incompatibility. Molecular Plant, 7, 567-569.
 Donia, A., Ghada, B., Hend, B.T., et al. (2015) Identification, Evolutionary Patterns and Intragenic Recombination of the Gametophytic Self Incompatibility Pollen Gene (SFB) in Tunisian Prunus Species (Rosaceae). Plant Molecular Biology Reporter, 34, 339-352.
 Habu, T. and Tao, R. (2014) Transcriptome Analysis of Self- and Cross-pollinated Pistils of Japanese Apricot (Prunus mume Sieb. et Zucc.). Journal-Japanese Society for Horticultural Science, 83, 95-107.
 Ashkani, J. and Rees, D.J.G. (2016) A Comprehensive Study of Molecular Evolution at the Self-Incompatibility Locus of Rosaceae. Journal of Molecular Evolution, 82, 1-18.
 Xu, C., Li, M., Wu, J., et al. (2013) Identification of a Canonical SCF SLF Complex Involved in S-RNase-Based Self-Incompatibility of Pyrus (Rosaceae). Plant Molecular Biology, 81, 245-257.
 Ushijima, K., Sassa, H., Dandekar, A.M., et al. (2003) Structural and Transcriptional Analysis of the Self-Incompatibility Locus of Almond: Identification of a Pollen-Expressed F-Box Gene with Haplotype-Specific Polymorphism. Plant Cell, 15, 771-781.
 Sims, T.L., Patel, A. and Shrestha, P. (2010) Protein Interactions and Subcellular Localization in S-RNase-Based Self-Incompatibility. Biochemical Society Transactions, 38, 622-626.
 Chen, G.A., Zhang, B., Zhao, Z.H., et al. (2010) ‘A Life or Death Decision’ for Pollen Tubes in S-RNase-Based Self-Incompatibility. Journal of Experimental Botany, 61, 2027-2037.
 Sassa, H., Kakui, H. and Mai, M. (2010) Pollen-Expressed F-Box Gene Family and Mechanism of S-RNase-Based Gametophytic Self-Incompatibility (GSI) in Rosaceae. Plant Reproduction, 23, 39-43.
 Mai, M., Kakui, H., Wang, S., Kotoda, N., et al. (2010) Apple S Locus Region Represents a Large Cluster of Related, Polymorphic and Pollen-Specific F-Box Genes. Plant Molecular Biology, 74, 143-154.
 Franceschi, P.D., Pierantoni, L., Dondini, L., et al. (2011) Evaluation of Candidate F-Box Genes for the Pollen S of Gametophytic Self-Incompatibility in the Pyrinae (Rosaceae) on the Basis of Their Phylogenomic Context. Tree Genetics & Genomes, 7, 663-683.
 Sonneveld, T., Tobutt, K.R. and Robbins, T.P. (2003) Allele-Specific PCR Detection of Sweet Cherry Self-Incompatibility (S) Alleles S1 to S16 Using Consensus and Allele-Specific Primers. Theoretical and Applied Genetics, 107, 1059-1070.
 Vaughan, S.P., Russell, K., Sargent, D.J., et al. (2006) Isolation of S-Locus F-Box Alleles in Prunus avium and Their Application in a Novel Method to Determine Self-Incompatibility Genotype. Theoretical and Applied Genetics, 112, 856-866.
 Chen, G., Zhang, B., Liu, L., et al. (2012) Identification of a Ubiquitin-Binding Structure in the S-Locus F-box Protein Controlling S-RNase-Based Self-Incompatibility. Journal of Genetics and Genomics, 39, 93-102.
 Yamane, H., Ushijima, K., Sassa, H., et al. (2003) The Use of the S Haplotype-Specific F-Box Protein Gene, SFB, as a Molecular Marker for S-Haplotypes and Self-Compatibility in Japanese Apricot (Prunus mume). Theoretical and Applied Genetics, 107, 1357-1361.
 McClure, B.A. and Franklin-Tong, V. (2006) Gametophytic Self-Incompatibility: Understanding the Cellular Mechanisms Involved in “Self” Pollen Tube Inhibition. Planta, 224, 233-245.