Genetic diversity at the single nucleotide polymorphism (SNP) level has been exploited in many branches of science and medicine. SNPs have been used to determine an individual’s susceptibility to a variety of disease states ranging from breast cancer to cardiovascular diseases. Additionally, polymorphisms can be used to predict a drug’s activity within a tissue. In pharmacogenetics, polymorphisms in genes encoding drug metabolism enzymes (DME) are used to help predict a drug’s potency because the absorption, distribution, metabolism and excretion (ADME) of specific compounds can all be affected by genetic variations. Polymorphisms in those genes are referred to as ADME SNPs. SNPs are also commonly used as molecular markers in breeding and genetic research.
Multiple chemistries and assays have been developed for SNP genotyping, including mass spectrometry, oligonucleotide arrays, single-stranded conformational polymorphism, and sequencing. Some of the most widely used technologies are centered on fluorescence-based PCR assays (e.g. 5’ nuclease, Molecular Beacons, Scorpion primers, KASP, and Invader). With all these methods, allele-specific discrimination relies on a single factor, either probe hybridization or primer extension.
Here we present a new SNP genotyping method, rhAmp SNP, that combines allele-specific hybridization and extension in a single assay. rhAmp SNP genotyping relies on two enzymes, RNase H2 (an endoribonuclease) and a mutant Taq DNA polymerase with enhanced allelic discrimination. The rhAmp SNP genotyping primers contain a single RNA base and are blocked at the 3’ end. In addition, each of the allele-specific primers has a unique universal tail that incorporates into the amplicon a sequence complementary to the reporter sequences. Rapid de-blocking of the primers by RNase H2 occurs only upon formation of a perfectly matched heteroduplex between the blocked RNA-containing primer and the DNA target. Once de-blocked by RNase H2, the newly activated 3’ end of the primer lies over the SNP site and is interrogated by the mutant Taq DNA polymerase for allele-specific PCR. The major advantages of the rhAmp SNP genotyping chemistry include enhanced allelic discrimination by rhPCR, and high signal generation and cost reduction by the universal reporter system.
2. Materials and Methods
2.1. SNPs, Genomic DNA Samples and Extraction
SNP targets: A total of 130 SNPs were randomly selected from dbSNP Build 144 (https://www.ncbi.nlm.nih.gov/projects/SNP/, Supplemental Table S1) for most of the analyses in this study, covering the types of SNPs found in the human genome. A set of 1000 randomly selected common bi-allelic SNPs was used to evaluate the pipeline design rate.
Synthetic templates: Synthetic gBlocks® Gene Fragments (Integrated DNA Technologies, https://www.idtdna.com/pages/products/genes/gblocks-gene-fragments) were used as known genotype controls during the initial assay validation stage (data not shown). gBlocks Gene Fragments representing the wild-type and mutant alleles were mixed in an equimolar ratio to represent the heterozygous genotype. Concentrations were determined using the NanoDrop™ 2000 (Thermo Fisher Scientific).
Genomic DNA samples: Human genomic DNA samples were purchased from the Coriell Institute for Medical Research. A total of 136 unique samples in two sets were obtained, representative of the three original HapMap populations: Yoruba from Ibadan (YRI), Han Chinese from Beijing, China (CHB), and CEPH Utah residents (CEU) (Supplemental Table S2). DNA samples were quantified using a qPCR method targeting the RNase P (RPPH1) gene.
2.2. rhAmp SNP Genotyping
SNP genotyping was carried out using rhAmp SNP Assays, rhAmp Genotyping Master Mix, and rhAmp Reporter Mix with or without a passive reference dye (www.idtdna.com/rhAmp-Genotyping). Genotyping reactions were typically performed using 3 ng of dried-down genomic DNA from either 46 (human SNP panel) or 90 (ADME SNP panel) samples (Coriell Institute) in 5 µL reactions. Unless otherwise noted, reactions were run on the QuantStudio 7 Flex instrument and analysis was performed using the QuantStudio Real-Time PCR Software (Thermo Fisher Scientific, CA). For DNA input testing, the gDNA was reduced from 3 ng to 125 pg per reaction. rhAmp SNP genotyping reactions were run with a thermal cycling profile of 95°C for 10 minutes, followed by 40 cycles at 95°C for 10 seconds, 60°C for 30 seconds, and 68°C for 20 seconds per the published protocol (www.idtdna.com/rhAmp-SNP-protocol). The same thermal cycling program was used on the CFX384 (BioRad, CA) platform, and reactions contained rhAmp Reporter Mix without passive reference dye. The IntelliQube (LGC Group) data were collected using a thermal profile of 93.5°C for 10 minutes, followed by 40 cycles at 93.5°C for 10 seconds, 60°C for 30 seconds, and 68°C for 20 seconds. These reactions contained rhAmp SNP genotyping master mix at a final 0.85X concentration, rhAmp SNP assay and reporter mix with passive reference dye both at a final 1X concentration, and Coriell gDNA input at 1.6 ng per reaction. With the Biomark HD (Fluidigm, CA) platform, 23 gDNA samples from a subset of the 90 samples (ADME SNP panel) were used at 150 ng per inlet according to the manufacturer’s recommendation. A modified protocol without additional HotStar Taq polymerase was performed. The thermal mix protocol was modified to the following: 37°C for 2 minutes, 45°C for 10 minutes, then 25°C for 10 minutes. The reactions were run with thermal cycling at 95°C for 10 minutes, followed by 40 cycles at 95°C for 10 seconds, 60°C for 30 seconds, and 68°C for 20 seconds.
The call rate was defined as the percent of samples with an assigned SNP genotype call, and the call accuracy was the percent of called samples with the correct genotype assigned. The reported call rate and call accuracy were determined using auto-calls assigned by the QuantStudio Real-Time PCR Software to each of 46 samples for 130 assays.
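The two definitions above can be written out as a short calculation. The following is a minimal sketch, not the software used in the study: the function name, the use of `None` for an uncalled sample, and the example genotypes are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def call_rate_and_accuracy(
    calls: Sequence[Optional[str]], truth: Sequence[str]
) -> Tuple[float, float]:
    """Return (call rate %, call accuracy %) for one assay.

    calls: genotype assigned to each sample, or None if no call was made
           (representation chosen for this sketch only).
    truth: reference genotype for each sample, in the same order.
    """
    if not calls or len(calls) != len(truth):
        raise ValueError("calls and truth must be non-empty and equal in length")
    # Call rate: percent of all samples that received a genotype call.
    called = [(c, t) for c, t in zip(calls, truth) if c is not None]
    call_rate = 100.0 * len(called) / len(calls)
    # Call accuracy: percent of *called* samples whose call is correct.
    correct = sum(1 for c, t in called if c == t)
    accuracy = 100.0 * correct / len(called) if called else float("nan")
    return call_rate, accuracy


# Hypothetical example: five samples, one uncalled, one miscalled.
calls = ["C/C", "C/T", None, "T/T", "C/T"]
truth = ["C/C", "C/T", "C/T", "T/T", "T/T"]
rate, accuracy = call_rate_and_accuracy(calls, truth)
print(rate, accuracy)
```

Note that uncalled samples lower the call rate but do not count against accuracy, which is why the two metrics are reported separately.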
2.3. TaqMan SNP Genotyping
SNP genotyping was carried out using TaqMan Genotyping Master Mix (Thermo Fisher Scientific) and either TaqMan SNP Genotyping Assays or TaqMan Drug Metabolism Genotyping Assays (Thermo Fisher Scientific) (assay ID numbers listed in Supplemental Table S3). SNPs were assessed using 3 ng of dried-down genomic DNA in 5 µL reactions. TaqMan SNP Genotyping Assay reactions were run with the following thermal cycling profile: 95°C for 10 minutes, followed by 40 cycles at 95°C for 15 seconds and 60°C for 1 minute. TaqMan Drug Metabolism Genotyping Assay reactions were run with the following thermal cycling profile: 95°C for 10 minutes, followed by 50 cycles at 95°C for 15 seconds and 60°C for 90 seconds. Thermal cycling profiles were based on the manufacturer-recommended protocols. Cycling and allelic discrimination analysis were performed using the QuantStudio 7 Flex instrument and QuantStudio Real-Time PCR Software Autocaller (Thermo Fisher).
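For a rough sense of how the cycling profiles in Sections 2.2 and 2.3 compare, the programmed hold times can be summed. This is only a sketch: the helper name is hypothetical, and the totals ignore ramp rates, plate reads, and instrument overhead, so actual run times will be longer.

```python
from typing import Sequence


def programmed_hold_minutes(
    initial_hold_s: float, cycles: int, step_seconds: Sequence[float]
) -> float:
    """Sum of programmed hold times in minutes (ramping and reads excluded)."""
    return (initial_hold_s + cycles * sum(step_seconds)) / 60.0


# Profiles as stated in the Methods; all times in seconds.
rhamp = programmed_hold_minutes(600, 40, [10, 30, 20])   # 95/60/68 degC steps
taqman = programmed_hold_minutes(600, 40, [15, 60])      # 95/60 degC steps
taqman_dme = programmed_hold_minutes(600, 50, [15, 90])  # 50-cycle DME profile

print(rhamp, taqman, taqman_dme)
```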
2.4. Assay Design
Universal reporter system: From 10 million candidate sequences screened, we selected a small set of universal primers and probes that met stringent thermodynamic criteria (identical length, GC%, Tm, etc.) and had high sequence specificity against the human genome. We further narrowed the probe set so that the probes did not form dimers and the pairwise edit distance was at least 4, to minimize interference among probes. A pair of universal reporter probes was selected based on empirical testing (data not shown).
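The pairwise edit distance criterion can be illustrated with a small filter. The sketch below assumes Levenshtein distance (substitutions, insertions, and deletions each cost 1) and a simple greedy selection; the text does not specify which distance variant or selection strategy was used, and the example sequences are made up.

```python
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def select_distinct(candidates: Sequence[str], min_dist: int = 4) -> List[str]:
    """Greedily keep candidates at least min_dist edits from every kept one."""
    kept: List[str] = []
    for seq in candidates:
        if all(edit_distance(seq, k) >= min_dist for k in kept):
            kept.append(seq)
    return kept


# Hypothetical candidates: the second differs from the first by one base.
candidates = ["ACGTACGT", "ACGTACGA", "TTTTCCCC"]
selected = select_distinct(candidates, min_dist=4)
print(selected)
```

A greedy pass depends on candidate order and does not guarantee the largest possible set; it is used here only to keep the example short.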
rhAmp SNP assay design pipeline: We have developed a sophisticated assay design pipeline that performs the following steps for each target SNP: 1) design all possible allele-specific primers (ASP), locus-specific primers (LSP) and their combinations; 2) evaluate the thermodynamic properties of the primers and amplicons, such as length, percent GC, Tm, sequence complexity, folding, etc.; 3) assess the impact of overlapping SNPs and repeats; 4) analyze the dimerization of assay primers, including self- and hetero-dimers; 5) evaluate primer compatibility, i.e. having similar Tm and length, particularly among ASPs; 6) determine the assay specificity against the human genome. Finally, based on the best combination of all the above parameters, an assay with the best overall quality is selected.
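Steps 2 and 5 of the pipeline can be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline itself: the Wallace rule (2°C per A/T, 4°C per G/C) stands in for whatever Tm model the pipeline uses, which the text does not specify, and the thresholds and primer sequences are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Percent of bases that are G or C."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> float:
    """Rule-of-thumb Tm in degC for short oligos: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def compatible(p1: str, p2: str,
               max_tm_diff: float = 2.0, max_len_diff: int = 2) -> bool:
    """Step 5 sketch: primers count as compatible if Tm and length are close.

    Thresholds are hypothetical defaults chosen for this example.
    """
    return (abs(wallace_tm(p1) - wallace_tm(p2)) <= max_tm_diff
            and abs(len(p1) - len(p2)) <= max_len_diff)


# Hypothetical ASP pair differing only at the 3' SNP base (C vs. T).
asp_ref = "GATTACAGATTACAGC"
asp_alt = "GATTACAGATTACAGT"
print(gc_percent(asp_ref), wallace_tm(asp_ref), wallace_tm(asp_alt))
print(compatible(asp_ref, asp_alt))
```

The remaining steps (overlapping SNPs and repeats, dimer analysis, and genome specificity) require sequence databases and alignment tools and are not sketched here.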
3.1. Description of rhAmp SNP Genotyping
The schematic describing rhAmp SNP genotyping is shown in Figure 1. Each rhAmp SNP assay is comprised of one locus-specific primer (LSP) and two tailed allele-specific primers (ASPs). Each primer has a single RNA base and a 3’ end blocking modification. The RNA base in the ASP is located immediately downstream of the target SNP location (Figure 1(A)).
Figure 1. Schematic of the rhAmp SNP genotyping system. (A) SNP allele C template DNA is shown as a red oval on a solid black line. Solid lines with blocking group (X) at the 3’ ends indicate a 3’ end-blocked locus-specific primer (LSP) and two 3’ end-blocked allele-specific primers (ASPs), with green and blue lines at the 5’ end indicating allele-specific tail sequences. One ASP contains the DNA base complementary to the target SNP reference allele, and the other ASP contains the DNA base complementary to the alternative allele, shown as a red oval and yellow oval, respectively. The blue triangle with R symbol in the ASPs and LSP represents the RNA base. Blue round shapes represent RNase H2. (B) RNase H2 recognizes the perfectly matched RNA-DNA heteroduplex and cleaves the primers to release the blocked 3’ end, thereby activating the primers for PCR, while the mismatched ASP remains blocked. The purple water drop shape represents mutant Taq DNA polymerase, and dashed black lines represent extension by the polymerase. (C) Extension from the ASP incorporates the allele-specific tail sequence into the amplicon, followed by extension of the LSP to form the complement strand. A green circle and black circle represent a fluorophore and quencher, respectively, and are connected by a linker sequence (green line), together representing a dual-labeled probe. The probe and universal primer (black arrow) anneal to the strand complementary to the incorporated allele tail sequence. During PCR, the 5’ nuclease activity of the Taq polymerase degrades the probe, releasing the fluorophore and generating fluorescence signal.
The blocked primers are efficiently activated by the RNase H2 enzyme only in the presence of a perfectly matched RNA-DNA heteroduplex. Once RNase H2 cleaves the primer, the 3’ end is available for extension by the mutant Taq DNA polymerase (Figure 1(B)). Each ASP has a unique 5’ tail sequence that becomes incorporated into the allele-specific amplicon. The tail region adds the sequences necessary for binding of a universal primer and fluorescent probe (Figure 1(C)). The assays are designed such that the reference allele gives a signal in the FAM dye channel, and the alternative allele signal is detected in the VIC® dye channel. The alternative allele fluorophore is Yakima Yellow®, which has excitation and emission spectra very similar to those of VIC dye, so no spectral re-calibration is required.
3.2. Comparison of Wild-Type vs. Mutant Taq DNA Polymerase
While the RNase H2 enzyme exhibits significant differences in the rate of enzymatic cleavage of an RNA base depending on whether the RNA-DNA heteroduplex contains a mismatch, the rhAmp SNP assay also requires a DNA polymerase to complete allele-specific PCR. A novel mutant Taq DNA polymerase was developed to function effectively in the same buffer as the RNase H2 enzyme and to provide improved allelic discrimination in comparison to the wild-type Taq DNA polymerase. Figure 2 shows the performance difference in the allelic discrimination (AD) plot between two master mixes, one containing the wild-type Taq DNA polymerase (Figure 2(A)) and the other the mutant Taq DNA polymerase (Figure 2(B)), with an example assay designed to detect SNP rs2269829. A total of 46 different human gDNA samples were tested at 3 ng of gDNA input per reaction. With wild-type Taq DNA polymerase, the three genotype clusters can be automatically called; however, the cluster separation is relatively poor, likely due to ASP mispriming, and some of the no template control replicates drift from the origin. Replacing the wild-type Taq with the mutant version improves the separation, tightness, and angles of the genotype clusters.
3.3. Performance of rhAmp SNP Genotyping Assays
Of 1000 common human SNPs randomly selected from the public dbSNP database, rhAmp SNP assay designs were successfully generated for 950 targets (95%). Functional test results from a randomly selected subset of 130 assays tested with 46 genomic DNA samples (human SNP panel) are summarized in Table 1. Two examples of rhAmp SNP genotyping assays, targeting human SNPs rs6068816 and rs4148946, are shown in Figure 3. Overall, the rhAmp SNP assays show a 98% call rate and 99% accuracy (concordance with published genotypes in the NCBI database) while maintaining a high assay design rate.
Figure 2. A novel mutant Taq DNA polymerase improves allelic discrimination. rhAmp SNP genotyping master mixes formulated with either wild type (A) or mutant (B) Taq DNA polymerase were compared in a rhAmp SNP genotyping assay targeting a human SNP (rs2269829) in the presence of 3 ng gDNA from 46 individuals. With improved specificity at the SNP site, reactions containing mutant Taq DNA polymerase result in lower non-specific signal and greater cluster angle separation.
Table 1. Performance summary of rhAmp SNP assays. Design rate is based on 1,000 randomly selected common bi-allelic SNPs from NCBI. Call rate and call accuracy performance testing was generated for 130 targets using 46 Coriell gDNA samples.
Effect of DNA input on rhAmp SNP genotyping
DNA input can vary due to factors such as limited biological samples, purification loss, and other causes. Low sample input may impact the quality of SNP genotyping calls due to lower fluorescent signals. Therefore, it is important to determine the DNA input range that can provide robust genotyping calls. We tested the lower limit of input DNA for the rhAmp SNP system (Figure 4). A rhAmp SNP assay targeting a human SNP (rs4657751) was tested in the presence of gDNA at 0 (no template control or NTC), 0.125 (blue), 0.5 (green) and 3 (orange) ng per PCR reaction on the QuantStudio 7 Flex instrument. Genotypes for 46 individual gDNA samples are accurately auto-called with 0.5 and 3 ng gDNA and can be manually called with 0.125 ng gDNA. For some assays tested, accurate genotypes could be manually called using sample input as low as 25 pg (data not shown).
Figure 3. Examples of rhAmp SNP assays targeting two human SNPs, rs6068816 (A) and rs4148946 (B), each with reference allele C and alternate allele T. Post-PCR read data was collected and normalized reporter signal (Rn) for allele 1 (FAM) and allele 2 (Yakima Yellow) is plotted along X- and Y-axes, respectively. The allelic discrimination plot displays three distinct genotype clusters including homozygous for the reference allele C/C (red), heterozygous C/T (green) and homozygous for the alternate allele T/T (blue). The assays were run on 46 human gDNA samples in 5 μL reactions.
Figure 4. Effect of gDNA input on rhAmp SNP genotyping. Allelic discrimination plot for a rhAmp SNP assay targeting a human SNP (rs4657751) is shown in the presence of gDNA at 0 (no template control), 0.125 (blue), 0.5 (green) and 3 (orange) ng per reaction. Genotypes for 46 individual gDNA samples are accurately auto-called with the QuantStudio Real-Time PCR Software in reactions containing 0.5 and 3 ng gDNA, and can be manually called with 0.125 ng gDNA.
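To put the tested gDNA inputs in terms of template molecules, the mass can be converted to approximate haploid genome copies. The sketch below assumes about 3.3 pg of DNA per haploid human genome, a commonly used approximation that is not taken from this study; the function and constant names are hypothetical.

```python
# Assumption: one haploid human genome weighs roughly 3.3 pg.
HAPLOID_GENOME_PG = 3.3


def haploid_copies(input_ng: float) -> float:
    """Approximate number of haploid genome copies in input_ng of human gDNA."""
    return input_ng * 1000.0 / HAPLOID_GENOME_PG


# Input amounts mentioned in the text, in ng per reaction.
for ng in (3.0, 0.5, 0.125, 0.025):
    print(f"{ng} ng -> about {haploid_copies(ng):.0f} haploid copies")
```

Under this assumption the lowest inputs correspond to only tens of copies or fewer per reaction, where sampling of the two alleles in a heterozygote becomes uneven by chance, which is consistent with manual rather than automatic calling being needed at those inputs.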
Compatibility of rhAmp SNP genotyping with various qPCR platforms
All rhAmp SNP assays in this study were initially evaluated on the QuantStudio 7 Flex (Thermo Fisher) and CFX384 (BioRad) instruments using 3 ng of genomic DNA or 1000 copies of gBlocks Gene Fragments. To demonstrate compatibility with small-volume platforms, some of the assays were tested on the IntelliQube (LGC Group) and Biomark HD system (Fluidigm). An example of allelic discrimination plots for a human ADME SNP, rs7668258 in the UGT2B7 gene, is shown for four different platforms: QuantStudio 7 Flex (Figure 5(A)), CFX384 (Figure 5(B)), IntelliQube (Figure 5(C)) and Biomark HD (Figure 5(D)). For the QuantStudio 7 Flex and CFX384, the assay was tested with 3 ng gDNA from 90 individuals and a reaction volume of 5 µL. The same gDNA samples were tested on the IntelliQube with 1.6 ng per reaction and a reaction volume of 1.6 µL. For the Biomark HD, the assay was tested with 23 Coriell gDNA samples from a subset of the 90 ADME SNP gDNA panel samples, each at 150 ng per inlet. As noted in the methods, the thermal mix protocol was modified to achieve maximum assay sensitivity and specificity. rhAmp SNP assays perform well on these four major SNP genotyping platforms, with similar cluster angles and concordant genotyping calls achieved across all samples tested.
3.4. TaqMan vs. rhAmp SNP Genotyping
A robust SNP genotype call requires relatively high fluorescent signal, large cluster angle separation and tight clusters. In a study of 18 target SNPs, rhAmp SNP assays generated, on average, at least two-fold higher signal than TaqMan for both alleles (Supplemental Figure S1). The allelic discrimination plots and raw fluorescence of an ADME SNP, rs776746, assayed by the rhAmp SNP and TaqMan genotyping chemistries were compared (Figure 6). The rhAmp SNP assay achieves higher signal for all samples (Figure 6(A)), and more uniform signal for both alleles in heterozygote samples (Figure 6(B)), resulting in a heterozygote cluster angle closer to the ideal 45 degrees.
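The cluster angle referred to above can be computed from the two normalized reporter signals of a sample. This is a minimal sketch assuming allele 1 (FAM) on the X-axis and allele 2 (Yakima Yellow) on the Y-axis, as in Figure 3, with the angle measured from the X-axis after subtracting the NTC signal; the function name and the example signal values are hypothetical.

```python
import math


def cluster_angle_deg(fam: float, yy: float,
                      ntc_fam: float = 0.0, ntc_yy: float = 0.0) -> float:
    """Angle in degrees of a sample relative to the NTC origin.

    0 deg lies along the FAM (allele 1) axis and 90 deg along the
    Yakima Yellow (allele 2) axis.
    """
    return math.degrees(math.atan2(yy - ntc_yy, fam - ntc_fam))


# Hypothetical normalized signals for three samples.
balanced_het = cluster_angle_deg(2.0, 2.0)    # equal signal in both channels
unbalanced_het = cluster_angle_deg(3.0, 1.5)  # allele 1 signal dominates
hom_allele1 = cluster_angle_deg(3.0, 0.0)     # allele 1 signal only

print(balanced_het, unbalanced_het, hom_allele1)
```

A heterozygote with equal signal from both alleles sits at 45 degrees, midway between the two homozygous clusters; unequal allele signals pull the heterozygote cluster toward one axis and reduce the separation from that homozygous cluster.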
SNP genotyping with target-specific fluorogenic probes (such as TaqMan assays) and with allele-specific primers (such as KASP assays) are two commonly used approaches in medicine and agriculture. A new method, rhAmp SNP genotyping, which is based on allele-specific rhPCR combined with a universal fluorogenic reporter system, was first reported by Broccanello et al. This new SNP genotyping method has several advantages over both TaqMan and KASP.
Figure 5. Compatibility of rhAmp SNP genotyping with various qPCR platforms. Allelic discrimination plots for a rhAmp SNP assay targeting human ADME SNP rs7668258 are displayed with genotyping auto-calls assigned to samples run on the QuantStudio 7 Flex (A), CFX384 (B), IntelliQube (C) and Biomark HD (D). Allele 1 and Allele 2 calls indicate homozygous genotypes. Allele1/Allele2 call indicates heterozygous genotype.
Figure 6. Comparison of rhAmp and TaqMan SNP genotyping. Assays targeting human ADME SNP rs776746 in the CYP3A5 gene were tested with 3 ng gDNA from 90 individuals. Allelic discrimination plots for all samples show higher signal and better cluster separation for the rhAmp assay (A), and multicomponent amplification curves for 5 samples with heterozygote genotype show higher and more uniform signal for the rhAmp SNP assay (B).
First, the rhAmp SNP genotyping chemistry is a dual enzyme system with improved specificity of the allele-specific PCR, resulting in better genotyping cluster angles and separation (Figure 2). Dobosy and colleagues compared the specificity of allele-specific PCR using unmodified PCR primers to that of 3’ blocked rhPCR primers. Of 12 mismatched combinations, the mid-range discrimination in ΔCq (mismatched Cq − matched Cq) averaged 7.4 for unmodified primers and 10.9 for 3’ blocked primers, suggesting that rhPCR is at least 5-fold better in allelic discrimination than standard PCR. Secondly, the rhAmp SNP genotyping allele-specific and locus-specific primers are initially inactive due to the 3’ end blocking group. Only upon hybridization to its perfectly matched target is the 3’ blocking group readily removed by RNase H2 cleavage. Therefore, primer dimers are eliminated or significantly reduced in the absence of target DNA (data not shown). The high assay design rate (Table 1) is achieved in part because the rules checking primer-to-primer interactions can be relaxed. Thirdly, the rhAmp SNP genotyping method eliminates or reduces non-specific signal from universal reporters, which is often caused by primer dimers. Some KASP assays can generate false genotyping clusters in the absence of target or in NTC wells, often due to non-specific interactions. Non-specific interactions between TaqMan probes and PCR primers also cause higher NTC signal in some TaqMan SNP genotyping assays. Next, reduced primer dimers and the use of universal reporters with the rhAmp SNP genotyping system make multiplex SNP genotyping possible. Simultaneous detection of two SNPs in a single PCR reaction will reduce the assay cost and increase the throughput (unpublished data). Finally, use of the universal reporter system for any SNP or any species not only generates high fluorescent signals (Figure 6) but also makes SNP genotyping cost-effective.
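The ΔCq values cited from Dobosy and colleagues earlier in this section can be translated into approximate fold differences. The sketch below assumes a constant per-cycle amplification factor, which is an idealization; how the cited 5-fold figure was derived is not stated here, and the function names are hypothetical.

```python
def fold_discrimination(delta_cq: float, amplification_factor: float = 2.0) -> float:
    """Fold difference implied by a Cq delay of delta_cq cycles.

    amplification_factor is the per-cycle template increase
    (2.0 corresponds to perfect doubling, an assumption).
    """
    return amplification_factor ** delta_cq


def relative_improvement(dcq_a: float, dcq_b: float,
                         amplification_factor: float = 2.0) -> float:
    """Ratio of the fold discrimination of method B to that of method A."""
    return (fold_discrimination(dcq_b, amplification_factor)
            / fold_discrimination(dcq_a, amplification_factor))


# Average delta-Cq values quoted in the text.
ideal = relative_improvement(7.4, 10.9)                                   # perfect doubling
conservative = relative_improvement(7.4, 10.9, amplification_factor=1.6)  # lower efficiency

print(ideal, conservative)
```

With perfect doubling, the 3.5-cycle difference corresponds to roughly an 11-fold gain; with a lower per-cycle factor the gain is smaller, so the estimate depends strongly on the assumed amplification efficiency.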
In conclusion, we have successfully applied a new rhPCR method for SNP genotyping from purified gDNA on existing qPCR instruments such as the QuantStudio 7 Flex, CFX384, IntelliQube, and Biomark HD. This method provides a high-performance and cost-effective SNP genotyping solution for biomedical and agricultural applications.
Figure S1. Signal-to-noise comparison between rhAmp and TaqMan SNP genotyping. Assays targeting 18 human SNPs were tested with 3 ng of Coriell gDNA from 46 individuals in 5 µL reactions, with post-PCR read data collected and analyzed using the QuantStudio 7 Flex Real-Time PCR System software (Thermo Fisher). Signal to noise is determined for each allele by calculating the distance of the homozygous cluster from the no template control (NTC) (average assay NTC signal subtracted from the on-target allele signal from each homozygous sample). Compared to TaqMan, rhAmp SNP genotyping results in higher cluster-to-NTC distance for both alleles in all 18 assays (A), with greater than 2-fold higher average cluster-to-NTC distance across the 18 assays (B).
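The cluster-to-NTC distance described in the caption above can be sketched as follows. The function name and the signal values are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
from statistics import mean
from typing import Sequence


def cluster_to_ntc_distance(homozygous_signals: Sequence[float],
                            ntc_signals: Sequence[float]) -> float:
    """Average on-target allele signal of homozygous samples minus mean NTC signal."""
    ntc_avg = mean(ntc_signals)
    # Subtract the average NTC signal from each homozygous sample, then average.
    return mean(s - ntc_avg for s in homozygous_signals)


# Hypothetical on-target allele signals for one assay and one allele.
distance = cluster_to_ntc_distance([3.0, 3.2, 2.8], [0.4, 0.6])
print(distance)
```

Comparing this distance between two chemistries for the same assay and allele gives the fold difference summarized in panel (B).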
Table S1. List of human SNPs from dbSNP Build 144 used in this study.
Table S2. List of human genomic DNA samples used in the study. DNA was obtained from the Coriell Institute for Medical Research and includes samples derived from the three original HapMap populations: Yoruba from Ibadan (YRI), Han Chinese from Beijing, China (CHB) and CEPH Utah residents (CEU).
Table S3. List of rhAmp SNP and TaqMan SNP genotyping assay IDs used in this study.
*These authors contributed equally to this work.