Q-methodology was introduced in 1935 by William Stephenson to study subjective topics such as attitudes, perceptions, preferences, and feelings. It combines qualitative and quantitative techniques. A sample of statements related to the topic of interest, called the Q-sample, is developed, and a group of individuals (the study participants) rank-order these statements according to their points of view or preferences using a grid known as a Q-sort table. The Q-sort table usually has a quasi-normal distribution (see Figure 1), and the completed data from a Q-sort table are known as a Q-sort. After the participants complete their Q-sort tables, the Q-sorts are analyzed with a by-person factor analysis (i.e., the factor analysis is performed on persons, not on variables or traits). Because each participant completes one Q-sort table, each Q-sort represents one individual rather than one variable or trait. The by-person factor analysis groups similar Q-sorts (individuals) together as factors, so each factor represents a group of individuals with similar views, feelings, or preferences about the topic of the study. Statistically, an individual belongs to a factor if his or her loading on that factor is statistically significant (p ≤ 0.05). Each group (factor) is then usually described by a set of statements, called distinguishing statements.
Although Q-methodology was introduced more than 80 years ago, there have been no significant advances in its statistical components since then. Some statistical issues have already been discussed elsewhere. In this article, we review another important statistical issue, namely the criteria for identifying distinguishing statements, and suggest appropriate changes.
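In the current practice of Q-methodology, a statement is identified as a distinguishing statement for a factor a compared to any other factor x if
$$|Z_a - Z_x| > 1.96\sqrt{se_a^2 + se_x^2} \qquad (1)$$
where $Z_a$ and $Z_x$ are the scores of the statement on factors a and x, $se_a^2$ and $se_x^2$ are the squared standard errors of those scores, and $1.96\sqrt{se_a^2 + se_x^2}$ is called the margin of error or precision of estimate.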
Figure 1. Q-sort table with 19 cells.
However, the question is how to calculate $se^2$ for each factor score. The current Q-programs calculate $se^2$ based on a formula that Stephenson adapted from Spearman, i.e.
$$se^2 = \sigma^2 (1 - r)$$
where σ is the standard deviation of the scores for each factor and r (the reliability coefficient, i.e. the correlation of sums for test-retest) is defined as
$$r = \frac{0.80\,p}{1 + (p - 1)\,0.80} \qquad (2)$$
where p is the number of Q-sorts loaded on the factor.
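As a rough illustration of formulas (1) and (2), the following Python sketch computes the reliability coefficient, the squared standard error, and the margin of error for a pair of factors. It assumes the conventional single-sort reliability of 0.80 and σ = 1; the function names are hypothetical and do not come from any Q-program.

```python
import math

def reliability(p, r_single=0.80):
    """Reliability of a factor defined by p Q-sorts (formula (2)).
    r_single = 0.80 is an assumed per-sort reliability."""
    return r_single * p / (1 + (p - 1) * r_single)

def se_squared(p, sigma=1.0, r_single=0.80):
    """Squared standard error of a factor score: sigma^2 * (1 - r)."""
    return sigma ** 2 * (1 - reliability(p, r_single))

def margin_of_error(p_a, p_x, z_crit=1.96):
    """Margin of error in formula (1) for factors with p_a and p_x Q-sorts."""
    return z_crit * math.sqrt(se_squared(p_a) + se_squared(p_x))

# Hypothetical usage: two factors defined by 5 Q-sorts each.
print(round(margin_of_error(5, 5), 3))
```

A statement would then be flagged as distinguishing when the absolute difference between its two factor scores exceeds the value returned by `margin_of_error`.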
However, this formula adapted from Spearman does not seem appropriate for Q-methodology analysis. First, Spearman’s formula applies to the correlation of sums or differences of a group of variables measured on the same subjects, whereas in Q-methodology the Q-sorts on different factors usually come from different subjects and are independent of each other. Second, the margin of error from formula (1) depends heavily on the number of Q-sorts loaded on each factor. Table 1 shows the margin of error for two factors with different numbers of Q-sorts loaded on each factor, with r values based on formula (2). As can be seen, under Stephenson’s formula, factors with fewer Q-sorts have a larger margin of error, which results in fewer distinguishing statements.
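The dependence on p can be made explicit with a short derivation. Assuming the reliability formula takes the conventional form r = 0.80p / (1 + (p − 1)0.80) and σ = 1 (both are assumptions of this sketch), the squared standard error reduces to a function of p alone, and so does the margin of error:

```latex
\begin{align*}
se^2 &= 1 - r
      = 1 - \frac{0.80\,p}{0.20 + 0.80\,p}
      = \frac{0.20}{0.20 + 0.80\,p}
      = \frac{1}{1 + 4p}, \\
\text{margin of error} &= 1.96\sqrt{\frac{1}{1 + 4p_a} + \frac{1}{1 + 4p_x}}.
\end{align*}
```

Under these assumptions the margin shrinks as either $p_a$ or $p_x$ grows: with $p_a = p_x = 1$ it is $1.96\sqrt{0.4} \approx 1.24$, whereas with $p_a = p_x = 10$ it is $1.96\sqrt{2/41} \approx 0.43$.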
Table 1. Margin of error for identifying distinguishing statements based on number of Q-sorts loaded on factors (p1 and p2) using Stephenson’s formula.
On the other hand, a proper statistical test is not applicable for identifying distinguishing statements because in Q-methodology each statement is compared separately between factors. Therefore, from a statistical point of view, the sample size is equal to unity, and the margin of error will be relatively large. In addition, in most cases factors are orthogonal and factor scores are supposed to be (at least in theory) independent of each other with a standard deviation (σ) equal to unity. Hence, the correct standard error for the difference between any two factor scores will be
$$\sqrt{\sigma^2 + \sigma^2} = \sqrt{1 + 1} = \sqrt{2} \approx 1.41$$
and a statement will be identified as distinguishing if the difference between its scores on factors a and x satisfies
$$|Z_a - Z_x| > 1.96 \times 1.41 = 2.76$$
A margin of error of 2.76 is very large, and hardly any distinguishing statements can be found using this criterion.
To overcome the abovementioned problems, we suggest using Cohen’s d (or Cohen’s effect size), which is very popular in psychology and clinical sciences. Cohen’s d is used to compare the means of two groups and is defined as
$$d = \frac{\bar{x}_1 - \bar{x}_2}{s}$$
where $\bar{x}_1$ and $\bar{x}_2$ are the two group means and s is the pooled standard deviation. Cohen suggested an effect size of 0.2 as small, 0.5 as medium, and 0.8 as large.
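For readers who want to see the general definition in action, here is a minimal Python sketch of Cohen’s d with a pooled standard deviation, together with Cohen’s conventional labels. The function names and the "negligible" label for values below 0.2 are assumptions of this sketch and are not taken from the article.

```python
import math
from statistics import mean, variance

def cohens_d(group1, group2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / math.sqrt(pooled_var)

def effect_size_label(d):
    """Cohen's conventional benchmarks; 'negligible' is a label added for this sketch."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"

# Hypothetical data, for illustration only.
d = cohens_d([1, 2, 3], [2, 3, 4])
print(d, effect_size_label(d))
```

If factor scores are standardized with a standard deviation of one, as assumed earlier in the text, the pooled standard deviation is one, so a cutoff on d can be read as a cutoff on the difference between two factor scores.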
This section presents an example in which we explored the salient viewpoints on marijuana legalization (ML) among individuals who participated in several Q-methodology workshops and identified the distinguishing and consensus statements.
1) Marijuana legalization
First, to assemble a group of statements for the Q-sample, we searched the World Wide Web for statements about ML, specifically to get a sense of supportive and opposing views. We found more than 50 statements and, after reviewing them for similarities and differences, selected 19 representative statements for the Q-sample and developed a Q-sort table (Figure 1). Next, 38 individuals who participated in different Q-methodology workshops completed the Q-sort table. The raw data from these Q-sort tables were entered into Stata, and the qconvert program was used to convert them into a format suitable for analysis by the qfactor program.
2) Factor extraction and factor rotation
To identify distinguishing statements, we used both Stephenson’s formula and the Cohen’s d criterion. Three factors were extracted using principal axis factoring and varimax rotation. The numbers of Q-sorts loaded on Factors 1, 2, and 3 were 12, 8, and 9, respectively.
3) Distinguishing statements and consensus statements
Using Stephenson’s approach, based on formulas (1) and (2), the margin of error for identifying distinguishing statements for Factor 1 is 0.439 (in comparison with Factor 2) and 0.425 (in comparison with Factor 3). The distinguishing statements for Factor 1 based on these criteria are listed in Table 2, and those based on Cohen’s d = 0.80 are listed in Table 3. As these two tables show, because Cohen’s d sets a larger cutoff for the difference between statement scores, fewer statements emerged as distinguishing. This pattern was observed for the other two factors as well (the statements are not provided). We also listed the consensus statements using Stephenson’s formula and Cohen’s d in Table 4. Consensus statements are defined as statements whose scores are not statistically different between any two factors. Because we used a larger difference to identify distinguishing statements based on Cohen’s effect size, a larger number of statements do not differ significantly between factors. Hence, it is not surprising that this approach yields more consensus statements than Stephenson’s formula.
Table 2. Distinguishing statements for Factor 1 based on Stephenson’s formula.
Score ranges from −3 to +3 and negative scores indicate disagreement.
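The two criteria can be compared side by side in a short Python sketch. It uses the numbers of Q-sorts reported for the three factors (12, 8, and 9) and assumes a single-sort reliability of 0.80, σ = 1, and that a statement is distinguishing for a factor when it differs from every other factor. The statement scores are hypothetical and are not taken from the study. Under these assumptions the computed margins come out close to, though not identical with, the reported 0.439 and 0.425; the small gap may reflect rounding or implementation details.

```python
import math
from itertools import combinations

# Q-sorts loaded on Factors 1-3, as reported in the example.
LOADED = {1: 12, 2: 8, 3: 9}

def se_squared(p, r_single=0.80, sigma=1.0):
    """Squared standard error of a factor score (r_single = 0.80 is assumed)."""
    r = r_single * p / (1 + (p - 1) * r_single)
    return sigma ** 2 * (1 - r)

def stephenson_cutoff(a, x):
    """Margin of error between factors a and x under Stephenson's formula."""
    return 1.96 * math.sqrt(se_squared(LOADED[a]) + se_squared(LOADED[x]))

def cohen_cutoff(a, x, d=0.80):
    """With unit-SD factor scores (assumed), d is a cutoff on the score difference."""
    return d

def is_distinguishing(scores, a, cutoff):
    """True if the statement's score on factor a differs from every other factor."""
    return all(abs(scores[a] - scores[x]) > cutoff(a, x) for x in scores if x != a)

def is_consensus(scores, cutoff):
    """True if no pair of factors differs by more than the cutoff."""
    return all(abs(scores[a] - scores[x]) <= cutoff(a, x)
               for a, x in combinations(scores, 2))

# Hypothetical factor scores for three statements (NOT data from the study).
statements = {
    "S1": {1: 1.20, 2: 0.60, 3: 0.55},
    "S2": {1: 1.50, 2: 0.30, 3: -0.40},
    "S3": {1: 0.10, 2: 0.15, 3: -0.05},
}

for name, scores in statements.items():
    print(name,
          "distinguishing for Factor 1 (Stephenson / Cohen):",
          is_distinguishing(scores, 1, stephenson_cutoff),
          is_distinguishing(scores, 1, cohen_cutoff),
          "| consensus (Stephenson / Cohen):",
          is_consensus(scores, stephenson_cutoff),
          is_consensus(scores, cohen_cutoff))
```

In this made-up data, statement S1 is distinguishing for Factor 1 under Stephenson’s margin but counts as a consensus statement under the 0.80 cutoff, which mirrors the pattern described above: the stricter cutoff yields fewer distinguishing and more consensus statements.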
Table 3. Distinguishing statements for Factor 1 based on Cohen’s d.
Table 4. Consensus statements based on Stephenson’s formula and Cohen’s d.
Score ranges from −3 to +3 and negative scores indicate disagreement.
Q-methodology was introduced more than 80 years ago to study subjective issues using statistical techniques. There has not been much change or improvement in its statistical component since its introduction by Stephenson, its founding father. Only recently have there been some suggestions for using alternative techniques other than those traditionally available in Q-methodology. In this article, we critically reviewed an important component of Q-methodology, namely the statistical criteria for identifying distinguishing statements. We showed that Stephenson’s formula, which was adapted from Spearman, is not theoretically appropriate in Q-methodology unless multiple Q-sorts are obtained from the same subjects. In fact, our literature search found only a handful of studies based on multiple Q-sorts from the same subjects. In addition, the number of distinguishing statements based on Stephenson’s formula is highly dependent on the number of Q-sorts loaded on each factor. To identify distinguishing statements in Q-methodology, statement scores are compared between factors individually; therefore, from a statistical perspective, each statement represents a sample size of unity, which requires a margin of error of 2.76. This is a very large margin of error, and hardly any distinguishing statements can be found using this criterion. Instead, we suggest using Cohen’s d, which is quite popular in health and behavioral research. Cohen’s d is a simple criterion, and the resulting number of distinguishing statements is independent of the number of Q-sorts loaded on each factor.