Social and developmental psychology postulates a relationship between both the quality and consistency of parenting practices and psychological adjustment of offspring (Baumrind, 1967; Dadds, Maujean, & Fraser, 2003; Pickering & Sanders, 2016). Parenting practices are specific patterns of actions during parent-child interactions in a given situation (Darling & Steinberg, 1993). Effective parenting practices contribute to psychological and behavioral developmental “outcomes” valuable in western societies (Belsky, 2015; Rasmussen, 2009).
Therefore, reliable and valid measures of parenting effectiveness are important in both clinical and non-clinical research settings (Święcicka et al., 2019). However, in the past, with few exceptions, the most popular measures of parenting examined a narrow range of risk factors related to child misconduct (Dadds et al., 2003). Reviews of parenting measures (Locke & Prinz, 2002) argue that most measures focused on ineffective discipline and parental neglect (Elgar, Waschbusch, Dadds, & Sigvaldason, 2007) or presented a rather questionable psychometric profile (Holden & Edwards, 1989; Locke & Prinz, 2002), as noted by Badahdah and Le (2015). To overcome this problem, the Alabama Parenting Questionnaire was developed (APQ; Frick, 1991; Shelton, Frick, & Wooton, 1996; Frick, Christian, & Wooton, 1999). The questionnaire is among the most frequently used self-report measures in parenting research. Specifically, a Google Scholar search returned more than 430 citations (July 2013; Maguin, Nochajski, De Wit, & Safyer, 2016).
1.1. The Alabama Parenting Questionnaire (APQ-42)
The APQ is a multi-method, multi-informant assessment scheme with parallel forms administered to both children and parents (global report), also available as a phone interview schedule (Essau et al., 2006; Adams, 2015). Parenting behaviors tap five theoretical constructs: Parental Involvement, Positive Parenting, Poor Monitoring/Supervision, Inconsistent Discipline, and Corporal Punishment (Frick et al., 1999). However, previous work suggested a variety of structures with either 3, 4 or 5 factors (Adams, 2015; Maguin et al., 2016), using mostly EFA (Essau et al., 2006; Badahdah & Le, 2015), CFA (Święcicka et al., 2019) or ESEM (Maguin et al., 2016). More specifically, the APQ Child Global Report has a five-factor structure (Essau et al., 2006), whereas for the Parent Global Report a two-, three- or four-factor structure emerged (Hawes & Dadds, 2006; Hinshaw et al., 2000; Randolph & Radey, 2011; Molinuevo et al., 2011; Zlomke et al., 2014; Esposito et al., 2016; Maguin et al., 2016). Additionally, the APQ structure was also tested in single-parent families (Adams, 2015). However, direct comparisons of the results are challenging due to wide variations in the items used in each study and in the child ages of the samples (see also Maguin et al., 2016). The APQ has been translated into at least 11 languages (Seabridge, 2012), including German (Essau et al., 2006), Spanish (Molinuevo et al., 2011), Italian (Esposito et al., 2016), Chinese, Arabic (Badahdah & Le, 2015), Ukrainian (Burlaka et al., 2017) and Polish (Święcicka et al., 2019). The APQ-preschool version has also been tested in a sample of hyperactive-inattentive preschool children and controls, and three factors emerged (Clerkin et al., 2007; de la Osa et al., 2014). Maguin et al. (2016) examined APQ parenting constructs in a special population of parents with alcohol-related problems.
Internal consistency for the APQ was reported (Frick et al., 1999; Shelton et al., 1996) to range from α = 0.67 to 0.82, except for Corporal Punishment (α = 0.37 to 0.46).
1.2. The Alabama Parenting Questionnaire, Short (APQ-9)
However, the need for faster assessment (Gross, Fleming, Mason, & Haggerty, 2015) led to a 9-item version of the APQ-42 (Elgar et al., 2007). The factor structure of the APQ-42 was examined in a community sample of 1402 parents from Australia (90% mothers). PCA identified 5 factors; however, Parallel Analysis (Horn, 1965) and the Minimum Average Partial Correlations test (Velicer, 1976) failed to support 2 of them (Parental Involvement and Corporal Punishment). Thus a shorter scale (APQ-9) emerged by retaining three factors (Positive Parenting, Inconsistent Discipline, and Poor Supervision), each represented by its three highest-loading items (Elgar et al., 2007). Factor loadings were 0.77, 0.76, and 0.79 for the Positive Parenting factor, 0.74, 0.63 and 0.74 for the Inconsistent Discipline factor and 0.62, 0.75 and 0.65 for the Poor Supervision factor. The three factors (explaining 26.31% of the total variance) were highly correlated with their corresponding APQ-42 scales, r = 0.89 (Positive Parenting), r = 0.90 (Inconsistent Discipline) and r = 0.76 (Poor Supervision), ps < 0.01. The item reduction from 42 to 9 was 78.57% (Elgar et al., 2007). The test developers estimated that the APQ-9 could be completed in one-fifth of the time needed for the APQ-42 (<1 minute).
Subsequently, criterion validity and psychometric properties of this shortened version were examined in an independent sample of parents from Canada (1296 mothers and 745 fathers). In this study, the developers of the APQ-9 evaluated its validity in differentiating parents of children with behavior disorders from parents of children without behavior disorders. The Conners Parent Rating Scale-Revised (CPRS-R; Conners, Sitarenios, Parker, & Epstein, 1998) was used to evaluate criterion validity. The CPRS-R is an 80-item measure of behavioral problems in children aged 3 to 17 years. The 3-factor structure emerging in the first study was confirmed with Confirmatory Factor Analysis separately for mothers and fathers, with good model fit for mothers (CFI = 0.99, NFI = 0.98) and fathers (CFI = 0.99, NFI = 0.98). Factor loadings ranged from 0.52 to 0.82 for mothers and from 0.46 to 0.90 for fathers. Factor intercorrelations ranged from −0.24 to 0.30 for mothers and from −0.21 to 0.29 for fathers (Elgar et al., 2007). In a later study, the validity of the short scale was further supported by correlations between parenting practices and child symptoms in a sample of 133 parents (90.98% mothers) of 5- to 18-year-old children (Elgar et al., 2007).
Internal consistency reliability of the APQ-9 factors ranged from 0.59 - 0.79 for mothers and 0.63 - 0.84 for fathers. The internal consistency of the APQ in the third sample was moderate, ranging from α = 0.57 (Positive Parenting) to α = 0.62 (Inconsistent Discipline). Reliability per age varied for children aged 4 to 9 years, mean α = 0.44; for children aged 5 to 12 years, α = 0.59 to 0.84 and for children aged 5 to 18 years, α = 0.57 to 0.61 (Elgar et al., 2007 as summarized by Gross et al., 2015). Later, Gross et al. (2015) examined the longitudinal invariance of the APQ-9 for parents and youngsters, and the multigroup invariance between parents and adolescents during their transition from middle school to high school.
1.3. The Present Study
The purpose of this study is to examine the factor structure of the APQ-9 using EFA and CFA in a Greek sample of parents from the general population with children aged 7 to 13 years. The study also had the following goals: 1) to evaluate measurement invariance across child gender; 2) to build evidence of convergent and discriminant validity of the APQ-9 based on the CFA Multitrait-Multimethod method (CFA MTMM); 3) to reinforce convergent and discriminant validity with correlation analysis; 4) to evaluate internal consistency reliability (with α), model-based reliability (with ω) and model-based convergent validity (with AVE); and finally, 5) to calculate normative data for the mean factor scores.
The sample comprised 621 Greek parents (75% females) with at least one child aged 7 to 13 years (M = 10.23 years, SD = 2.11, 54% females). The parents (72% biological mothers, 24% biological fathers, 4% other) had one child (32%), two (48%), three (15%) or more children (5%). More than half of the parents (54%) were 41 to 50 years old, 28% were 31 to 40, 10% were 51 to 60, 7% were 21 to 30 and 1% were over 60 years. Regarding education, 39% of the participants held a B.A., 20% held a higher degree, 36% had finished high school and 5% had a lower level of education. Regarding annual income, the largest group (38%) earned between €10,001 and €20,000, 21% earned less, 25% earned between €20,001 and €30,000 and 16% earned more.
Alabama Parenting Questionnaire—Short Form (APQ-9, Elgar et al., 2007)
This nine-item short form of the original APQ-42 (Frick, 1991; Shelton et al., 1996; Frick et al., 1999) is designed to assess parenting practices related to disruptive behaviors (Shelton et al., 1996). It was shortened for faster assessment (Gross et al., 2015). APQ-9 items (e.g. "You threaten to punish your child and then do not actually punish him/her") are rated on a 5-point Likert scale (1 = never; 2 = almost never; 3 = sometimes; 4 = often; 5 = always). Higher scores indicate higher ratings of the measured parenting practice (i.e. Positive Parenting, Inconsistent Discipline, Poor Supervision).
APQ-9 Translation procedure. The APQ-9 was translated into Greek using the translation-back-translation method (Brislin, 1970). First, it was translated into Greek by the first author. A bilingual psychologist who was not familiar with the English version then back-translated it into English. All items of the original English and the back-translated version went through an iterative process of translation/back-translation (3 times) to eliminate differences or ambiguities before the final version.
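The scoring rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code (their analyses were run in R). The item-to-scale mapping (PP = items 1, 6, 7; ID = items 2, 4, 9; PS = items 3, 5, 8) follows the allocation reported later in this paper. Scoring each scale as the mean of its items is an assumption, chosen because the paper reports factor means. The function name `score_apq9` and the example ratings are hypothetical.

```python
from statistics import mean

# Item-to-scale mapping as reported in this paper (three items per scale).
APQ9_SCALES = {
    "PP": (1, 6, 7),  # Positive Parenting
    "ID": (2, 4, 9),  # Inconsistent Discipline
    "PS": (3, 5, 8),  # Poor Supervision
}


def score_apq9(responses):
    """Return the mean item rating per scale for one respondent.

    `responses` maps item number (1-9) to a rating on the 1-5 scale.
    Mean scoring is an assumption made for this sketch.
    """
    for item in range(1, 10):
        rating = responses.get(item)
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError(f"item {item}: expected a rating from 1 to 5, got {rating!r}")
    return {
        scale: mean(responses[i] for i in items)
        for scale, items in APQ9_SCALES.items()
    }


# Hypothetical respondent (made-up ratings, for illustration only).
example = {1: 5, 2: 3, 3: 1, 4: 2, 5: 2, 6: 5, 7: 4, 8: 1, 9: 3}
scores = score_apq9(example)
print(scores)
```

Validating the ratings before scoring mirrors the "required field" setup of the online battery, where every item had to be answered.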
Kansas Parental Satisfaction Scale (KPSS, James, Schumm, Kennedy, Grigsby, Shectman, & Nichols, 1985)
KPSS is a 3-item scale measuring parental satisfaction with the following: 1) children, 2) parenting role, and 3) parent-child relationship. Items are rated on a 7-point Likert scale (1 = extremely dissatisfied, 7 = extremely satisfied) and aggregated to a total score ranging from 3 (minimum satisfaction) to 21 (maximum satisfaction). An EFA was carried out in the current sample. The Kaiser-Meyer-Olkin measure of sampling adequacy (Kaiser, 1970, 1974) was 0.71, and Bartlett’s test of sphericity (Bartlett, 1954) was significant (χ2(3) = 687.06, p < 0.001). A single parent satisfaction factor emerged (PAF extraction, Oblimin rotation), explaining 61.28% of the total variance. Factor loadings for items 1 - 3 were 0.80, 0.69 and 0.85, and communalities were 0.64, 0.48 and 0.72 (Kyriazos & Stalikas, 2019e). The internal consistency reliability of the factor was α = 0.82. The KPSS has been reported to have internal consistency reliability from 0.78 to 0.95 (Nitsch et al., 2015).
Parenting Behaviours and Dimensions Questionnaire (PBDQ; Reid, Roberts, Roberts, & Piek, 2015)
PBDQ is a scale of parental behaviors containing 33 items on six factors (Emotional Warmth, Punitive Discipline, Autonomy Support, Permissive Discipline, Anxious Intrusiveness, Democratic Discipline). All items (e.g. "I try to meet my child’s desires immediately") rate the frequency of behaviors on a 6-point Likert scale, from 1 (“never”) to 6 (“always”). The score is calculated based on factor means. The fit of this 6-factor model to this sample was adequate, χ2(465) = 826.86, χ2/df = 1.78, RMSEA = 0.042, CFI = 0.922, TLI = 0.912, SRMR = 0.071 (Kyriazos & Stalikas, 2019a). Internal consistency reliability per factor in this study was α = 0.85 (Emotional Warmth), α = 0.82 (Punitive Discipline), α = 0.77 (Anxious Intrusiveness), α = 0.79 (Autonomy Support), α = 0.69 (Permissive Discipline), α = 0.76 (Democratic Discipline). The PBDQ developers reported alpha coefficients ranging from 0.66 to 0.83 (Reid et al., 2015).
Parent Behavior Inventory (PBI; Lovejoy, Weis, O’Hare, & Rubin, 1999)
PBI is a 20-item measure of parenting practices. Items (e.g. "I threaten my child") are rated on a 5-point Likert scale ranging from 1 (“not at all true” or “I do not do this”) to 5 (“very true” or “I often do this”). Higher scores indicate a higher frequency of the rated practice. Items are divided into two factors, the hostile/coercive factor and the supportive/engaged factor. This factor structure was tested in the current sample and showed an adequate fit, χ2(159) = 322.77, χ2/df = 2.03, RMSEA = 0.049, CFI = 0.925, TLI = 0.911, SRMR = 0.069 (Kyriazos & Stalikas, 2019b). In this study, internal consistency reliability was α = 0.86 for the supportive/engaged factor and α = 0.81 for the hostile/coercive factor. Lovejoy et al. (1999) reported alpha coefficients of 0.83 and 0.81 for the supportive/engaged and hostile/coercive parenting factors respectively.
Parent Concerns Questionnaire (PCQ; Sheppard, 2010)
PCQ is a 37-item measure of child development or parental problems (Sheppard, 2010). PCQ has three domains (parenting capacity, child development, family/environmental factors). Each item (e.g. I/we are rather too critical of my children) is rated on a 3-point scale (0 = not present, 1 = present, and 2 = severe), producing an aggregated score. Problems perceived by the respondent as “severe” may suggest that professional intervention is required. In the current study this 3-dimensional theoretical model was verified with CFA, χ2(30) = 57.76, χ2/df = 1.93, RMSEA = 0.046, CFI = 0.965, TLI = 0.947, SRMR = 0.041 (Kyriazos & Stalikas, 2019c). Factor 1 (child development problems) contained items 24, 25, 29, Factor 2 (Parenting Capacity problems) items 34, 35, 36, and Factor 3 (family/environmental problems) contained items 4, 10, 11, 12 (Kyriazos & Stalikas, 2019c). The alphas per factor of this 10-item structure were 0.76, 0.71 and 0.77 for factors 1 - 3 respectively. Sheppard (2010) reported alpha coefficients of 0.89, 0.79 and 0.73 for the Child Development problems, Parenting Capacity problems and Family/Environmental problems respectively.
Parental Stress Scale (PSS; Berry & Jones, 1995)
PSS is a self-report questionnaire of perceived stress of the parental experience. All 20 items (e.g. The major source of stress in my life is my child) are rated on a 5-point Likert scale (from 1 = “strongly disagree” to 5 = “strongly agree”). Higher ratings suggest higher parental stress. Items can be arranged in two major domains (positive and stressful parenting themes). Berry and Jones (1995: p. 470) found a 4-factor structure to “support the dichotomy of the parenting experience and the theoretical bases of the Parental Stress Scale”. This theoretical dichotomy of the PSS structure was confirmed with CFA, χ2(72) = 148.86, χ2/df = 2.07, RMSEA = 0.050, CFI = 0.951, TLI = 0.938, SRMR = 0.062 (Kyriazos & Stalikas, 2019d). Factor 1 (Positive Parenting Themes) comprised items 1, 5, 6, 7, 8, 17, 18 and Factor 2 (Stressful Parenting Themes) comprised items 3, 4, 10, 11, 12, 15, 16. The internal consistency reliability for these two factors was α = 0.87 for positive parenting themes (reversed scored) and α = 0.76 for stressful parenting themes. Berry & Jones (1995) reported a total alpha coefficient of 0.83.
Data were collected with the assistance of psychology students. Specifically, about 100 students forwarded a link to the study to at least 5 parents in their social environment (M = 6.21), inviting them to participate. During data collection, all parents recruited by the students first read a digital description of the study and gave informed consent. Then they specified a personal code to ensure anonymity. Students received extra credit for carrying out the recruitment process.
2.4. Research Design
The sample was split in two (about 1/3 and 2/3, Guadagnoli & Velicer, 1988). The EFA subsample was 30% and the CFA subsample was 70%. A CFA followed the EFA. After CFA, additional analyses were performed in the optimal CFA model: 1) full measurement invariance to the strict level (highest possible, Wang & Wang, 2012); 2) Internal consistency reliability using Cronbach’s alpha coefficient (1951) and model-based reliability (Mair, 2018; Sha & Ackerman, 2018) using Bollen’s Omega (Bollen, 1980;
Data were collected electronically on Google Forms® and were analyzed with R software (R Development Core Team, 2019) with the following packages: “haven” V2.1.1 (Wickham, 2019a), “psych” V1.8.12 (Revelle, 2019), “lavaan” V0.6-4 (see Rosseel, 2012), “MVN” V5.7 (Korkmaz, 2019), “caret” V6.0-84 (Kuhn, 2019), “knitr” V1.23 (Xie, 2019), “dplyr” V0.7.8 (Wickham, 2019a), “tidyr” V0.8.3 (Wickham, 2019b), “semPlot” V1.1.1 (Epskamp, 2019), “semTools” V0.5-1 (Jorgensen, 2019).
Data contained no missing values because all the fields of the digital test battery were set as “required” to eliminate non-response. Twenty-six out of 621 cases were identified as multivariate outliers, with Mahalanobis distances (Mahalanobis, 1936; Tabachnick & Fidell, 2013) exceeding the critical value χ2 = 27.88, p < 0.001. However, the outliers did not alter the results, so they were retained in the dataset. The final sample was N = 621 cases. The sample was randomly split into two subsamples (nEFA = 187 and nCFA = 434). The ratios of cases to measured variables for nEFA and nCFA (Costello & Osborne, 2005; Ullman, 2013) were 20.78 and 48.22 respectively. The ratio of cases to estimated parameters (see Schumacker & Lomax, 2016) for the hypothesized CFA model (Elgar et al., 2007) was 9.64. Power analysis based on population RMSEA (MacCallum, Browne, & Sugawara, 1996) recommended a CFA sample size ≥ 375 cases (ε0 = 0.05, εa = 0.08, df = 24, 1 − β = 0.80).
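The random split into an EFA and a CFA subsample described above can be illustrated with a minimal Python sketch. It is not the authors' procedure (their analyses were run in R). The function name `split_sample`, the seed and the rounding rule are assumptions; rounding the EFA share up was chosen only because it reproduces the reported subsample sizes for N = 621.

```python
import math
import random


def split_sample(case_ids, efa_fraction=0.30, seed=2019):
    """Randomly split case IDs into an EFA and a CFA subsample.

    The seed and the use of math.ceil are illustrative assumptions.
    """
    ids = list(case_ids)
    rng = random.Random(seed)  # local generator keeps the split reproducible
    rng.shuffle(ids)
    n_efa = math.ceil(efa_fraction * len(ids))
    return ids[:n_efa], ids[n_efa:]


# 621 hypothetical case IDs standing in for the respondents.
efa_ids, cfa_ids = split_sample(range(1, 622))
print(len(efa_ids), len(cfa_ids))
```

Fixing the seed matters here because the EFA and CFA results are only comparable across reruns if each case stays in the same subsample.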
3.1. Univariate and Multivariate Normality
The assumption of univariate normality was examined in the whole data set (N = 621) with the Kolmogorov-Smirnov, Shapiro-Wilk, Shapiro-Francia, and Anderson-Darling tests; all were statistically significant (p < 0.001) for all measured variables (Table 1). Multivariate normality was examined with Mardia’s multivariate kurtosis test (Mardia, 1970), Mardia’s multivariate skewness test (Mardia, 1970), Henze-Zirkler’s consistent test (Henze & Zirkler, 1990), the Doornik-Hansen omnibus test (Doornik & Hansen, 2008), the E-statistic and the Royston test. The multivariate normality tests were significant, p < 0.001, for all samples (Total, EFA and CFA) as presented in Table 1.
3.2. Exploratory Factor Analysis (nEFA = 187)
Initially, the factorability of the correlation matrix was evaluated (Tabachnick & Fidell, 2013). All APQ items correlated ≥0.30 with at least a second item. Kaiser-Meyer-Olkin measure of sampling adequacy (Kaiser, 1970, 1974) was 0.69, and Bartlett’s test of sphericity (Bartlett, 1954) was significant (χ2(36) = 454.42, p < 0.01). The anti-image correlation matrix diagonals were >0.50. Given the above factorability indications, EFA was carried out with all nine items.
Factors were extracted with Principal Axis Factoring and oblique rotation (Oblimin). The number of factors to retain was determined with the following methods: the scree plot (Cattell, 1966), Parallel Analysis (PA; Horn, 1965), Very Simple Structure (VSS; Revelle & Rocklin, 1979), Minimum Average Partial Correlations (MAP; Velicer, 1976), and the goodness of model fit. Model fit was evaluated with the Root Mean Square Error of Approximation (RMSEA; Steiger & Lind, 1980), Root Mean Square of Residuals (RMSR), Comparative Fit Index (CFI; Bentler, 1990), Tucker-Lewis Index (TLI; Tucker & Lewis, 1973) and Bayesian information criterion (BIC; Schwartz, 1978). Fit criteria (Hu & Bentler, 1999; Browne & Cudeck, 1993) were RMSEA ≤ 0.06 [90% Confidence Intervals ≤ 0.06], RMSR ≤ 0.0448 (Kelley’s criterion; Kelley, 1935; Harman, 1962; Lorenzo-Seva & Ferrando, 2013), CFI and TLI ≥ 0.95, and lowest possible BIC
Table 1. Descriptive Statistics and univariate normality tests for each APQ-9 measured variable along with Multivariate Normality Tests for the total sample and subsamples.
Note. All univariate and multivariate normality tests were significant at p < 0.001 level.
PA (see Figure 1) suggested three factors. VSS complexity 1 achieved a maximum of 0.72 with 2 factors and complexity 2 achieved a maximum of 0.81 with 4 factors. MAP achieved a minimum of 0.05 with 1 factor. BIC reached a minimum with 3 factors and Sample Size adjusted BIC achieved a minimum with 4 factors. Taking into account the joint findings of the above methods, 3 factors were extracted (total explained variance of 65.11%). The Extraction Sums of Squared Loadings suggested that the first factor explained 35.44% of the variance, the second 19.11%, and the third 10.56%, with communalities > 0.30. The fit of this model was adequate, RMSR = 0.03, TLI = 0.923, RMSEA = 0.072 [90% CI 0.021, 0.112] and BIC = −40.09. Regarding item allocation to the extracted factors, items 1, 6 and 7 loaded on the first factor (Positive Parenting) with loadings ranging from 0.513 to 0.862, and items 2, 4, and 9 loaded on the second factor (Inconsistent Discipline) with loadings from 0.465 to 0.767. Items 3, 5 and 8 loaded on the third factor (Poor Supervision) with loadings ranging from 0.640 to 0.777. Table 2 contains the APQ-9 factor loadings above 0.30 and factor inter-correlations (also presented in Figure 2).
3.3. Confirmatory Factor Analysis (nCFA = 434)
CFA was carried out with the Robust Maximum Likelihood estimator (MLR; see Yuan & Bentler, 2000). Goodness of model fit was evaluated by the RMSEA ≤ 0.06, RMSEA 90% CI ≤ 0.06, SRMR ≤ 0.08, CFI ≥ 0.95, TLI ≥ 0.95 (Hu & Bentler, 1999; Browne & Cudeck, 1993; Brown, 2015), and Chi-square/df ratio < 3
Figure 1. Scree plots of actual and simulated data.
Table 2. EFA factor loadings, communalities and factor Inter-correlations for the APQ-9.
Note. Extraction = PAF, Rotation = Oblimin. Loadings < 0.30 were excluded.
Figure 2. Factor Loadings of each factor.
Three models were tested: (A) a single-factor model with all nine items loading on one factor, to test the maximum parsimony hypothesis (Brown, 2015); (B) a first-order Independent Cluster Model (ICM-CFA; Marsh et al., 2014; Howard et al., 2016) with two correlated factors, examined (but not proposed) by Elgar et al. (2007). This model had the original PP factor and a second factor with all the non-positive-parenting items (2, 4, 9, 3, 5, 8); (C) the first-order ICM-CFA model with three correlated factors proposed by Elgar et al. (2007). Regarding model fit, the hypothesis of maximum parsimony was rejected (MODEL A). The two-factor ICM-CFA model also performed poorly (MODEL B). The 3-factor model (MODEL C) had adequate fit, with all fit statistics and factor loadings within acceptable limits. The fit statistics and the standardized loadings of all models are presented in Table 3 and the path diagram of this optimal model in Figure 3. A second-order 3-factor Bifactor model (Harman, 1976; Holzinger & Swineford, 1937) was also tested but it failed to converge. In this model, the PP, ID and PS items loaded on three specific factors and simultaneously on a general factor.
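The CFA fit criteria adopted in this section (RMSEA ≤ 0.06, SRMR ≤ 0.08, CFI ≥ 0.95, TLI ≥ 0.95, chi-square/df < 3) can be applied mechanically, as in the following Python sketch. The function name `check_fit` is hypothetical, and the RMSEA confidence interval criterion is omitted for brevity. Because the Table 3 values are not reproduced in the text, the example input uses the fit reported above for the PCQ validation measure, purely as an illustration.

```python
def check_fit(chi2, df, rmsea, srmr, cfi, tli):
    """Compare reported fit indices with the cutoffs used in this study.

    Returns one True/False flag per criterion; the RMSEA 90% CI criterion
    is left out of this sketch.
    """
    return {
        "chi2/df < 3": chi2 / df < 3,
        "RMSEA <= 0.06": rmsea <= 0.06,
        "SRMR <= 0.08": srmr <= 0.08,
        "CFI >= 0.95": cfi >= 0.95,
        "TLI >= 0.95": tli >= 0.95,
    }


# Fit of the 3-factor PCQ model as reported in the Measures section.
pcq_fit = check_fit(chi2=57.76, df=30, rmsea=0.046, srmr=0.041, cfi=0.965, tli=0.947)
for criterion, met in pcq_fit.items():
    print(f"{criterion}: {'met' if met else 'not met'}")
```

Returning one flag per criterion, rather than a single verdict, reflects the multiple-index approach to fit evaluation: a model can miss one strict cutoff narrowly and still be judged adequate on the combined evidence.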
3.4. Measurement Invariance
Configural, weak, strong and strict full measurement invariance were evaluated across the gender of the child for whom the 621 parents had completed the APQ-9. The nested models were compared using the cutoffs of ΔCFI ≤ 0.01 (Cheung & Rensvold, 2002; Chen, 2007) and ΔRMSEA ≤ 0.015 (Chen, 2007). The 3-factor optimal solution was tested separately for each child gender (Table 4). These models showed an adequate fit both for girls (N = 337) and for boys (N = 284). Nested invariance models (1 - 4) also fit the data well (Table 5). The weak to configural model comparison and the strong to weak model comparison yielded ΔCFIs and ΔRMSEAs below the cutoffs of non-invariance. However, in the strict to strong model comparison, only the ΔRMSEA cutoff supported invariance.
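The decision rule for the nested comparisons (ΔCFI ≤ 0.01 and ΔRMSEA ≤ 0.015) can be sketched as follows. The fit values below are invented for illustration and are not the Table 5 results; they were chosen only to mimic the reported pattern in which the strict to strong comparison passes the ΔRMSEA cutoff but not the ΔCFI cutoff. The function name `compare_nested` is hypothetical.

```python
def compare_nested(less_constrained, more_constrained,
                   max_cfi_drop=0.01, max_rmsea_rise=0.015):
    """Apply the delta-CFI and delta-RMSEA cutoffs to two nested models."""
    cfi_drop = less_constrained["cfi"] - more_constrained["cfi"]
    rmsea_rise = more_constrained["rmsea"] - less_constrained["rmsea"]
    return {
        "cfi_drop": round(cfi_drop, 3),
        "rmsea_rise": round(rmsea_rise, 3),
        "cfi_supports_invariance": cfi_drop <= max_cfi_drop,
        "rmsea_supports_invariance": rmsea_rise <= max_rmsea_rise,
    }


# Invented fit indices (NOT the values in Table 5), for illustration only.
fits = {
    "configural": {"cfi": 0.960, "rmsea": 0.050},
    "weak": {"cfi": 0.958, "rmsea": 0.048},
    "strong": {"cfi": 0.953, "rmsea": 0.049},
    "strict": {"cfi": 0.938, "rmsea": 0.054},
}

steps = ["configural", "weak", "strong", "strict"]
results = {
    f"{more} vs {less}": compare_nested(fits[less], fits[more])
    for less, more in zip(steps, steps[1:])
}
for label, outcome in results.items():
    print(label, outcome)
```

Each model is compared only with the immediately less constrained one, which is the sequential logic of the configural, weak, strong and strict steps.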
3.5. Internal Consistency Reliability, Model-Based Reliability, and Validity
Cronbach’s alpha ≥ 0.70 is generally acceptable (Hair et al., 2010). Omega values
Table 3. Goodness of fit measures, factor loadings and Inter-correlations for the APQ models specified in the CFA.
Note. *p < 0.01. Estimator = MLR; Bold typeface indicates optimal fit. df = Degrees of freedom; CFI = Comparative fit index; TLI = Tucker-Lewis index; RMSEA = Root mean square error of approximation; CI = Confidence interval; SRMR = Standardized root mean square residual. F1 = Factor 1 (items 1, 6, 7), F2 = Factor 2 (items 2, 4, 9), F3 = Factor 3 (items 3, 5, 8).
Table 4. Goodness-of-fit measures for the baseline model for testing measurement invariance across child gender for the 3-factor APQ-9 model.
Note. Estimator = MLR.
Table 5. Goodness-of-Fit measures for the nested APQ-9 models to validate full measurement invariance across child gender.
Note. Estimator = MLR.
Figure 3. Path diagram of the optimal CFA solution for the APQ-9. Conventionally, circles represent latent factors and rectangles represent manifest variables.
The internal consistency reliability of the APQ-9 PP, ID and PS scales was estimated in the total sample. Cronbach’s α coefficients ranged from 0.61 to 0.68 (Table 6). On average, ω coefficients ranged from 0.64 to 0.65, 0.68 and 0.62 for the PP, ID and PS scales respectively. AVE ranged from 0.35 to 0.41 (Table 6).
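For readers less familiar with these coefficients, the sketch below shows one common way to compute α from raw item scores and ω and AVE from standardized factor loadings. It is not the authors' code (the paper lists R packages, including semTools). The ω formula shown is the congeneric-model version with uncorrelated residuals and may differ from the specific variants reported in Table 6. All data and loadings are made up, and the function names are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha from raw scores (one row per respondent)."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def omega_from_loadings(loadings):
    """Omega for one factor from standardized loadings.

    Assumes a congeneric model with uncorrelated residuals, so each
    residual variance equals 1 - loading**2.
    """
    common = sum(loadings) ** 2
    unique = sum(1 - l ** 2 for l in loadings)
    return common / (common + unique)


def ave_from_loadings(loadings):
    """Average Variance Extracted: mean squared standardized loading."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Made-up ratings of five respondents on a three-item scale.
ratings = [[4, 5, 4], [3, 3, 4], [5, 5, 5], [2, 3, 2], [4, 4, 5]]
# Made-up standardized loadings for a three-item factor.
loadings = [0.6, 0.7, 0.5]

alpha = cronbach_alpha(ratings)
omega = omega_from_loadings(loadings)
ave = ave_from_loadings(loadings)
print(round(alpha, 3), round(omega, 3), round(ave, 3))
```

With only three items per scale, moderate loadings of this size give ω and AVE values in the same general range as those reported here, which illustrates why short scales often fall below the conventional 0.70 and 0.50 benchmarks.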
3.6. Convergent and Discriminant Validity with CFA Multitrait-Multimethod Model (CFA MTMM)
The hypothesized Correlated Traits/Correlated Methods model (Model 1-CTCM, Table 7) was compared to three commonly used alternative MTMM models (Byrne, 2012): No Traits/Correlated Methods (Model 2-NTCM), Perfectly Correlated Traits/Freely Correlated Methods (Model 3-PCTCM) and Freely Correlated Traits/Uncorrelated Methods (Model 4-CTUM). The CFA MTMM model was parameterized with 3 Traits and 3 Methods. The 3 Traits were: 1) Positive Parenting, containing the APQ Positive Parenting factor (items 1, 6, 7) and the PBDQ Emotional Warmth factor (items 1, 2, 3, 4, 5, 6); 2) Inconsistent Discipline, containing the APQ Inconsistent Discipline factor (items 4, 9) and the PBI Hostile/Coercive Parenting factor (items 1, 3, 5, 7, 9, 13, 15, 17, 19, 20); 3) Poor Supervision, containing the APQ Poor Supervision factor (items 3, 5, 8) and the PBDQ Punitive Discipline factor (items 7, 8, 9, 10, 11). The 3 Methods were: 1) Alabama Parenting Questionnaire (items 1, 3, 4, 5, 6, 7, 8, 9); 2) Parent Behavior Inventory (items 1, 3, 5, 7, 9, 13, 15, 17, 19, 20) and 3) Parenting Behaviours & Dimensions Questionnaire (items 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11). The Δχ2 test (based on MLR) and the ΔCFI criteria were used to compare the fit difference of the nested models (Cheung & Rensvold, 2002; Byrne, 2010, 2012).
Table 6. Internal consistency reliability and model-based reliability and validity for the three APQ-9 scales in the optimal CFA model.
Note. PP = items 1, 6, 7, ID = items 2, 4, 9, PS = items 3, 5, 8.
Table 7. Goodness-of-Fit measures of CFA MTMM models specified for the APQ-9.
Note. Estimator = MLR.
The fit of the baseline model (MODEL 1, CTCM) to the data was good. The fit of the remaining MTMM models is presented in Table 7. Regarding model comparison, the Δχ2 was highly significant (p < 0.0001), and the fit difference (ΔCFI) supported the convergent and discriminant validity of the traits. The findings on the discriminant validity of the methods were conflicting. While ΔCFI (0.023) was within acceptable limits (Byrne, 2010: p. 291), Δχ2 was statistically significant. All the results of the model comparisons are summarized in Table 8. The factor loadings of the CTCM Model are presented in Table 9. The path diagrams of the CFA MTMM model are presented in Figure 4.
3.7. Convergent and Discriminant Validity with Correlation Analysis
The validation measures were arranged in two groups: Positive and Non-Positive Parenting Practices (Table 10). PP was positively correlated (at p < 0.01 level) with the scales in the Positive Parenting Practices Group at a magnitude ranging from rS(619) = 0.17, p < 0.01 (Kansas Parental Satisfaction Scale) to rS(619) = 0.29, p < 0.01 (PBDQ Autonomy Support and PBDQ Democratic Discipline). PP showed low to moderate correlations with the scales in the Non-Positive Parenting Practices Group, from rS(619) = 0.11, p < 0.01 (PBDQ Anxious Intrusiveness) to rS(619) = −0.12, p < 0.01 (PBDQ Punitive Discipline). ID showed statistically significant, negative correlations with the Positive Parenting Practices Group ranging from rS(619) = −0.12, p < 0.01 (PSS Positive Parenting Themes) to rS(619) = −0.18, p < 0.01 (PBDQ Autonomy Support) and positive correlations with the scales of the Non-Positive Parenting Practices Group varying from rS(619) = 0.03, ns (PCQ Family/Environmental problems) to rS(619) = 0.43, p < 0.01 (PBDQ Punitive Discipline). Similarly, PS showed low to moderate negative correlations with all scales contained in the Positive Parenting Practices Group, from rS(619) = −0.20, p < 0.01 (KPSS) to rS(619) = −0.29, p < 0.01 (PBDQ Emotional Warmth). PS showed positive (with one exception), low to moderate correlations with the scales of the Non-Positive Parenting Practices Group, from rS(619) = −0.08, ns (PBDQ Anxious Intrusiveness) to rS(619) = 0.23, p < 0.01 (PBDQ Punitive Discipline). All correlations are presented in Table 10.
Table 8. Differential goodness-of-fit statistics for CFA MTMM nested model comparisons.
Note. ΔChi-Square was based on MLR estimator.
Figure 4. CFA MTMM model (Model 1-CTCM): Path diagram of the correlated traits (latent variables in lowercase)/correlated methods (latent variables in uppercase).
Table 9. Factor loadings of the CFA MTMM.
Table 10. Bivariate correlations of APQ-9 with validation scales.
Note. **Significant at p < 0.01 level. *Significant at p < 0.05 level.
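The rS coefficients in this section are Spearman rank correlations, reported with N − 2 = 619 degrees of freedom. As a minimal sketch of the computation (not the authors' code, and using made-up scores), Spearman's coefficient can be obtained by ranking both variables, assigning tied values their average rank, and then computing the Pearson correlation of the ranks. The function names are hypothetical.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation and its degrees of freedom (n - 2)."""
    return pearson(average_ranks(x), average_ranks(y)), len(x) - 2


# Made-up scale scores for five respondents.
rho, df = spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(rho, 2), df)
```

A rank-based coefficient is a natural choice in this study because all univariate normality tests were significant, so the Pearson coefficient's distributional assumptions are doubtful.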
3.8. Descriptive Statistics and Normative Data
APQ-9 factor scores for the PP, ID and PS factors were M = 4.48 (SD = 0.71), M = 2.73 (SD = 0.81), and M = 1.54 (SD = 0.76) respectively. The 10th, 25th, 50th, 75th and 90th percentiles of the factor scores were calculated (N = 621). For PP, ID, and PS, 50% of the respondents had M ≤ 4.67, ≤ 2.67 and ≤ 1.33 respectively. Among the APQ-9 measured variables, the highest means were observed on item 6 (M = 4.66, SD = 0.75) and item 1 (M = 4.44, SD = 1.01), equivalent to the often to always Likert points. The lowest mean was found on item 3 (M = 1.77, SD = 1.18), equivalent to never to almost never. All percentile means are presented in Table 11 and the means of the measured variables are presented in Table 1.
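The percentile norms described above can be derived from the scale means as in the following sketch. The scores are invented, the function name `percentile_norms` is hypothetical, and the interpolation rule (the "inclusive" method of Python's `statistics.quantiles`) is an assumption; statistical packages differ in their default percentile definitions, so small discrepancies from Table 11 would be expected even with the real data.

```python
from statistics import quantiles


def percentile_norms(scores, percentiles=(10, 25, 50, 75, 90)):
    """Return the requested percentiles of a list of scale means.

    Uses the 'inclusive' interpolation method, which treats the data as
    spanning the full observed range; this choice is an assumption.
    """
    cuts = quantiles(scores, n=100, method="inclusive")  # 99 cut points
    return {p: cuts[p - 1] for p in percentiles}


# Invented Poor Supervision scale means for nine respondents.
ps_scores = [1.0, 1.33, 1.33, 1.67, 2.0, 2.33, 2.67, 3.0, 3.67]
norms = percentile_norms(ps_scores)
for p, value in norms.items():
    print(f"{p}th percentile: {value:.2f}")
```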
Regarding the correlations of the APQ-9 factors, the correlation of PP with ID was rS(619) = 0.01, ns. The correlation of PP with PS was rS(619) = −0.23, p < 0.01. Finally, the correlation of ID with PS was rS(619) = −0.20, p < 0.01.
The purpose of this study was to evaluate the factor structure of the APQ-9 in a Greek sample of the general population with EFA and CFA. The study also aimed: 1) to examine measurement invariance; 2) to evaluate convergent and discriminant validity of the APQ-9 based on the CFA Multitrait-Multimethod Matrix (CFA MTMM); 3) to examine convergent and discriminant validity further with correlation analysis; 4) to estimate internal consistency (with coefficient alpha; Cronbach, 1951), model-based reliability (with coefficient omega; McDonald, 1999, 1970), and model-based convergent validity (using Average Variance Extracted/AVE; Fornell & Larcker, 1981); and finally, 5) to calculate normative data for the mean factor scores.
Table 11. Percentiles of the APQ-9 factor means.
The sample was recruited using a variation of the network sampling method (APA, 2014), with the difference that those who recruited volunteers did not participate in the sample themselves. The sample was randomly divided into two subsamples. EFA was carried out in the first subsample and CFA followed in the second one. Sample-splitting (Guadagnoli & Velicer, 1988; MacCallum, Browne, & Sugawara, 1996) is considered a construct validity cross-validation method (Byrne, 2012; Brown, 2015; see also Kyriazos, 2018a, 2018b). Sample to measured variables ratios was higher than the proposed minimums for both the EFA (Costello & Osborne, 2005) and the CFA subsample (Bentler & Chou, 1987; Bollen, 1989). The CFA sample to estimated parameters ratio was also higher than the proposed minimums of adequacy (Kline, 2016). A post hoc estimation of CFA sample power (Wang, Watts, Anderson, & Little, 2013) suggested that sample size was larger than the proposed CFA sample at 80% probability level for rejecting a false null hypothesis (Cohen, 1988, 1992).
Turning to the findings, multiple methods indicated that the correlation matrix of the EFA subsample was satisfactorily factorable. Factors were extracted with the Principal Axis Factoring method and an oblique rotation, because the APQ-9 factors are correlated, and three factors were retained. The fit of this 3-factor model was good according to multiple fit indicators (Brown, 2015). Communalities suggested that the shared common variance of the items was adequate. All factor loadings were good, forming three robust factors (Positive Parenting, Inconsistent Discipline, and Poor Supervision) with no cross-loadings. This EFA solution verified the structure originally proposed by Elgar et al. (2007) and subsequently supported by Gross et al. (2015) in a longitudinal study.
CFA followed in the second subsample with the evaluation of three alternative models. The fit was evaluated with the multiple assessment approach (Bentler & Bonett, 1980), for more conservative results (Brown, 2015). Apart from the commonly accepted goodness-of-fit statistics, the chi-square/df ratio was calculated because its inclusion is common practice, although it has received criticism (e.g. Kline, 2016). All chi-square-based criteria were interpreted in tandem with the remaining fit indicators because chi-square is over-sensitive in samples of n > 200 (Little, 2013; see Kyriazos, 2018b). A CFA Bifactor model (Harman, 1976; Holzinger & Swineford, 1937) was also specified. Generally, testing a Bifactor structure is considered good practice (Hammer & Toland, 2016). Unfortunately, the Bifactor model failed to converge, and there was no theoretical rationale for troubleshooting the convergence problem with the recommended solutions (Byrne, 2012; Heck & Thomas, 2015). We could not test a higher-order model either, because of the inherent under-identification problems for m ≤ 3 (e.g. Wang & Wang, 2012). After examining the combined evidence of model fit, factor loadings and factor inter-correlations, the 3-factor model with correlated factors was the optimal solution. This finding confirmed both the preceding EFA model and the structures proposed in the literature (Elgar et al., 2007; Gross et al., 2015). The factor loadings and inter-correlations of this optimal 3-factor solution were satisfactory and comparable to those of the APQ-9 model proposed by Elgar et al. (2007). Additionally, three factors are consistent with APQ-42 validation studies (Hinshaw et al., 2000; Randolph & Radey, 2011; Zlomke et al., 2014; Molinuevo et al., 2011), except for Robert (2009) and Święcicka et al. (2019) who extracted five factors and Zlomke et al. (2014) who found four factors (see Maguin et al., 2016).
However, interpreting these results is complicated by variation in how the measured variables are allocated to factors (Maguin et al., 2016; Esposito et al., 2016).
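The sensitivity of chi-square to sample size, which motivates reading chi-square-based criteria alongside other fit indicators, can be illustrated with a short sketch. It assumes maximum likelihood estimation, where the test statistic equals (N - 1) times the minimized discrepancy. The discrepancy value and sample sizes are hypothetical, and the 24 degrees of freedom assume nine indicators, three correlated factors, simple structure, and no correlated residuals (45 observed variances and covariances minus 21 free parameters).

```python
def ml_chi_square(discrepancy, n_cases):
    """ML test statistic: T = (N - 1) * F_min."""
    return (n_cases - 1) * discrepancy


def chi_square_df_ratio(chi_square, df):
    """Chi-square/df ratio (normed chi-square)."""
    return chi_square / df


# Hypothetical values: the same degree of misfit evaluated at two sample sizes.
F_MIN = 0.12  # assumed minimized discrepancy, not a study value
DF = 24       # 45 observed moments - 21 free parameters (assumed model)

for n in (200, 600):
    chi2 = ml_chi_square(F_MIN, n)
    print(n, round(chi2, 2), round(chi_square_df_ratio(chi2, DF), 2))
```

With the misfit held constant, the ratio grows roughly in proportion to N, so a model can look worse on chi-square-based criteria purely because the sample is larger.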
APQ-9 measurement invariance across child gender was evaluated in the total sample using the three-factor model as the baseline model. Invariance was examined up to the strict level, i.e. the strictest possible measurement invariance level (Wang & Wang, 2012). The comparison of the nested models showed that configural, weak and strong invariance were fully supported and strict invariance was partially supported. Actually, this level is often hard to establish in practice (Timmons, 2010). Thus, factor structure, factor loadings and indicator means can be safely compared between parents caring for a girl and parents caring for a boy. However, comparisons of indicator residuals between parents of girls and parents of boys must be made cautiously. Generally, the heterogeneity of the existing studies, along with the lack of detail in reported results, blurs the assessment of invariance across samples (Maguin et al., 2016) and family types (Adams, 2015).
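The logic of comparing nested invariance models can be sketched as below. This is only an illustration: the fit values are hypothetical, and the decision rule (a drop in CFI of no more than .01 between successive models) is a commonly cited heuristic assumed here for the example, not necessarily the criterion applied in this study.

```python
def compare_nested_models(models, max_cfi_drop=0.01):
    """Compare successive nested models.

    models: list of (name, chi_square, df, cfi), ordered from the least
    to the most constrained model. max_cfi_drop is an assumed heuristic.
    """
    steps = []
    for previous, current in zip(models, models[1:]):
        _, prev_chi2, prev_df, prev_cfi = previous
        name, chi2, df, cfi = current
        cfi_drop = prev_cfi - cfi
        steps.append({
            "model": name,
            "delta_chi2": chi2 - prev_chi2,
            "delta_df": df - prev_df,
            "cfi_drop": cfi_drop,
            "supported": cfi_drop <= max_cfi_drop,
        })
    return steps


# Hypothetical two-group fit results (not values from this study).
fits = [
    ("configural", 95.0, 48, 0.980),
    ("weak", 101.0, 54, 0.978),
    ("strong", 110.0, 60, 0.975),
    ("strict", 160.0, 69, 0.950),
]
for step in compare_nested_models(fits):
    print(step["model"], step["delta_df"], step["supported"])
```

Each level adds equality constraints to the previous one (loadings, then intercepts, then residuals), so a marked loss of fit at one step indicates which set of parameters differs between groups.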
Convergent and discriminant validity of the APQ-9 parenting practices were evaluated with the CFA Multitrait-Multimethod method (Widaman, 1985), using three traits and three methods. Findings suggested strong tenability for the convergent and discriminant validity of the traits, and less strong tenability for the discriminant validity of the methods, as expected based on the methods used. Convergent and discriminant validity were also examined with correlations of the APQ-9 with five validity measures comprising 13 dimensions. The validity measures were arranged in two broad categories: 1) Positive parenting practices and 2) Negative parenting practices. A fairly consistent pattern of relationships emerged for all three APQ-9 factors, in agreement with the existing literature (Elgar et al., 2007; Gross et al., 2015; and Dadds et al., 2003 for the original APQ). As expected, the APQ-9 Positive Parenting Scale consistently showed almost the opposite pattern of relationships to that of the Inconsistent Discipline and Poor Supervision Scales. Almost all relationships were statistically significant and of low to moderate magnitude according to the criteria specified by Cohen (1988, 1992). The strength of associations is discussed in the parenting literature (e.g. Seabridge, 2012; Hershkowitz et al., 2017; Burlaka et al., 2017).
Lastly, given the violation of the normality assumption, percentiles, factor means, and item means were also calculated. The findings were comparable to the values of the original APQ-9 (Elgar et al., 2007). Future research directions could include the comparison of different models for mothers and fathers, and measurement invariance across other demographics such as parent age or gender. Longitudinal measurement invariance could also be tested to replicate the findings of Gross et al. (2015). The present solution could be examined in children older than 13 years. Additionally, multi-cultural studies are necessary to assess measurement invariance further. Likewise, assessments of invariance under demographic variation are also needed (Maguin et al., 2016).
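Percentile norms of the kind reported for the factor means can be derived as in the sketch below. The responses are hypothetical, and the interpolation rule (the "inclusive" method of Python's statistics module) is an assumption, since different software packages define percentiles slightly differently.

```python
import statistics


def factor_mean(item_scores):
    """Mean of one respondent's item scores on a single factor."""
    return statistics.mean(item_scores)


def percentile_norms(scores, cut_points=(25, 50, 75)):
    """Return the requested percentiles of a list of factor means."""
    # quantiles(n=100) yields the 99 cut points P1..P99.
    centiles = statistics.quantiles(scores, n=100, method="inclusive")
    return {p: centiles[p - 1] for p in cut_points}


# Hypothetical responses of five parents to the three items of one factor.
responses = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]]
means = [factor_mean(r) for r in responses]
print(percentile_norms(means))
```

Because percentiles rely only on the rank order of scores, they remain interpretable when the score distribution departs from normality.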
Finally, the sample size did not allow the full implementation of the 3-faced construct validation method (Kyriazos, 2018a; Kyriazos, Stalikas, Prassa, & Yotsidi, 2018). Nevertheless, the findings of this study, in line with calls in the literature for shorter assessment (Scott, Briskman, & Dadds, 2011; Gross et al., 2015), support the APQ-9 as a reliable measure for future parenting interventions in Greece and provide normative data for professionals.