Depression is the most common psychological disorder across the globe (WHO, 2017) and has been shown to impair social functioning and quality of life (Kawada, Kuratomi, & Kanai, 2011); it is also associated with high suicide rates (Frasure-Smith, Lesperance, & Talajic, 1993; OECD, 2015).
Given that anxiety and depression can profoundly affect individuals, tools that can effectively detect early signs of such debilitating conditions are indispensable (WHO, 2001). According to Clark & Watson (1991), although these two psychological states are conceptually distinct, they share several characteristics and overlap to a degree that makes them difficult to differentiate in screening. To distinguish the two constructs, Clark & Watson proposed a tripartite model of anxiety and depression. This model suggested that general distress and negative affectivity are common to anxiety and depression, physiological hyperarousal is associated with anxiety, and an absence of positive affect is linked to depression. The negative affectivity shared by anxiety and depression can explain the high comorbidity between these two psychological states. Some years later, in research conducted by Lovibond & Lovibond (1993) with the ultimate goal of creating a two-scale questionnaire covering anxiety and depression, a third factor (i.e., stress) emerged from the analysis, describing irritability and agitation, and this led to the development of the Depression Anxiety Stress Scale (DASS).
The Depression Anxiety Stress Scale is a widely used self-report scale developed by Lovibond & Lovibond (1995a) to assess symptoms of depression, anxiety, and stress. The questionnaire should not be viewed as a diagnostic instrument but rather as a screening tool that enables researchers to gauge levels of all three emotional states concurrently. For a diagnosis to be made, a structured interview based on the Diagnostic and Statistical Manual of Mental Disorders (DSM) is necessary to detect cases of depression and anxiety.
The scale has been broadly investigated for its psychometric properties across several settings and nations including Australia (Lovibond & Lovibond, 1995a), England (Henry & Crawford, 2005), Canada (Clara, Cox, & Enns, 2001; Antony et al., 1998), and the United States of America (Brown et al., 1997), and has been translated and validated in more than 35 languages such as Spanish (Bados, Solanas, & Andres, 2005), Malay (Musa, Fradzil, & Zain, 2007; Ramli, Ariff, & Zaini, 2007; Rusli et al., 2017; Ramli, Salmian, & Nurul, 2009; Nur Azma et al., 2014; Edimansyah, Rusli, & Naing, 2007), Chinese (Chan et al., 2012), Portuguese (Apóstolo, Mendes, & Azeredo, 2006), Greek (Lyrakos et al., 2011), Dutch (Nieuwenhuijsen et al., 2003), Italian (Severino & Haynes, 2010), Korean (Deokhoon et al., 2018), Turkish (Akin & Cetin, 2007), and Albanian (Basha & Kaya, 2016). Both English and non-English versions have high internal consistency (α > .70).
The full-length DASS (DASS-42) in its initial format comprises 42 items and 3 subscales: the Depression subscale (DASS-D), the Anxiety subscale (DASS-A), and the Stress subscale (DASS-S). The Depression subscale measures low self-esteem, dysphoria, displeasure, sense of hopelessness, devaluation of life, self-deprecation, low positive affect, lack of interest or involvement, anhedonia, and inertia. The Anxiety subscale assesses autonomic arousal, fearfulness, skeletal muscle effects, situational anxiety, and subjective experience of anxiety and panic. The Stress subscale measures lack of relaxation, nervous arousal, agitation, ease of becoming upset, irritability, negative affect, and impatience.
Lovibond & Lovibond (1995a), in the original scale validation study, reported internal consistencies (coefficient alpha) of .91 for Depression, .84 for Anxiety, and .90 for Stress, and later .88 for Depression, .82 for Anxiety, .90 for Stress, and .93 for the total scale (Lovibond & Lovibond, 1995b). Since then, numerous studies have indicated that DASS-42 has excellent internal consistency (Brown et al., 1997; Antony et al., 1998; Crawford & Henry, 2003; Henry & Crawford, 2005; Page, Hooke, & Morrison, 2007; Nieuwenhuijsen et al., 2003; Daza et al., 2002). As far as the scale’s correlations with other measures of anxiety and depression are concerned, DASS-42 demonstrates satisfactory convergent and acceptable divergent validity (Brown et al., 1997; Crawford & Henry, 2003). In terms of discriminant validity, the three DASS-42 subscales have been reported to differentiate between clinical and non-clinical samples. Exploratory factor analyses have supported the questionnaire’s three-factor structure in both clinical and non-clinical samples; nevertheless, the degree of fit in confirmatory factor analyses has not been shown to be adequate (Lovibond & Lovibond, 1995b; Antony et al., 1998; Clara, Cox, & Enns, 2001).
The DASS-21 is a modified and shortened version of the instrument that assesses the same domains and has the same structure as the original full version, but it requires half the time to complete and is therefore easier and quicker to administer. The DASS-21 comprises seven representative items for each subscale, selected from the original questionnaire. In the majority of studies, DASS-21 has demonstrated adequate psychometric properties in terms of internal consistency and validity, suggesting that the scale can be used among diverse groups. In the original study conducted on a non-clinical sample, internal consistency reliability coefficients (Cronbach’s alpha) were .81 for Depression, .73 for Anxiety, and .81 for Stress. The scale has been widely used to assess symptoms of psychological distress in both clinical (e.g., Bottesi et al., 2015) and non-clinical (e.g., Henry & Crawford, 2005; Tonsing, 2014) samples, and its internal consistency as well as its divergent and convergent validity have been shown to hold across cultures, given that almost all 21 items are culture-free (Ramli, Ariff, & Zaini, 2007); thus, the measure is suitable for use in any clinical or non-clinical setting and can feasibly be adapted to any culture. In general, studies have demonstrated excellent internal consistency of the scales (Norton, 2007; Antony et al., 1998; Clara, Cox, & Enns, 2001; Henry & Crawford, 2005; Gloster et al., 2008), strong convergent and divergent validity (Henry & Crawford, 2005; Osman et al., 2012; Sinclair et al., 2012), as well as good construct validity (Mahmoud et al., 2010).
Moreover, DASS-9 is an empirically derived version based on DASS-21, proposed by Yusoff (2013) . It has three factors (Depression, Anxiety & Stress) with 3 items each. Depression includes items “I found it difficult to work up the initiative to do things”, “I felt that I had nothing to look forward to”, and “I was unable to become enthusiastic about anything”. Anxiety comprises items “I experienced trembling (e.g., in the hands)”, “I was worried about situations in which I might panic and make a fool of myself”, and “I felt I was close to panic”. Stress comprises items “I tended to over-react to situations”, “I found myself getting agitated”, and “I was intolerant of anything that kept me from getting on with what I was doing”. According to Yusoff (2013) the shorter alternative of DASS-21 is potentially a useful screening tool for University settings to identify future students’ psychological health. In the original DASS-9, internal consistency was reported to be .72 for the total scale and .52, .57, and .55 for depression, anxiety and stress subscales respectively.
The inter-correlations among the depression, anxiety, and stress subscales indicate that symptoms of psychological distress as measured by DASS-21 can be distinguished into three constructs, and the three-factor dimensionality of the measure has been supported in numerous studies (Antony et al., 1998; Lovibond & Lovibond, 1995b; Norton, 2007; Tonsing, 2014). Although the three scales are inter-correlated, their actual relationship remains unclear (Ramli, Ariff, & Zaini, 2007; Lovibond & Lovibond, 1995a). Correlations between the three DASS-21 dimensions have been shown to be medium to large (Antony et al., 1998; Clara, Cox, & Enns, 2001; Sinclair et al., 2012).
This study aims to evaluate two different versions of the DASS questionnaire, namely the DASS-21 (Lovibond & Lovibond, 1995a) and the DASS-9 (Yusoff, 2013). More specifically, the purpose of this study is to examine the psychometric properties of both DASS-21 and DASS-9 in a non-clinical Greek population. The study has two objectives: 1) to evaluate and cross-validate construct validity; and 2) to evaluate measurement invariance across gender, internal consistency reliability, omega reliability, and convergent validity using the Average Variance Extracted (AVE), and to further establish convergent/discriminant validity through correlations with other measures.
The sample is part of a study about well-being and quality of life. A total of 2272 adults (63% females) with an average age of M = 35.54 years (SD = 12.35) participated in the study. Half of the participants were single (51%), followed by married (41%) and divorced (8%). Fifty-nine percent of the sample did not have children. Details about recruitment of participants follow in the “Procedure” section.
Socio-demographic information collected included gender, age, marital status, whether the respondent had children, and a description of their work.
Depression Anxiety Stress Scale, Short form (DASS-21)
DASS-21 is a short form of DASS-42, both proposed by Lovibond & Lovibond (1995a). DASS-21 measures emotional distress in three 7-item dimensions: 1) depression (e.g., “I couldn’t seem to experience any positive feeling at all”); 2) anxiety (e.g., “I was aware of dryness of my mouth”); and 3) stress (e.g., “I found it hard to wind down”). All 21 items are rated on a four-point Likert scale evaluating both intensity and frequency of emotional distress during the last week (from 0 = did not apply to me at all to 3 = applied to me very much, or most of the time). The higher the score, the more intense/frequent the emotions of distress. Each factor has a discrete score varying from 0 to 21. Scores greater than 14, 10, and 17 suggest extremely severe Depression, Anxiety, and Stress respectively (Lovibond & Lovibond, 1995b). Internal consistency reliability was reported as α = .97 for adults of the general population (Henry & Crawford, 2005), and alphas for the individual factors ranged between .81 and .97 (McDowell, 2006, cited in Yusoff, 2013).
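The scoring rule described above can be illustrated with a minimal sketch. It assumes the seven ratings of one subscale have already been gathered, since the DASS-21 item-to-subscale key is not listed in this text; the function and constant names are hypothetical and are not part of any official scoring software.

```python
# Cutoffs as stated in the text: scores strictly greater than these values
# suggest extremely severe symptoms on the 0-21 DASS-21 subscale metric.
EXTREMELY_SEVERE_ABOVE = {"depression": 14, "anxiety": 10, "stress": 17}


def subscale_score(item_ratings):
    """Sum seven item ratings (each coded 0-3) into a subscale score (0-21)."""
    if len(item_ratings) != 7:
        raise ValueError("a DASS-21 subscale has exactly 7 items")
    if any(r not in (0, 1, 2, 3) for r in item_ratings):
        raise ValueError("ratings must be integers from 0 to 3")
    return sum(item_ratings)


def is_extremely_severe(subscale, score):
    """Return True when the score exceeds the cutoff quoted in the text."""
    return score > EXTREMELY_SEVERE_ABOVE[subscale]


# Hypothetical respondent: seven anxiety ratings.
anxiety_score = subscale_score([2, 1, 2, 2, 1, 2, 1])
flagged = is_extremely_severe("anxiety", anxiety_score)
```

As with the questionnaire itself, such a flag is a screening indication only and not a diagnosis.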
Depression Anxiety Stress Scale, 9 item version (DASS-9)
A second version of DASS-21 (Lovibond & Lovibond, 1995a) was also evaluated: DASS-9 (Yusoff, 2013), with 3 items per scale instead of 7. Specifically, Depression contains items 5, 10, and 16, Anxiety contains items 7, 9, and 15, and Stress contains items 6, 11, and 14. This version was empirically derived through Confirmatory Factor Analysis (CFA) by Yusoff (2013). Yusoff (2013) reported a Cronbach’s alpha of .72 for the total DASS-9, whereas alphas for the Depression, Anxiety, and Stress factors were .52, .57, and .55 respectively.
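Because DASS-9 is a subset of DASS-21, its subscale scores can be obtained from a completed DASS-21 protocol. The sketch below uses the item numbers listed above and assumes they refer to 1-based positions in the DASS-21; the function name and the example respondent are hypothetical.

```python
# Item numbers (assumed to be 1-based DASS-21 positions) as listed in the text.
DASS9_ITEMS = {
    "depression": (5, 10, 16),
    "anxiety": (7, 9, 15),
    "stress": (6, 11, 14),
}


def score_dass9(dass21_responses):
    """Return DASS-9 subscale sums (0-9 each) from 21 ratings coded 0-3."""
    if len(dass21_responses) != 21:
        raise ValueError("expected 21 item ratings")
    return {
        name: sum(dass21_responses[i - 1] for i in items)
        for name, items in DASS9_ITEMS.items()
    }


# Hypothetical respondent who endorsed only three items.
example = [0] * 21
example[4] = 2    # item 5 (depression)
example[6] = 1    # item 7 (anxiety)
example[13] = 3   # item 14 (stress)
scores = score_dass9(example)
```

With three items rated 0 to 3, each DASS-9 subscale score ranges from 0 to 9.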
World Health Organization Quality of Life-Brief scale (WHOQOL-BREF)
The WHO Quality of Life-Brief scale (WHOQOL, 1998a, 1998b) is a self-report assessment tool measuring aspects of perceived quality of life (QOL). It is the short version of the WHOQOL-100 (cf. Skevington, 1999). It contains 26 items (e.g., “How would you rate your quality of life?”) reflecting all 24 facets of life quality that WHOQOL-100 covers. Answers are rated on four different types of 5-point Likert scale indicating intensity, capacity, frequency, or judgement (Skevington et al., 2004). The minimum possible rating on every Likert scale is 1 (minimum perceived QOL), and the maximum is 5 (maximum perceived QOL). The instrument is divided into four domains (Physical health, Psychological health, Social Relations, and Environment) with satisfactory Cronbach’s alphas of .82, .81, .68, and .80 respectively (Skevington, Lotfy, & O’Connell, 2004). Raw scores are summed after reverse-scoring 3 items. In this study Cronbach’s alpha was .91.
Warwick-Edinburgh Mental Well-being Scale (WEMWBS)
The WEMWBS was developed by Warwick and Edinburgh Universities (Tennant et al., 2007). It is a 14-item unidimensional, self-report scale of mental well-being, tapping positive aspects of subjective well-being and psychological functioning (e.g., “I’ve been feeling good about myself”). All items are phrased positively and rated on a Likert scale from 1 (None of the time) to 5 (All of the time), indicating frequency of perceived positive mental state. Responses are summed, with scale scores ranging from a minimum of 14 to a maximum of 70. WEMWBS has been reported to have adequate internal consistency reliability, with α = .89 in a student sample and α = .91 in a general population sample (Tennant et al., 2007). In this study Cronbach’s alpha was .91.
Brief Resilience Scale (BRS)
BRS (Smith, Dalen, Wiggins, Tooley, Christopher, & Bernard, 2008) is the briefest measure of resilience available. It contains 6 items measuring the ability to bounce back from stress and difficulties (e.g., “I usually come through difficult times with little trouble”). The items are rated on a 5-point Likert scale from 1 (Strongly Disagree) to 5 (Strongly Agree). Three items are negatively worded and are reverse-scored. Item ratings are averaged, so the possible score ranges from 1 (minimum resilience) to 5 (maximum resilience). Smith et al. (2008) reported adequate reliability, with Cronbach’s alpha ranging from .80 to .91 in 4 different samples. In this study Cronbach’s alpha was .80.
Flourishing Scale (FS)
The FS (Diener et al., 2009) consists of eight items describing general aspects of positive human functioning in a single factor (e.g., “I am a good person and live a good life”). Items are answered on a 1-7 Likert scale from strong disagreement (1) to strong agreement (7). All items are positively phrased. Score ranges from 8 (minimum flourishing) to 56 (maximum flourishing). Diener et al. (2009) reported an internal consistency of .87 as measured by Cronbach’s alpha. In this study Cronbach’s alpha was .81.
Scale of Positive and Negative Experience (SPANE)
This 12-item scale is a well-being measure by Diener et al. (2009) with 2 distinct factors: positive experiences (6 one-word items, e.g., “Pleasant” and “Happy”) and negative experiences (6 one-word items, e.g., “Unpleasant” and “Sad”). Items are scored on a Likert scale ranging from 1 (very rarely or never) to 5 (very often or always). The positive score (SPANE-P) and the negative score (SPANE-N) can each range from 6 to 30. Their difference (Affect Balance or SPANE-B) ranges from −24 to 24. Internal consistency reliabilities measured by Cronbach’s alpha are .87, .81, and .89 for SPANE-P, SPANE-N, and SPANE-B respectively (Diener et al., 2009). In this study, Cronbach’s alpha for SPANE-P and SPANE-N was .90 and .85 respectively.
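The three SPANE scores follow directly from the description above. The sketch below assumes the six positive and six negative ratings have already been separated, because the item order of the questionnaire is not given in this text; the function name is hypothetical.

```python
def score_spane(positive_ratings, negative_ratings):
    """Return SPANE-P, SPANE-N and SPANE-B from two lists of six 1-5 ratings."""
    for ratings in (positive_ratings, negative_ratings):
        if len(ratings) != 6 or any(not 1 <= r <= 5 for r in ratings):
            raise ValueError("each factor needs six ratings between 1 and 5")
    spane_p = sum(positive_ratings)   # ranges from 6 to 30
    spane_n = sum(negative_ratings)   # ranges from 6 to 30
    return {
        "SPANE-P": spane_p,
        "SPANE-N": spane_n,
        "SPANE-B": spane_p - spane_n,  # affect balance, from -24 to 24
    }


# Hypothetical respondent with mostly positive experiences.
result = score_spane([4, 5, 4, 4, 3, 5], [2, 1, 2, 2, 1, 2])
```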
Mental Health Continuum-Short Form (MHC-SF)
The Mental Health Continuum-Short Form (Keyes et al., 2008) is a self-report 14-item questionnaire measuring the three aspects of well-being proposed by Keyes (2002): emotional (EWB), social (SWB), and psychological (PWB) well-being (e.g., “How often did you feel happy?”). Items are rated on a 6-point Likert scale indicating the frequency of experiences (never, once or twice a month, about once a week, two or three times a week, almost every day, every day) during the past month. Internal consistency reliability (Cronbach’s alpha) for the total MHC-SF scale was reported by Lamers et al. (2011) to be high (.89). In this study, Cronbach’s alpha was .90.
The Gratitude Questionnaire (GQ-6)
The GQ-6 (McCullough, Emmons, & Tsang, 2002) is a six-item self-report questionnaire that evaluates the subjective proneness to experience gratitude in everyday life (e.g., “I am grateful to a wide variety of people”). Respondents answer each item on a 7-point Likert scale (from 1 = strongly disagree to 7 = strongly agree). GQ-6 has a single-factor structure. Scores range from 6 (least grateful) to 42 (most grateful) after reversing items 3 and 6. The internal consistency reliability of the scale was reported to be α = .82 (McCullough, Emmons, & Tsang, 2002). In this study Cronbach’s alpha was .68.
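Reverse-scoring, which several of the measures in this section require, can be sketched with the GQ-6 as an example. The reversed item numbers and the 1 to 7 response range come from the description above; the function name and the example responses are hypothetical.

```python
REVERSED_ITEMS = (3, 6)        # item numbers reversed before summing, per the text
SCALE_MIN, SCALE_MAX = 1, 7    # 7-point Likert scale


def score_gq6(responses):
    """Sum six 1-7 ratings after reverse-scoring items 3 and 6 (total 6-42)."""
    if len(responses) != 6:
        raise ValueError("GQ-6 has exactly 6 items")
    total = 0
    for number, rating in enumerate(responses, start=1):
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError("ratings must lie between 1 and 7")
        if number in REVERSED_ITEMS:
            # Mirror the rating around the scale midpoint: 1 <-> 7, 2 <-> 6, ...
            rating = SCALE_MIN + SCALE_MAX - rating
        total += rating
    return total


# Hypothetical respondent.
gq6_total = score_gq6([6, 7, 2, 6, 5, 1])
```

The same mirroring rule, minimum plus maximum minus rating, applies to any Likert item that has to be reversed.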
Meaning in Life Questionnaire (MLQ)
The MLQ (Steger et al., 2006) is a 10-item measure of presence and search for meaning in life, with five items in each factor (Presence of Meaning/MLQ-P and Search for Meaning/MLQ-S). An example item is “I have a good sense of what makes my life meaningful”. Items are rated on a 7-point Likert scale (from “Absolutely True” to “Absolutely Untrue”). Possible factor scores range from 5 (lowest presence of/search for meaning) to 35 (highest) after reversing item 9. In this study Cronbach’s alpha was .78.
Satisfaction with Life Scale (SWLS)
The Satisfaction with Life Scale (Diener, Emmons, Larsen, & Griffin, 1985) is a five-item measure of perceived life satisfaction (e.g., “The conditions of my life are excellent”), rated on a 7-point Likert scale that ranges from 1 (Strongly Disagree) to 7 (Strongly Agree). The higher the score, the greater the perceived satisfaction. Possible scores vary from a minimum of 5 to a maximum of 35 (Pavot & Diener, 1993). Internal consistency reliability (Cronbach’s alpha) has been reported to range from .79 to .89 (Pavot & Diener, 1993). In this study Cronbach’s alpha was .88.
Trait Hope Scale (HS)
Trait Hope Scale (Snyder et al., 1991) is a 12-item, self-report assessment tool of dispositional hope (e.g., “I can think of many ways to get out of a jam”). HS has two distinct but related factors: Agency and Pathways (Bronk et al., 2009). Answers are rated on an 8-point Likert scale, from 1 (Definitely False) to 8 (Definitely True) with a score range from a minimum of 8 to a maximum of 64. Higher scores suggest more hopeful and resourceful individuals. Snyder et al. (1991) reported that for the total scale, Cronbach’s alphas varied from .74 to .84. In this study Cronbach’s alpha was .89.
Data were collected with the assistance of psychology students, each of whom sent a link to the electronic test battery (Google Forms®) to 15 adults from their social environment. Specifically, about 150 students from 2 different psychology courses volunteered to participate in the study in return for partial extra credit. All participants recruited by the students provided their e-mail address so that each case could be verified as unique. All the fields in the digital battery were required, to eliminate missing values. Participants were informed about the purpose of the study and the anonymity of participation. The process of data collection was the following. First, students received a short course on the administration of psychology questionnaires from the research team members. Then, a period of pilot-testing the digital test battery followed, to track any flaws in the digital procedure and to record the time required to complete the battery (approximately 15 minutes). Finally, after successful pilot testing, students were provided with a link to the official study.
2.4. Research Design
The sample was split into three parts to study the construct validity of DASS-21 and DASS-9 in three different subsamples (see Table 1). Research was carried out on two levels: 1) on three subsamples (EFA, CFA1, and CFA2) to evaluate construct validity and cross-check it; 2) on the full sample, to evaluate measurement invariance across gender, internal consistency reliability, construct reliability, AVE convergent validity, and relationships with other constructs. Specifically, in the first subsample (EFA subsample), Exploratory Factor Analysis (EFA) and Bifactor Exploratory Factor Analysis (EFA Bifactor) were carried out. Independent Cluster Model Confirmatory Factor Analysis (ICM-CFA), Bifactor Confirmatory Factor Analysis (Bifactor CFA), and Exploratory Structural Equation Modeling (ESEM) were applied in the second subsample (CFA1 subsample), testing several alternative solutions. In the third subsample (CFA2 subsample), the optimal CFA model that emerged was revalidated. Then, a multi-group CFA (MGCFA) was carried out in the entire sample (N = 2272) to test for measurement invariance across gender. Reliability analysis (α and ω) followed in the entire sample. Finally, convergent validity (with AVE) and convergent/discriminant validity through correlation analysis were examined in the total sample using measures of emotionality, well-being, positivity, and quality of life. All the above procedures were carried out for both DASS-21 and DASS-9 (see Table 1). Data were collected electronically on Google Forms® and were analyzed with SPSS Version 25 (IBM, 2017), Stata Version 14.2 (StataCorp, 2015), and MPlus Version 7.0 (Muthen & Muthen, 2012).
Table 1. Overview of the analysis procedure adopted.
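The reliability and convergent validity quantities named in the research design can be sketched with their textbook formulas. This is an illustration only, not the authors' computation (which was run in SPSS, Stata, and MPlus): the omega-type composite reliability shown here assumes standardized loadings and uncorrelated errors, an assumption that would not hold for models with error covariances, and the loadings in the example are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_columns):
    """Cronbach's alpha; item_columns holds one list of respondent scores per item."""
    k = len(item_columns)
    totals = [sum(row) for row in zip(*item_columns)]   # total score per respondent
    item_var_sum = sum(variance(col) for col in item_columns)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one factor."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Omega-type reliability, assuming standardized loadings and uncorrelated errors."""
    true_part = sum(loadings) ** 2
    error_part = sum(1 - l * l for l in loadings)
    return true_part / (true_part + error_part)


# Hypothetical standardized loadings for a three-item factor.
loadings = [0.6, 0.7, 0.8]
ave = average_variance_extracted(loadings)
omega = composite_reliability(loadings)
```

A commonly used rule of thumb treats an AVE of at least .50 as evidence of convergent validity; the text does not state which threshold, if any, was applied.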
3.1. Missing Values Analysis and Data Management
The total sample comprised N = 2272 cases. Data had zero missing values in all variables because all fields of the digital test-battery were required (see Procedure section).
To validate the DASS-21 factor structure, the total sample (N = 2272) was randomly split into three parts. Sample-splitting (Guadagnoli & Velicer, 1988; MacCallum, Browne, & Sugawara, 1996) is considered a cross-validation method of construct validity because the researcher replicates the optimal CFA model established in one sample in a second sample (Byrne, 2010; Brown, 2015). The first 20% of the total sample (N = 452) was used for EFA, the second 40% (N = 910) for a first CFA (CFA1), and the third 40% (N = 910) for a second CFA (CFA2). The purpose of the second CFA was to cross-validate the optimal model established in the CFA1 subsample in a different subsample of equal size. We have termed this analysis pattern “the 3-faced construct validation method”. The method was also implemented for DASS-9. For DASS-21, the sample-to-variable ratio (N/P) was 21.52:1 for the EFA subsample (N = 452) and 43.33:1 for each of the CFA1 and CFA2 subsamples (N = 910). Sample size for DASS-9 was also adequate (N/P = 50.22 for EFA, 101.11 for CFA1, and 101.11 for CFA2). A sample-to-variable ratio of 10:1 or greater (Osborne & Costello, 2004), or alternatively 500 to 1000 cases, is generally regarded as adequate to excellent for factor analysis (Comrey & Lee, 1992; Singh et al., 2016).
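The split sizes and sample-to-variable ratios reported above can be reproduced with a short sketch. The shuffling procedure and the seed are assumptions made for illustration; the text does not describe how the random split was actually generated, and the function names are hypothetical.

```python
import random


def three_way_split(case_ids, sizes, seed=0):
    """Randomly partition case IDs into consecutive parts of the given sizes."""
    case_ids = list(case_ids)
    if sum(sizes) != len(case_ids):
        raise ValueError("sizes must add up to the number of cases")
    random.Random(seed).shuffle(case_ids)   # seed is an arbitrary assumption
    parts, start = [], 0
    for size in sizes:
        parts.append(case_ids[start:start + size])
        start += size
    return parts


def cases_per_item(n_cases, n_items):
    """Sample-to-variable ratio (N/P), rounded to two decimals."""
    return round(n_cases / n_items, 2)


efa, cfa1, cfa2 = three_way_split(range(2272), sizes=(452, 910, 910))
ratios_dass21 = [cases_per_item(len(part), 21) for part in (efa, cfa1, cfa2)]
ratios_dass9 = [cases_per_item(len(part), 9) for part in (efa, cfa1, cfa2)]
```

All six ratios exceed the 10:1 guideline cited above.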
3.2. Univariate and Multivariate Normality
The data in all four samples (Total, EFA, CFA1, and CFA2) violated the normality assumption. Kolmogorov-Smirnov tests (Massey, 1951) on each of the DASS-21 and DASS-9 items were statistically significant (p < .001), indicating a deviation from univariate normality. Multivariate normality was evaluated with the following four tests: 1) Mardia’s multivariate kurtosis test (Mardia, 1970); 2) Mardia’s multivariate skewness test (Mardia, 1970); 3) Henze-Zirkler’s consistent test (Henze & Zirkler, 1990); and 4) the Doornik-Hansen omnibus test (Doornik & Hansen, 2008). The null hypothesis was rejected for all four tests (with all p values < .0001), suggesting a violation of multivariate normality of the DASS-21 and DASS-9 scores in all four samples (Total, EFA subsample, CFA1 subsample, CFA2 subsample).
3.3. Exploratory Factor Analysis (EFA) in DASS-21 and DASS-9
EFA and later CFA were performed with the MLR estimator, a robust maximum likelihood estimator based on rescaling (cf. MPlus 7.1, Muthen & Muthen, 2012). MLR can handle non-normal distributions and, unlike other similar methods, is able to estimate standard errors and chi-square test statistics (Wang & Wang, 2012). Additionally, MLR is appropriate for medium-sized samples (Bentler & Yuan, 1999; Muthen & Asparouhov, 2002), such as the subsamples obtained here after sample-splitting. Taking these properties into account, MLR was used as the estimator. The factors were rotated with Geomin rotation in the standard EFA model. In this study, the Exploratory Bifactor Analysis method by Jennrich and Bentler (2011) was used to test an EFA Bifactor model for both DASS-21 and DASS-9. EFA model fit was evaluated by the following criteria (Hu & Bentler, 1999; Brown, 2015): RMSEA (≤ .06, 90% CI ≤ .06), SRMR (≤ .08), CFI (≥ .95), TLI (≥ .95), and a chi-square/df ratio of less than 3 (Kline, 2016).
For both DASS-21 and DASS-9, two exploratory factor models were tested in the EFA subsample (N = 452). First, a standard EFA was performed to evaluate the fit of the original 3-factor structure (Depression, Anxiety, and Stress; Lovibond & Lovibond, 1995a) and to provide a benchmark for the EFA Bifactor (Jennrich & Bentler, 2011) model tested next. For DASS-9 in particular, EFA was carried out to establish a factor structure because DASS-9 is an empirically derived measure (with post hoc modifications by Yusoff, 2013). Second, a Bifactor EFA model was tested. Reise et al. (2007) recommended the evaluation of a Bifactor model as a generally good practice when establishing construct dimensionality (cf. Hammer & Toland, 2016). This Bifactor model had a General Distress factor and three specific factors (Depression, Anxiety, and Stress). This structure is based on the Bifactor CFA model proposed by Henry & Crawford (2005). For DASS-21, fit measures (see Table 2) stayed within the required limits, suggesting that both solutions achieved good fit. CFI and TLI were well above .95 and RMSEA was as low as .035. The DASS-21 Bifactor solution had a better Chi-square and a slightly better SRMR. Both Chi-square/df ratios were also satisfactory (about 1.55), indicating optimal fit to the data (Kline, 2016).
Table 2. EFA and bifactor EFA fit measures for DASS-21 and DASS-9.
MLR estimator, Geomin & BiGeomin Rotation.
Regarding the EFA models of the DASS-9: 1) In the standard EFA model, all fit indices indicated a very good fit. Chi-square was 34.53, Chi-square/df was 1.82, RMSEA, CFI, and TLI were also adequate, and SRMR was more than acceptable. 2) In the DASS-9 Bifactor EFA model, all measures indicated an excellent fit. Chi-square was 2.93 and Chi-square/df was .49. CFI, TLI, and RMSEA reached their best possible values (1.000, 1.000, and 0 respectively) and SRMR was .006 (see Table 2 for details). In the next phase, we examined the DASS-21 and DASS-9 factor structure with Confirmatory Factor Analysis.
3.4. Confirmatory Factor Analysis (CFA) in DASS-21 and DASS-9
CFA model fit was evaluated by the following criteria (Hu & Bentler, 1999; Brown, 2015) : RMSEA (≤ .06, 90% CI ≤ .06), SRMR (≤ .08), CFI (≥ .95), TLI (≥ .95), and the chi-square / df ratio less than 3 (Kline, 2016) .
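The fit criteria listed above amount to a simple checklist, sketched below. The function name is hypothetical, and the example values are the DASS-9 Bifactor EFA indices reported in Section 3.3; the upper bound of the RMSEA confidence interval is optional because it is not reported in the running text.

```python
def check_fit(chi2_df, rmsea, srmr, cfi, tli, rmsea_ci_upper=None):
    """Compare fit indices with the cutoffs cited in the text; True means met."""
    checks = {
        "chi2/df < 3": chi2_df < 3.0,
        "RMSEA <= .06": rmsea <= 0.06,
        "SRMR <= .08": srmr <= 0.08,
        "CFI >= .95": cfi >= 0.95,
        "TLI >= .95": tli >= 0.95,
    }
    if rmsea_ci_upper is not None:
        checks["RMSEA 90% CI upper <= .06"] = rmsea_ci_upper <= 0.06
    return checks


# Indices reported in the text for the DASS-9 Bifactor EFA model.
dass9_bifactor_efa = check_fit(chi2_df=0.49, rmsea=0.0, srmr=0.006, cfi=1.0, tli=1.0)
all_met = all(dass9_bifactor_efa.values())
```

Such cutoffs are conventions rather than strict tests, so a checklist of this kind supports, but does not replace, the comparison of loadings and factor intercorrelations.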
For DASS-21 the following models were tested (see Table 3). In MODEL 1, a unidimensional Independent Cluster Model of CFA (ICM-CFA) with all 3 factors collapsed into a General Distress factor (Lovibond & Lovibond, 1995b) was tested according to the maximum parsimony assumption (Brown, 2015). In MODEL 2, we tested a two-factor ICM-CFA model examined by Lovibond and Lovibond (1995b), with Depression in one factor and Anxiety plus Stress combined in a second factor (also tested subsequently by Henry & Crawford, 2005, and others). A variation of this model was also tested by adding error covariances (MODEL 3). In MODEL 4, we tested the original ICM-CFA model proposed by Lovibond & Lovibond (1995b), with Depression, Anxiety, and Stress as 3 factors. A variation of this model was also checked by adding error covariances (MODEL 5). In MODEL 6, the two-factor ICM-CFA model proposed by Duffy et al. (2005) was tested, with a Generalized Negativity factor (items 1, 3, 5, 6, 8-18, 20, 21) and a Physiological Hyperarousal factor (items 2, 4, 7, 19). In MODEL 7, we tested an alternative 3-factor model proposed by Duffy et al. (2005) with Anhedonia (items 3, 10, 16, 21), Physiological Hyperarousal (items 2, 4, 7, 19), and Negative Affect (items 1, 5, 6, 8, 9, 11, 12, 13, 14, 15, 17, 18, 20). MODEL 8 is the Quadripartite model proposed by Henry & Crawford (2005). This is essentially a 3-factor Bifactor Model with a General Distress factor and Depression, Anxiety, and Stress as specific factors (see B in Figure 1). MODEL 9 is the Tripartite model (Tully et al., 2009; Willemsen et al., 2011), in effect a 2-factor Bifactor Model with all items loading on a General Negative Affect factor and Depression and Anxiety as specific factors (see C in Figure 1). MODEL 10 is an ESEM model (never tested before) with the original structure proposed by Lovibond & Lovibond (1995b). MODEL 11 is a variation of MODEL 10 with error covariances added (see A in Figure 1).
We did not test a higher order model (as initially tested―but not verified―by Lovibond & Lovibond, 1995b ) because for a 3 first-order factor structure, like DASS-21, the second-order is just identified, therefore judging model improvement over the first order solution is impossible (Wang & Wang, 2012; Brown, 2015) .
Table 3. CFA Fit Statistics for DASS-21 and DASS-9.
MLR estimator was used in all models.
Figure 1. Some alternative CFA models tested for DASS-21 represented as path diagrams: (a) the 3-factor ESEM model with error covariances; (b) the 3-factor Bifactor model; and (c) the 2-factor Bifactor model.
Specifically, the fit statistics (Table 3) for DASS-21 were the following. MODEL 1 was a poor fit, with all measures failing to reach the desired limits. MODEL 2 had an unacceptable fit and its variation (MODEL 3) hardly attained the desired fit values. The original 3-factor model (MODEL 4) showed adequate fit. Its variant with error covariances (MODEL 5) showed a very good fit (Chi-square/df = 2.97, RMSEA = .046, CFI = .948, TLI = .940, and SRMR = .036). MODELS 6 and 7 had a poor fit. MODEL 8, the 3-factor Bifactor Model (Quadripartite; Henry & Crawford, 2005; see B in Figure 1), showed acceptable fit. MODEL 9, the 2-factor Bifactor Model (Tripartite; Tully et al., 2009; Willemsen et al., 2011), presented equally acceptable fit (see C in Figure 1). The two ESEM models tested (MODELS 10 and 11) also presented adequate fit. The ESEM model with error covariances (MODEL 11; see A in Figure 1) had all fit measures at more satisfactory levels (see Table 3 for all fit measures).
Finally, for DASS-21 the fit statistics of three models showed comparably optimal fit: 1) the original model with error covariances (MODEL 5; Lovibond & Lovibond, 1995b); 2) the 3-factor Bifactor model (MODEL 8; Henry & Crawford, 2005); and 3) the 3-factor ESEM model with error covariances (MODEL 11). Next, for these three competing models (see Figure 1) we examined the factor loadings and factor intercorrelations (see Table 4) to select the optimal model.
Likewise, for DASS-9 the following models were evaluated (see Table 3). In MODEL 1, we tested an ICM-CFA model with 9 items in 3 factors (Depression, Anxiety, and Stress) of 3 items each (proposed by Yusoff, 2013, based on the original model by Lovibond & Lovibond, 1995b). A variation of this model (MODEL 2) was also checked by adding error covariances. In MODEL 3, we tested a 12-item ICM-CFA model proposed by Yusoff (2013) with 3 four-item factors. In MODEL 4, we tested the DASS-9 with the Quadripartite model (Henry & Crawford, 2005), i.e., a 3-factor Bifactor model. MODEL 5 is a 3-factor ESEM model (never tested before) with the original structure proposed by Lovibond & Lovibond (1995b), and MODEL 6 is a variation of MODEL 5 with error covariances added. All models tested for DASS-9 achieved adequate fit (see Table 3). The 3-factor ESEM model showed a notably good fit.
3.5. Factor Loadings and Intercorrelations
Next, to decide on the optimal DASS-21 and DASS-9 model, we compared the factor loadings and factor intercorrelations of 1) the 3-factor ICM-CFA model with error covariances; 2) the 3-factor Bifactor model; and 3) the 3-factor ESEM model with error covariances (see Table 4).
For DASS-21, only the original 3-factor ICM-CFA model with error covariances showed acceptable factor loadings, from .554 (in Anxiety) to .784 (in Stress). For the ESEM model with error covariances added, all loadings ranged from negative values to < .3. For the 3-factor Bifactor (Quadripartite) model, loadings were also unsatisfactory, though better than in the ESEM model, ranging from .157 to .821. Moreover, establishing dimensionality on the basis of Bifactor models alone has been criticized as doubtful because such models always tend to show adequate fit (Joshanloo, Jose, & Kielpikowski, 2017; Joshanloo & Jovanovic, 2016). Factor intercorrelations (Table 4) were on average M = .83 for the original model (Lovibond & Lovibond, 1995b) and M = .71 for the ESEM with error covariances. Although ESEM by default allows cross-loadings, whereas ICM-CFA constrains them to zero (Asparouhov & Muthen, 2009; Marsh et al., 2009, 2010) and thereby inflates factor intercorrelations (Joshanloo & Jovanovic, 2016), the intercorrelations of the 3-factor ESEM model were only marginally lower than those of the original 3-factor ICM-CFA model, and its factor loadings were disappointing (Table 4).
Table 4. Comparison of Factor Loadings and Factor Intercorrelations for competing optimal models for DASS-21 and DASS-9 for sub-sample 1.
MLR estimator was used in all models.
Therefore, for DASS-21, considering fit measures, loadings and factor correlations, the original 3-factor model showed the overall optimal fit in the CFA1 subsample (N = 910). The same pattern of loadings and intercorrelations was repeated for DASS-9 (Table 4): the original 3-factor structure had the highest loadings, from .598 (Stress) to .767 (Depression), while factor loadings were unsatisfactory for both the ESEM model with error covariances and the Bifactor model. Factor intercorrelations for the DASS-9 were on average M = .85 for the original 3-factor model with covariances and M = .70 for the ESEM model with covariances. Thus, for DASS-9 as well, taking into account fit measures, loadings and factor correlations, the original 3-factor model with covariances presented the overall optimal fit in the CFA1 subsample (N = 910).
3.6. Cross-Validation of the DASS-21 and DASS-9 Factor Structure in a Different Sample
After determining that the original 3-factor ICM-CFA model with error covariances (see Figure 2) was the optimal model for DASS-21 and DASS-9, as derived from the CFA1 subsample, we cross-validated this model to verify its fit in a second subsample (CFA2, N = 910) of equal power to subsample 1 (CFA1, N = 910). This is the final step of what we call the “3-faced construct validation method”. As shown in Table 5, all fit statistics were acceptable and, for DASS-21, also very stable in comparison to the fit measures of the CFA1 subsample. See Figure 2 for a path diagram of the optimal DASS-21 and DASS-9 models, derived from the CFA2 subsample. The factor loadings and intercorrelations of these cross-validated models were generally adequate and comparable across the two subsamples for both DASS-21 and DASS-9 (also presented in Figure 2). After this successful validation of the original 3-factor model for both DASS versions, we used it as the baseline model to test measurement invariance of DASS-21 and DASS-9 across gender.
Figure 2. Path diagrams of the optimal models for DASS-21 (left) and for DASS-9 (right), derived from subsample 1 and successfully cross-validated in subsample 2, using the 3-faced construct validation method.
3.7. Measurement Invariance for DASS-21 and DASS-9
For both DASS-21 and DASS-9, we examined measurement invariance across gender in the entire sample (N = 2272). The invariance criteria used were ΔCFI ≤ −.01, and ΔRMSEA ≤ .015 (Chen, 2007) . For DASS-21 gender invariance of the 3-factor ICM CFA model was tested separately in each gender group, as a baseline model (males, N = 832 versus females, N = 1440). This model had a very good fit for males (Chi-square 477.35, Chi-square/df = 2.60, RMSEA = .044, CFI = .954) and sufficiently good for females (Chi-square 916.40, Chi-square/df = 5.00, RMSEA = .053, CFI = .941). Then, this baseline model was tested in both gender groups concurrently. This model (M1) showed acceptable fit (see Table 6), suggesting that configural invariance was supported. Then, factor loadings were constrained to equality. As shown in Table 6, both ΔCFI and ΔRMSEA for this constrained model (M2) indicated weak invariance. Then, all intercepts were forced to be equal (M3), and both ΔCFI and ΔRMSEA showed strong invariance. Finally, for the last test of measurement invariance (Wang & Wang, 2012) , error variances were constrained to equality and ΔCFI and ΔRMSEA suggested that strict measurement invariance is supported.
Table 5. Fit measures for DASS-21 and DASS-9 for the optimal model established in CFA1.
MLR estimator was used in all models.
Table 6. Fit measures of the nested models tested to validate measurement invariance.
The 3-factor original model (by Lovibond & Lovibond, 1995b ) was the baseline model. MLR estimator was used in all models.
For DASS-9, the procedure of measurement invariance across gender was repeated. The baseline model emerging from CFA2, the original 3-factor model, presented very good fit for males (Chi-square 38.14, Chi-square/df = 1.66, RMSEA = .028, CFI = .992) and sufficient fit for females (Chi-square 4.94, Chi-square/df = 4.94, RMSEA = .052, CFI = .975). Then, this model was tested concurrently in both gender groups. The fit of this model (M1) supported configural invariance (see Table 6). Then, factor loadings (MODEL 2), intercepts (MODEL 3) and error variances (MODEL 4) were consecutively constrained to equality. Model comparison indicated that ΔCFI and ΔRMSEA were satisfactory for MODEL 2 versus MODEL 1 (weak invariance), for MODEL 3 versus MODEL 2 (strong invariance) and finally for MODEL 4 versus MODEL 3 (strict invariance).
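The stepwise comparison of nested invariance models described above can be illustrated with a short sketch. It assumes the usual reading of the Chen (2007) criteria, namely that a more constrained model is retained when CFI drops by no more than .01 and RMSEA rises by no more than .015 relative to the preceding model. The class `Fit`, the function `invariance_steps` and all fit values are hypothetical and are not taken from Table 6.

```python
from dataclasses import dataclass


@dataclass
class Fit:
    name: str
    cfi: float
    rmsea: float


def invariance_steps(models, max_cfi_drop=0.01, max_rmsea_rise=0.015):
    """Compare each nested model with the preceding, less constrained one."""
    results = []
    for prev, cur in zip(models, models[1:]):
        # Round the differences so that boundary cases are not lost to
        # floating-point noise.
        d_cfi = round(cur.cfi - prev.cfi, 3)
        d_rmsea = round(cur.rmsea - prev.rmsea, 3)
        holds = d_cfi >= -max_cfi_drop and d_rmsea <= max_rmsea_rise
        results.append((cur.name, d_cfi, d_rmsea, holds))
    return results


# Hypothetical fit values for illustration only (NOT the values in Table 6).
sequence = [
    Fit("M1 configural", 0.945, 0.050),
    Fit("M2 weak (loadings)", 0.944, 0.049),
    Fit("M3 strong (intercepts)", 0.940, 0.049),
    Fit("M4 strict (error variances)", 0.936, 0.048),
]

for name, d_cfi, d_rmsea, holds in invariance_steps(sequence):
    print(name, d_cfi, d_rmsea, "supported" if holds else "not supported")
```

Each step is judged only against the model immediately before it, which mirrors the M2 versus M1, M3 versus M2 and M4 versus M3 comparisons reported in the text.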
3.8. Reliability and Validity for DASS-21 and DASS-9
The reliability and validity of DASS-21 and DASS-9 were evaluated in the entire sample (N = 2272) with the following measures: 1) Cronbach’s alpha (α; Cronbach, 1951) to calculate the internal consistency reliability of the responses. For α, values above .70 are considered acceptable, and above .80 adequate (Kline, 2000; Nunnally & Berstein, 1995). 2) McDonald’s omega reliability coefficient (ω; McDonald, 1999). Omega was used to examine construct reliability (c.f. Hoque et al., 2017). Omega can be calculated for the whole scale, to correspond to the variance accounted for by all factors, or for each latent factor (Brunner et al., 2012). Omega values greater than .70 are acceptable (Hair, Babin, & Anderson, 2010). 3) Average Variance Extracted (AVE; Fornell & Larcker, 1981). AVE is essentially a measure of convergent validity. Malhotra & Dash (2011) note that, on the basis of ω alone, the validity of a construct may appear adequate despite an error variance as high as 50%; AVE therefore complements ω as a more conservative measure. The threshold for AVE is .50 (Hair, Babin, & Anderson, 2010; Fornell & Larcker, 1981).
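To make the three coefficients concrete, the following is a minimal sketch of how they are commonly computed. It assumes standardized factor loadings and uncorrelated errors for ω and AVE, which may differ from the exact estimation used in this study; the function names and the example loadings are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """items: one list of scores per item, all from the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(pvariance(column) for column in items)
    return k / (k - 1) * (1 - item_variance / pvariance(totals))


def omega(loadings):
    """Composite reliability from standardized loadings (uncorrelated errors assumed)."""
    true_part = sum(loadings) ** 2
    error_part = sum(1 - l ** 2 for l in loadings)
    return true_part / (true_part + error_part)


def ave(loadings):
    """Average Variance Extracted: the mean squared standardized loading."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Hypothetical 3-item factor with loadings of .70 (not study data).
example_loadings = [0.70, 0.70, 0.70]
print(round(omega(example_loadings), 3), round(ave(example_loadings), 3))
```

With three loadings of .70, ω is about .74 and passes the .70 threshold, while AVE is .49 and falls just short of .50. This illustrates why AVE is treated as the more conservative of the two measures.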
Overall internal reliability for the entire DASS-21 was substantial and for each factor significant (M = .89). Overall alpha for DASS-9 was adequate and alphas per factor were also adequate (M = .76); see Table 7. For the total DASS-21, omega was equally substantial and for each factor it was on average M = .81, indicating that the mean percentage of variance explained by each DASS-21 factor score is 81%. For the total DASS-9, overall omega was also substantial (.91) and for each DASS-9 factor it was on average, M = .76, meaning that the mean percentage of variance explained by each DASS-9 factor score is 76%. Regarding the AVE for DASS-21, all values were acceptable, M = .53. For DASS-9 Mean AVE was marginally sufficient, M = .50 (Table 7).
3.9. Correlation Analysis of DASS-21 & DASS-9
The relationship of DASS-21 and DASS-9 with other measures was examined in the entire sample (N = 2272) to further evaluate discriminant and convergent validity. Dimensions evaluated include negative/positive emotionality (SPANE), overall mental well-being (WEMWBS), more specific well-being aspects like SWB, PWB and EWB from the MHC-SF, resilience to bounce back from stress and setbacks (BRS), flourishing (FS), satisfaction with life (SWLS), dispositional hope (HOPE-Agency and HOPE-Pathways) and finally presence of and search for meaning in life (MLQ). We also examined the relationship of depression, anxiety and stress with gratitude (GQ-6), and with aspects of life quality like Physical health, Psychological health, Social Relations and Environment (WHOQOL-BREF).
Table 7. Reliability and AVE convergent validity of DASS-21 and DASS-9.
Regarding Stress, it was moderately negatively correlated with SPANE Positive Experiences (r = −.40 for DASS-21 and −.35 for DASS-9) and highly positively correlated with SPANE Negative Experiences (r = .52 for DASS-21 and .45 for DASS-9), all p < .01. The opposite was true for Affect Balance. Stress had negative, moderate to strong correlations with well-being measures, DASS-21 from r = −.24 (MHC-SF Social well-being) to −.44 (WEMWBS), Mean r = −.33, DASS-9 from r = −.22 (MHC-SF Social well-being) to r = −.38 (WEMWBS), Mean r = −.29, all p < .01. Stress showed equally moderate to strong negative correlations with all other measures of positivity, DASS-21 from r = −.21 (Gratitude) to r = −.40 (Brief Resilience Scale), followed by HOPE (r = −.28) with all p < .01, Mean r = −.23. DASS-9 showed a similar pattern, Mean r = −.19. MLQ search was the only exception (all p < .01). Finally, Stress was moderately, negatively and significantly correlated with all aspects of life quality (for DASS-21 Mean r = −.37, for DASS-9 Mean r = −.33), all p values < .01.
Anxiety correlations with SPANE ranged, for DASS-21, from moderately negative, r = −.32 (SPANE-P), to strongly positive, r = .45 (SPANE-N), all p values < .01; for DASS-9 they were in the same direction but of lower magnitude (Table 8). The correlations of Anxiety with well-being dimensions were all negative and significant (ps < .01), ranging for DASS-21 from r = −.18 (MHC-SF Social WB) to r = −.37 (WEMWBS), Mean r = −.28, repeating the pattern that emerged for Stress; DASS-9 showed the identical pattern, from r = −.17 to r = −.36, Mean r = −.26. Regarding the other positivity measures, for DASS-21 Anxiety also presented a correlation scheme similar to that of Stress, with significant moderate negative correlations with all dimensions (M = −.21), except MLQ Search for Meaning. Anxiety, like Stress, also had an almost strong negative correlation of r = −.40 with resilience, followed by MLQ-Presence (r = −.28) and HOPE (r = −.25). For DASS-9 a similar pattern emerged with resilience (r = −.41), then MLQ-Presence (r = −.27) and then HOPE (r = −.26). Anxiety was also moderately, negatively and significantly correlated (all p values < .01) with all aspects of life quality (DASS-21, Mean r = −.33; DASS-9, Mean r = −.30).
Table 8. Correlation analysis of DASS-21 & DASS-9.
All p values were < .01. Correlations for the total DASS-21 and DASS-9 were not included since DASS offers 3 distinct scores for the 3 measured constructs (Depression, Anxiety and Stress). Bold indicates correlations of equivalent factors of DASS-21 and DASS-9.
Depression was strongly negatively correlated with SPANE Positive Experiences and with Affect Balance. The opposite was true for SPANE Negative Experiences, for both DASS-21 and DASS-9. Depression had strong, negative and significant correlations with all Well-Being dimensions, for DASS-21 Mean r = −.45 and for DASS-9 Mean r = −.42 (p values < .01). Social WB again had the lowest correlation with Depression and WEMWBS the highest, for both DASS versions. Regarding the other positivity constructs, all correlations with Depression were significant (p < .01), moderate to strong and negative (for DASS-21 M = −.33 and for DASS-9 M = −.31), except for MLQ Search for Meaning. Like Anxiety and Stress, Depression had its strongest negative correlation (r = −.45) with resilience, followed by Presence of life meaning and Hope, and its weakest with Gratitude. The same pattern was observed for both DASS versions. Depression also had moderate to strong negative correlations (p values < .01) with all aspects of life quality (DASS-21, Mean r = −.44; DASS-9, Mean r = −.40). The results of the correlation analysis for DASS-21 are presented in Table 8.
3.10. Correlations of DASS-21 with DASS-9
The correlations of the DASS-21 factors with the DASS-9 factors were examined in the entire sample (see Table 8). DASS-21 Stress was highly positively correlated with DASS-9 Stress (r = .92, p < .01). Similarly, Anxiety and Depression of DASS-21 showed a remarkably strong, positive and significant correlation with their equivalent DASS-9 factors (r = .90 and r = .93 respectively), all p values < .01 (see Table 8 for details).
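As an illustration of how such long-form versus short-form correlations can be obtained, the sketch below sums item responses into subscale scores and computes Pearson's r between them. The DASS-21 Depression key shown is the conventional one and should be verified against the scale manual; the 3-item subset standing in for the DASS-9 Depression factor is a hypothetical placeholder, and the responses are simulated rather than study data.

```python
import random
from math import sqrt


def subscale_score(responses, items):
    """Sum one respondent's answers (dict: item number -> 0..3) over the given items."""
    return sum(responses[i] for i in items)


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Conventional DASS-21 Depression items (verify against the scale manual).
DEPRESSION_21 = (3, 5, 10, 13, 16, 17, 21)
# HYPOTHETICAL placeholder for the short-form items; not the actual DASS-9 key.
DEPRESSION_9 = (3, 10, 13)

# Simulated responses on the 0-3 rating scale (not study data).
rng = random.Random(0)
sample = [{i: rng.randint(0, 3) for i in DEPRESSION_21} for _ in range(200)]

long_scores = [subscale_score(person, DEPRESSION_21) for person in sample]
short_scores = [subscale_score(person, DEPRESSION_9) for person in sample]
r = pearson_r(long_scores, short_scores)
print(round(r, 2))
```

If the short-form items are a subset of the long-form items, as assumed in this sketch, part of the resulting correlation reflects the shared items themselves.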
3.11. Descriptive Statistics of the Scores for DASS-21 & DASS-9
For each DASS-21 subscale, the percentage of participants falling in each severity category (normal, mild, moderate, severe and extremely severe) is listed in Table 9. The percentages of participants categorized as “normal” for Depression, Anxiety and Stress were 58.71%, 60.61% and 57.61%, respectively.
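The categorization into severity bands can be sketched as follows. The cut-offs in the code are the commonly published DASS values on the full-length metric, and the doubling of DASS-21 subscale sums before applying them is the conventional scoring rule; both are assumptions here, and Table 9 remains the authoritative source for the ranges used in this study. The function names and example scores are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Lower bounds of the mild, moderate, severe and extremely severe bands on the
# full-length DASS metric (assumed conventional values; check against Table 9).
CUTOFFS = {
    "Depression": (10, 14, 21, 28),
    "Anxiety": (8, 10, 15, 20),
    "Stress": (15, 19, 26, 34),
}
LABELS = ("normal", "mild", "moderate", "severe", "extremely severe")


def severity(subscale, dass21_sum):
    """Classify a DASS-21 subscale sum after doubling it to the full-length metric."""
    return LABELS[bisect_right(CUTOFFS[subscale], 2 * dass21_sum)]


def category_percentages(subscale, dass21_sums):
    """Percentage of respondents falling in each severity band."""
    counts = Counter(severity(subscale, s) for s in dass21_sums)
    n = len(dass21_sums)
    return {label: 100 * counts[label] / n for label in LABELS}


# Hypothetical Depression sums for eight respondents (not study data).
example = [0, 2, 4, 5, 6, 8, 11, 15]
print(category_percentages("Depression", example))
```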
Table 9. Summary statistics for DASS-21 in the total sample.
*Range defined according to Lovibond and Lovibond’s (1995a) cut-offs for each DASS category.
Table 10. Summary statistics and raw scores of DASS-9 converted to percentiles.
Regarding DASS-9, the means and ranges for the Stress, Anxiety and Depression factors are presented in Table 10 for the total sample (N = 2272). Means cannot be used to estimate individual scores, given the non-normality of the data. Therefore, Table 10 was included for the conversion of DASS-9 scores on Stress, Anxiety and Depression to percentiles. Fifty percent of the respondents scored ≤ 3, ≤ 1 and ≤ 1 on DASS-9 Stress, Anxiety and Depression, respectively (see Table 10).
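A raw-score-to-percentile conversion of this kind can be sketched as a cumulative frequency table. The sketch assumes that a percentile is defined as the percentage of respondents scoring at or below a given raw score, which may not match the exact definition behind Table 10; the function name and the scores are hypothetical.

```python
from collections import Counter


def percentile_table(scores):
    """Map each observed raw score to the percentage of respondents at or below it."""
    counts = Counter(scores)
    n = len(scores)
    table = {}
    cumulative = 0
    for raw in sorted(counts):
        cumulative += counts[raw]
        table[raw] = 100 * cumulative / n
    return table


# Hypothetical subscale scores for ten respondents (not study data).
example_scores = [0, 0, 1, 1, 1, 2, 3, 3, 5, 9]
table = percentile_table(example_scores)
for raw, pct in table.items():
    print(raw, pct)
```

In this toy sample half of the respondents score 1 or lower, so a raw score of 1 corresponds to the 50th percentile. Skewed score distributions of this kind are the reason percentiles are more informative than means for locating an individual score.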
This study sought to examine the dimensionality of two different versions of the DASS questionnaire, namely the DASS-21 and the DASS-9. The factor structure, measurement invariance across gender, internal consistency reliability, construct reliability and convergent/discriminant validity were studied for the two versions. All analyses were executed twice, first for DASS-21 and then for DASS-9. Methods used in examining the factor structure of the two DASS versions included EFA, Bifactor EFA, Bifactor CFA, ICM-CFA and ESEM. The sample was split into three subsamples to ensure the validity of the findings through a cross-validation process we termed the “3-faced construct validation method”.
Dimensionality was first examined in a 20% subsample with standard EFA and Bifactor EFA. As a rule, when examining the dimensionality of psychological constructs, relying on an EFA measurement model is usually a prerequisite (Howard, Gagne, Morin, Wang, & Forest, 2016). EFA results suggested that all DASS-21 and DASS-9 EFA models tested achieved a good fit. In the next phase, we examined the DASS-21 and DASS-9 factor structure with CFA in the second 40% of the sample. We tested 11 alternative models for DASS-21 and 6 for DASS-9. Among the DASS-9 alternative models tested, we included a 12-item version of DASS-21 (proposed by Yusoff, 2013).
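The three-way split underlying this procedure (roughly 20% for EFA and two 40% subsamples for CFA and cross-validation) can be sketched as a single random partition. The CFA subsample sizes of 910 are those reported in the text; the EFA size of 452 is obtained here by subtraction from N = 2272 and is an assumption, as are the function name and the fixed seed.

```python
import random


def split_sample(ids, sizes, seed=0):
    """Randomly partition respondent ids into named, non-overlapping subsamples."""
    ids = list(ids)
    if sum(sizes.values()) != len(ids):
        raise ValueError("subsample sizes must add up to the sample size")
    random.Random(seed).shuffle(ids)
    parts, start = {}, 0
    for name, size in sizes.items():
        parts[name] = ids[start:start + size]
        start += size
    return parts


# 452 is derived as 2272 - 910 - 910 (an assumption, not a reported figure).
parts = split_sample(range(2272), {"EFA": 452, "CFA1": 910, "CFA2": 910})
print({name: len(members) for name, members in parts.items()})
```

Because every respondent lands in exactly one subsample, the model selected in CFA1 is later tested on data that played no part in selecting it.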
Among the alternative CFA structures tested for DASS-21, three models showed comparably good fit: 1) the original model (Lovibond & Lovibond, 1995b) with error covariances (items 4 with 9 and 15 with 20); 2) the 3-factor Bifactor model (Quadripartite; Henry & Crawford, 2005); and 3) the 3-factor ESEM model with error covariances in items 9 - 15. After a closer look at the factor loadings and factor intercorrelations of these three competing models, we excluded the 3-factor Bifactor model (or Quadripartite; Henry & Crawford, 2005) and the ESEM model (Figure 1) because their factor loadings were unsatisfactory.
DASS-9 results presented a similar overarching scheme, with adequate fit measures for the 3 factor ICM CFA model (proposed by Yusoff, 2013 ) with covariances but disappointing factor loadings for the 3-factor Bifactor and the 3-factor ESEM model with covariances. Thus, all above considered, the original 3-factor ICM CFA structure (Lovibond & Lovibond, 1995b) presented the optimal fit for both DASS-21 and DASS-9. The 12 item DASS showed acceptable fit statistics, but we focused on DASS-9 because it is more parsimonious, being briefer.
Subsequently, we cross-validated the two optimal models (one for each DASS version) that emerged from the first CFA by performing a second CFA in a different subsample of equal power (40%). This is the last step of the “3-faced construct validation method”. Considering the cross-validation findings (fit measures, factor loadings and factor intercorrelations), the original 3-factor ICM-CFA solution proposed by Lovibond & Lovibond (1995b) showed the best overall fit for both DASS-21 (with error covariances in items 4 with 9 and 15 with 20; Figure 2) and DASS-9 (with an error covariance in items 7 with 15; Figure 2). This finding confirms previous research providing support for the 3-factor structure of the DASS-21 (Antony, Bieling, Cox, Enns, & Swinson, 1998; Brown, Chorpita, Korotitsch, & Barlow, 1997; Clara, Cox, & Enns, 2001; Henry & Crawford, 2005; Sinclair et al., 2012; Oei et al., 2013; Camacho et al., 2016). The same is true for DASS-9. Brown et al. (1997) and Henry and Crawford (2005) alternatively proposed that the three factors of negative emotionality measured by the DASS correspond to the Tripartite model by Clark & Watson (1991). Specifically, DASS Depression parallels the anhedonia of the tripartite model, Anxiety parallels Physiological Hyperarousal and Stress parallels Negative Affect (c.f. Willemsen et al., 2011; Yıldırım et al., 2018). This alternative model (Duffy et al., 2005) was tested but was not verified in our data.
The second-best fit for both DASS-21 and DASS-9 was achieved by the 3-factor Bifactor model, which corresponds to the Quadripartite model proposed by Henry & Crawford (2005) and by others (Le et al., 2017; Tran et al., 2013; Gomez et al., 2014; Shaw et al., 2016; Randall et al., 2017). A 2-factor variation was also proposed by Tully et al. (2009), and further supported by Osman et al. (2012) and Willemsen et al. (2011). However, establishing the dimensionality of a construct on Bifactor analysis alone (Schmid & Leiman, 1957; c.f. Reise, 2012) is often questionable in empirical research (Joshanloo & Lamers, 2016). This rationale seems to be confirmed in our case, since the fit of both the Bifactor models and the ESEM models was good but their loadings were unacceptable. Moreover, although ICM-CFA models with cross-loadings constrained to zero typically result in inflated factor correlations (Asparouhov & Muthen, 2009; Marsh et al., 2009, 2010), both the ESEM and the Bifactor model for the two DASS versions had only marginally lower factor intercorrelations than the original 3-factor ICM-CFA model (our optimal model). The intercorrelations of the optimal 3-factor model (with covariances) were strong but within acceptable limits (< .85; Brown, 2015). They were also marginally above the .5 - .7 range reported in the literature for analogous models (Shaw et al., 2016).
The pattern of optimal models that emerged for both DASS-21 and DASS-9 is analogous. Consistent with the good fit of DASS-9, previous empirical research has proposed shorter alternatives to DASS-21, either explicitly (Yusoff, 2013) or by suggesting revision of 12 items (Osman et al., 2012). In addition, the unidimensional and dual-factor structures with collapsed factors showed inadequate fit, consistent with research by others (Szabo, 2010; Osman et al., 2012; Tully et al., 2009; Daza et al., 2002). The good fit of the original model suggests that DASS-21 and DASS-9 tap a multidimensional construct, best measured by discrete Depression, Anxiety, and Stress factors (Randall et al., 2017).
Both DASS-21 and DASS-9 were gender invariant. Factor loadings, item intercepts and error variances were found to be invariant across gender, suggesting that strict measurement invariance can be supported. This means that the DASS can be used in both genders without concern about response bias, a significant issue in depression research (c.f. Brown, 2015).
Reliability was more than adequate for the total DASS-21, suggesting that the 21 items are answered consistently. The three factors of DASS-21 presented equally high internal consistency, as high as or higher than that of the original DASS-21 (Lovibond & Lovibond, 1995a, 1995b) or of other versions proposed in the empirical literature (e.g. Antony et al., 1998; Willemsen et al., 2011; Sinclair et al., 2012; Osman et al., 2012; Fox et al., 2017). DASS-9 also showed more than adequate internal reliability, both overall and for each factor, despite the lower number of items per factor, showing that the 9 items possess internal consistency, which was higher than the Cronbach’s alpha reported by Yusoff (2013). Omega reliability was equally substantial for both versions and, for DASS-21, comparable to other research findings that suggested the original model as optimal (Camacho et al., 2016). Convergent validity measured by Average Variance Extracted was also adequate.
To further examine the convergent and discriminant validity of DASS-21 and DASS-9, the following four groups of psychological constructs were evaluated: 1) Emotionality (SPANE; Diener et al., 2009), 2) Well-being, 3) Other positive psychology measures that empirically are negatively associated with mental distress (resiliency and hope) and 4) Quality of life (Tennant et al., 2007). Specifically, we tested 19 dimensions in 10 different questionnaires. A consistent pattern of relations emerged. Depression showed the strongest relations with all other constructs measured, followed by Anxiety and then by Stress. In the Well-Being correlations group, among the 7 different dimensions included, Social well-being (MHC-SF; Keyes, 2002) was at the lowest end of the range and WEMWBS (Tennant et al., 2007) at the highest, for all 3 DASS factors. Regarding the correlations in the positivity constructs group, Depression had strong negative correlations with them, while Anxiety and Stress had negative correlations of moderate to strong magnitude. Gratitude (GQ6; McCullough et al., 2002) was consistently at the lowest end of the range and resilience consistently at the highest, followed by either Hope, Hope Agency (Snyder et al., 1991) or Presence of life meaning (MLQ; Steger, 2006). Stress and Anxiety were moderately, negatively correlated with Quality of life (WHOQOL-BREF; WHO, 2004), while Depression had a moderate to strong negative correlation with life quality. In sum, Depression showed a correlational pattern equivalent to that of Stress and Anxiety, with wider and stronger correlations. The prominently high negative correlations of resilience with the mental distress that DASS quantifies have been studied extensively in the empirical literature (Fredrickson, Tugade, Waugh, & Larkin, 2003; Connor & Davidson, 2003; Smith et al., 2008; Fredrickson et al., 2008) and were verified by the present research.
The same is true for the construct of Hope (Snyder, 1994, 2002), which had the second highest negative relationship with mental distress after resilience. Crucially, this pattern was repeated for all three of the DASS-21 and DASS-9 factors. Probably the most noteworthy finding that emerged from the correlation analysis is the notably high relation of the DASS-21 factors with their equivalent DASS-9 factors. Overall, the DASS-21 findings on convergent and discriminant validity are generally in line with other research findings (Antony, Bieling, Cox, Enns, & Swinson, 1998; Brown, Chorpita, Korotitsch, & Barlow, 1997; Clara, Cox, & Enns, 2001; Henry & Crawford, 2005).
In this study, after evaluating a total of 17 alternative models for DASS-21 and DASS-9, we have addressed the issue of construct validity and strict measurement invariance for both versions of the instrument. We concluded that both DASS-21 and DASS-9 confirmed the original 3-factor model proposed by Lovibond & Lovibond (1995b). Additionally, DASS-21 and DASS-9 are measurement invariant across gender. DASS-21 has been a time-proven, valuable instrument with sound psychometric properties. Our findings confirmed its merit, suggesting at the same time that DASS-9 has equally sound psychometric properties in comparison to DASS-21. Moreover, DASS-9 is more parsimonious, being shorter, and thus valuable when fast screening is required (Yusoff, 2013).
Nevertheless, this study has certain limitations that should be taken into account when interpreting the results. First, psychology students participated in data collection. The effects of this method are unknown. Second, DASS-9 findings are promising but additional validation is required in different samples and cultural settings. All above limitations considered, the present study contributed to the research of depression, anxiety and stress in Greece by 1) validating the 3-factor structure of DASS; 2) providing evidence that DASS-21 and DASS-9 have sound psychometric properties; and 3) showing that both are measurement invariant across gender.