Differences between children appear very early in life. One such difference is children’s temperament. It has been an important clinical and research issue. However, the definition of temperament is debatable: many researchers have defined it differently (Goldsmith, Buss, Plomin, Rothbart, Thomas, Chess, Hinde, & McCall, 1987). Most of the research on child temperament has been focused on dimensions of temperament. Factor analysis of rating scales of temperament has yielded several factors. Whereas the majority of studies on temperament are variable-centered, temperament structure has rarely been challenged from a person-centered perspective (i.e., typology of temperament). A seminal report by Thomas & Chess (1977) identified three types: easy, difficult, and slow-to-warm-up. Easy babies are cheerful, easy to calm, and able to adjust to new situations without difficulty. Difficult babies are slow to adjust to a new experience and react negatively and intensely. Slow-to-warm-up babies are difficult at first but gradually become easier.
Although Thomas & Chess’s (1977) proposal has gained worldwide recognition, little empirical evidence has been presented to support it. Caspi & Silva (1995) used the scores of three temperament factors (lack of control, approach, and sluggishness) in a cluster analysis of over 800 3-year-old children and identified five clusters, i.e., groups of children: under-controlled, inhibited, confident, reserved, and well-adjusted. A related approach based on Q-sort patterns (Asendorpf & van Aken, 1999) identified three prototypic patterns. Robins, John, Caspi, Moffitt, and Stouthamer-Loeber (1996) analyzed California Child Q-Set (CCQ) data from children aged 12 to 13 years by Q-factor analysis. This yielded three types: overcontrollers, undercontrollers, and resilients. Aksan et al. (1999), in a multi-wave study (1, 4, and 12 months; and 2, 3, and 4 years), used configural frequency analysis and obtained two types: controlled-nonexpressive and noncontrolled-expressive. In Sanson et al.’s (2009) study, 200 children were assessed on four occasions (4-8 months, 1-2 years, 2-3 years, and 3-4 years) with different scales (Revised Infant Temperament Questionnaire, Toddler Temperament Scale, and Childhood Temperament Questionnaire). A hierarchical cluster analysis with a dendrogram showed that a 4-cluster model was the best. Subsequently, k-means analyses with 3- to 6-cluster models were used to measure distances between cluster centers. Again, a 4-cluster model (nonreactive/outgoing, high attention regulation, poor attention regulation, and reactive/inhibited) was found to be the best. Prokasky et al. (2017) examined three samples of children (n = 96, 187, and 757) aged around 4 years with the Child Behavior Questionnaire (CBQ: Rothbart et al., 2001). Seven subscales (activity, anger, approach, fear, shyness, attention focusing, and inhibitory control) were entered into a hierarchical cluster analysis using Ward’s method with squared Euclidean distance as the measure of distance between cases. The best model was identified by comparing k-means cluster analyses and was defined as the one that showed the most similar patterns across samples. As a result, a 6-cluster model was identified as the best: unregulated, reactive, bold, subdued, regulated, and inhibited. These studies have not yet arrived at a consensus on the best temperamental typology, possibly because of, among other reasons, the use of different temperament measures and different clustering methods.
One of the statistical tools used to identify types according to individual differences is cluster analysis (Borgen & Barnett, 1987). The most widely used clustering algorithms, agglomerative hierarchical cluster analysis and k-means cluster analysis, however, suffer from methodological drawbacks. The former leaves the appropriate number of clusters ambiguous, whereas the latter requires the researcher to fix the number of clusters a priori. In a two-step cluster analysis, the number of clusters is determined automatically rather than by the researcher’s subjective judgment. First, an initial number of clusters is estimated by means of the Schwarz Bayesian Criterion or the Akaike Information Criterion. This estimate is then refined by finding the largest increase in distance between the two closest clusters at each stage of hierarchical clustering. Two-step cluster analysis has recently been used by social science researchers (e.g., Satish & Bharadhwaj, 2010).
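For reference, the two information criteria named above are conventionally defined as follows. The symbols are introduced here only for illustration and do not come from the original text: \hat{L} is the maximized likelihood of a candidate cluster model, k is its number of free parameters, and N is the number of cases. Lower values indicate a better trade-off between fit and model complexity.

```latex
\mathrm{BIC} = -2 \ln \hat{L} + k \ln N,
\qquad
\mathrm{AIC} = -2 \ln \hat{L} + 2k
```

Because \ln N exceeds 2 once N is 8 or more, the BIC penalizes additional clusters more heavily than the AIC in samples of realistic size.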
We report here a study of temperamental typology of Japanese toddlers using the EASI survey and a two-step cluster analysis. We also examine the validity of the typology in terms of the children’s internalized and externalized behavior problems.
2.1. Study Procedures and Participants
The present study was an internet-based survey conducted with the cooperation of Rakuten Insight Inc. (Shibuya, Tokyo). The target population was 3- to 4-year-old Japanese children. Parents living with their 3- to 4-year-old (36 to 59 months) child were solicited from the 47 prefectures of Japan. From a total of over 400,000 Rakuten internet members, 246,578 had children and thus were invited to participate in the survey. Inclusion criteria were that 1) the participants were daily caregivers of their child, 2) their primary language was Japanese, and 3) they had resided in Japan since the child’s birth. Using screening questions and Rakuten’s monitoring system, those who did not meet the inclusion criteria and those who had given false answers in past online surveys were excluded. Eligible parents were selected on a first-come, first-served basis. A total of 900 parents, including 531 mothers and 369 fathers, were invited. Their mean (SD) age was 37.6 (5.5) years. Boys (n = 465) and girls (n = 435) were almost evenly distributed. Regarding birth order, 481 children were first-born, 322 second-born, and 84 third-born. The children’s mean (SD) age was 47.7 (6.3) months: 48.1 (6.3) months for boys and 47.2 (6.3) months for girls. The incentive was electronic money points that could be used for internet shopping. The study was conducted from April 28 to May 8, 2018.
The EASI Survey consists of 20 items rated on a 5-point scale (from 0, “a little,” to 4, “a lot”) that measure four temperament dimensions: Emotionality (E), Activity (A), Sociability (S), and Impulsivity (I) (Buss & Plomin, 1975, 1984). Emotionality concerns unpleasant emotions such as distress, fear, and anger. Activity is a person’s energy output and is thus equivalent to movement. Sociability is the only temperament that has a directional component: seeking out other people, preferring their presence, and responding to them. Impulsivity reflects sensation seeking and a lack of inhibitory control, decision time, and persistence (Ohashi & Kitamura, 2017). One of us (TK) translated the EASI into Japanese with permission from the original authors. Our previous study demonstrated that the original 4-factor structure of the instrument, using a selected number of EASI items (3 items each for E, A, and S and 5 items for I) and with a general factor combining E and I, showed an acceptable fit to the data (Ohashi & Kitamura, 2019) (Table 1). This model also satisfied measurement and structural invariance between fathers and mothers, boys and girls, 3- and 4-year-olds, and times 1 and 2. Accordingly, we used the modified EASI in the present analysis. We calculated each subscale score by summing the scores of the items belonging to that factor.
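The scoring rule described above (summing the 0 to 4 ratings of the items that belong to each factor) can be illustrated with a minimal sketch. The item identifiers below (`e1`, `a1`, and so on) are hypothetical placeholders and not the actual EASI item numbers; only the item counts per subscale (3, 3, 3, and 5) follow the text.

```python
from typing import Dict, List

# Hypothetical item-to-subscale mapping: identifiers are placeholders,
# only the number of items per subscale (3, 3, 3, 5) follows the text.
SUBSCALE_ITEMS: Dict[str, List[str]] = {
    "E": ["e1", "e2", "e3"],
    "A": ["a1", "a2", "a3"],
    "S": ["s1", "s2", "s3"],
    "I": ["i1", "i2", "i3", "i4", "i5"],
}


def score_easi(responses: Dict[str, int]) -> Dict[str, int]:
    """Sum the 0-4 item ratings into one score per subscale."""
    scores: Dict[str, int] = {}
    for subscale, items in SUBSCALE_ITEMS.items():
        ratings = [responses[item] for item in items]
        # Reject ratings outside the 5-point range described in the text.
        if any(r < 0 or r > 4 for r in ratings):
            raise ValueError(f"rating out of range 0-4 in subscale {subscale}")
        scores[subscale] = sum(ratings)
    return scores


# Invented example respondent: every item rated 2, except one E item rated 4.
example = {item: 2 for items in SUBSCALE_ITEMS.values() for item in items}
example["e1"] = 4
print(score_easi(example))  # {'E': 8, 'A': 6, 'S': 6, 'I': 10}
```

With this scheme the E, A, and S scores range from 0 to 12 and the I score from 0 to 20, so the subscales are not directly comparable in raw units.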
The Japanese version (Funabiki & Murai, 2017) of the Child Behavior Checklist for Ages 1½-5 (CBCL/1½-5: Achenbach & Rescorla, 2000) was used to measure the child’s psychopathology: internalized and externalized behavior problems. It includes 100 problem items: 99 closed items and one open-ended item that asks the respondent to add any problems not listed. The instrument covers an empirically derived range of behavioral, emotional, and social functioning problems. Following the instruction guide, we calculated internalized and externalized behavior problem scores from the 99 closed items.
2.3. Data Analysis
Cluster analysis is a technique for classifying cases into groups that are homogeneous within themselves and heterogeneous between each other, based on the characteristics in question (Borgen & Barnett, 1987). Each such group is called a cluster. Unlike other clustering techniques such as k-means and hierarchical cluster analyses, which only deal with continuous variables, a two-step cluster analysis can create clusters based on both categorical and continuous variables (Satish & Bharadhwaj, 2010). In a k-means analysis, the number of clusters is predetermined by the researcher. In a hierarchical cluster analysis, the nearest cases are combined sequentially, and a large increase in the distance between clusters from one stage to the next is a sign that the number of clusters just before that “jump” gives the best cluster model. A two-step cluster analysis, by contrast, selects the number of clusters automatically. The procedure starts with the construction of a cluster features tree that creates “nodes” containing multiple cases. In the second step, agglomerative clustering is used to produce a range of solutions, and the maximum possible number of clusters is determined automatically. The best cluster model is then chosen as the one with the largest increase in distance (measured by Schwarz’s Bayesian Criterion or the Akaike Information Criterion) between the two closest cluster models at each stage of the hierarchical clustering (Sarstedt & Mooi, 2014; SPSS, 2001). Two-step cluster analysis can also handle large data files efficiently.
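The “largest jump” idea described above can be illustrated with a minimal sketch. It is not the SPSS two-step procedure: it assumes plain agglomerative clustering with Ward’s merge cost as the distance, uses invented two-dimensional toy data, and all function names are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Mean of a list of points, coordinate by coordinate."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def ward_cost(a: List[Point], b: List[Point]) -> float:
    """Increase in within-cluster sum of squares if a and b are merged."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(centroid(a), centroid(b)))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def merge_history(points: List[Point]) -> List[Tuple[int, float]]:
    """Agglomerate until one cluster remains.

    Returns (number of clusters before the merge, cost of the merge) pairs.
    """
    clusters: List[List[Point]] = [[p] for p in points]
    history: List[Tuple[int, float]] = []
    while len(clusters) > 1:
        # Find the cheapest pair to merge (i < j by construction).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: ward_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        history.append((len(clusters), ward_cost(clusters[i], clusters[j])))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return history


def clusters_before_biggest_jump(history: List[Tuple[int, float]]) -> int:
    """Number of clusters present just before the largest rise in merge cost."""
    best_k, best_jump = 1, float("-inf")
    for (_, cost_prev), (k_next, cost_next) in zip(history, history[1:]):
        jump = cost_next - cost_prev
        if jump > best_jump:
            best_k, best_jump = k_next, jump
    return best_k


# Invented toy data: three tight, well-separated groups of three points each.
toy_points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]
print(clusters_before_biggest_jump(merge_history(toy_points)))  # 3
```

In this toy example the within-group merges all cost less than 1, whereas the first merge of two whole groups costs about 300, so the heuristic stops at three clusters. The two-step method replaces this visual or ad hoc judgment with a fixed statistical rule.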
2.4. Ethical Considerations
This study was approved by the Institutional Review Board (IRB) of the Kitamura Institute of Mental Health Tokyo (No. 2018120801).
A two-step cluster analysis yielded 4 clusters. We performed a one-way analysis of variance (ANOVA) on the scores of the 4 EASI subscales. All 4 EASI subscale scores differed significantly (p < .001) between the clusters (Table 2). The first cluster consisted of 288 children. They were characterized by the highest S scores and mildly high A and I scores. The second cluster consisted of 179 children. They were characterized by extraordinarily low E scores, the lowest A and I scores, and mildly high S scores. The third cluster consisted of 288 children. While their I and E scores were at almost the same level as those of the first cluster, they were characterized by mildly low A and S scores. The last cluster consisted of 145 children. They were characterized by the highest E, A, and I scores and the lowest S scores. We interpreted the first, second, third, and fourth clusters as Average-Active, Regulated, Average-Quiet, and Sensitive/Hyperreactive, respectively (Figure 1).
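As an illustration of the one-way ANOVA used to compare subscale scores across clusters, the following minimal sketch computes the F statistic from between-cluster and within-cluster sums of squares. The function name is hypothetical and the scores are invented toy values, not data from this study.

```python
from typing import Dict, List, Tuple


def one_way_anova(groups: Dict[str, List[float]]) -> Tuple[float, int, int]:
    """Return (F statistic, between-groups df, within-groups df)."""
    all_values = [v for values in groups.values() for v in values]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        # Variation of the group mean around the grand mean.
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        # Variation of individual scores around their own group mean.
        ss_within += sum((v - group_mean) ** 2 for v in values)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented toy scores for four clusters (three children each).
toy_scores = {
    "cluster_1": [4.0, 5.0, 6.0],
    "cluster_2": [1.0, 2.0, 3.0],
    "cluster_3": [4.0, 5.0, 6.0],
    "cluster_4": [7.0, 8.0, 9.0],
}
print(one_way_anova(toy_scores))  # (18.0, 3, 8)
```

The sketch stops at the F statistic. Obtaining the p value requires the F distribution, which is not in the Python standard library; one option, assuming SciPy is available, is `scipy.stats.f.sf(f_stat, df_between, df_within)`.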
When the children of the four clusters were compared in terms of the CBCL scores, the internalizing and externalizing behavior scores as well as the total score all differed significantly (p < .001). Children in the Regulated cluster scored the lowest on all subscale and total scores, followed by children in the Average-Active and Average-Quiet clusters. Children in the Sensitive/Hyperreactive cluster scored the highest on all subscale and total scores (Table 2).
Our two-step cluster analysis identified 4 clusters. A majority of the children belonged to the Average-Quiet (Cluster 3) and Average-Active (Cluster 1) clusters. Both of these Average groups were in the middle of the four clusters in terms of the E and I subscales. In addition, Average-Quiet children scored lower in A and S, whereas Average-Active children scored higher on these subscales. That is, the children in these categories are ordinary, with the former being quieter and the latter more energetic. On the other hand, the other two clusters of children seem to have extreme traits. The children in the Regulated cluster (Cluster 2) scored lowest in E, A, and I but were very sociable. These children seem to be stable in their surroundings and friendly with other people. Sensitive/Hyperreactive children (Cluster 4) were the highest in E, A, and I but the least sociable. These children seem to be highly sensitive, emotionally unstable, and unsociable.
Table 2. Means (SDs) of EASI and CBCL scores by each cluster, and construct validity.
NS, not significant.
Figure 1. Scores of EASI subscale across the children in the four clusters.
Construct validity was examined through the associations between the temperament typology and the CBCL subscale scores. As expected, children in the Regulated cluster scored the lowest in terms of both internalized and externalized behavior problems, while children in the Sensitive/Hyperreactive cluster scored the highest. Regulated cluster children may adapt easily to a change of surroundings and therefore have few behavioral problems. In contrast, Sensitive/Hyperreactive cluster children may be extremely sensitive to environmental change, so that their confusion is expressed as various behavioral problems.
Previous studies on personality and temperament suggested 3 to 6 temperament types in childhood and adolescence (Thomas & Chess, 1977; Caspi & Silva, 1995; Robins, John, Caspi, Moffitt, & Stouthamer-Loeber, 1996; Aksan et al., 1999; Sanson et al., 2009; Prokasky, Rudasill, Molfese, Putnam, Gartstein, & Rothbart, 2017). They used different nomenclature to describe temperament types (Table 3). A possible reason for this lack of consensus is the use of different statistical methods. For example, a combination of hierarchical and k-means cluster analysis was used by Sanson et al. (2009) and Prokasky et al. (2017), configural frequency analysis by Aksan et al. (1999), and Q-factor analysis by Robins et al. (1996). As noted, a drawback of cluster analyses is the difficulty of determining the number of clusters, which often depends on the researchers’ subjective impression. The two-step cluster analysis, however, leaves this decision to predetermined statistical rules. This is a strength of our study. In the present study, we identified 4 clusters in Japanese 3- to 4-year-old children using a two-step cluster analysis, which excluded researchers’ arbitrariness.
Table 3. Comparison of temperament typologies across studies.
Despite the lack of consensus about temperament typology, there are some similarities between our study and the reports of other researchers. Sanson et al. (2009) identified four types of children: High attention regulation, Reactive/inhibited, Nonreactive/outgoing, and Poor attention regulation. Regulated and Sensitive/Hyperreactive children in our study are similar to the High attention regulation and Poor attention regulation types, respectively, in the Sanson et al. study. Similarly, children in the Average-Active and Average-Quiet clusters in our study may be close to children in the Reactive/inhibited and Nonreactive/outgoing types, respectively, in the Sanson et al. study. Three types were reported by other researchers, even across different cultures (Hart et al., 1997; Robins et al., 1996; cited by Aksan et al., 1999). These types are labeled overcontrolled, undercontrolled, and resilient. The results of our study are not completely consistent with those of previous studies; however, they are broadly compatible with them (Table 3) and yield clinically interpretable clusters.
Another possible reason for the lack of consensus about children’s temperamental typology is the difference among raters and in children’s age. In our study, children’s temperament was rated by fathers and mothers, whereas most of the previous studies used mothers or teachers as raters. Research shows that fathers and mothers often differ in assessing the behavioral traits of their own children (Leblanc & Reynolds, 1989; Kitamura, Ohashi, Minatani, Haruna, Murakami, & Goto, 2015). Rater differences may bias the results. Our study and previous studies also investigated children of different age groups. Even when the same temperament measure is used, children at different developmental stages may express the same trait through different behaviors. For instance, a year’s difference in a child’s age may cause differences in behavioral expression. We addressed these concerns by confirming configural, measurement, and structural invariance of the EASI between fathers and mothers, ages of 3 and 4 years, and boys and girls (Ohashi & Kitamura, 2019). This is another strength of this study.
Cultural difference is another issue of methodological importance. Use of the same instrument that was correctly translated does not necessarily guarantee similar responses from participants (Iwata, Roberts, & Kawakami, 1995; Iwata, Umesue, Egashira, Hiro, Mizoue, Mishima, & Nagata, 1998). Although it goes beyond the scope of this study, invariance of the instrument’s structure should be carefully examined across different cultures and languages.
As noted, previous studies searching for temperament typologies used different instruments, which were developed based on different temperament theories. Future studies should therefore consider, for example, administering different temperament measures simultaneously to the same participants so that temperament dimensions as well as typologies may be identified regardless of the theories underlying the instruments.
The identification of such groups could help us to select different parenting or intervention strategies based on typology. It is known that children’s temperaments and parenting behavior independently influence one another. Recently, it has been recognized that the effects of parenting on children’s psychopathology depend on the child’s temperament. For example, consistent parental strategies are particularly important for children who have difficulties with self-regulation (Webster-Stratton, 2006). Parents should be supported in increasing the “goodness-of-fit” between their parenting strategies and their child’s temperament (Schermerhorn & Bates, 2012).
With these limitations taken into consideration, our results were to some extent similar to those of previous studies of child typology. To the best of our knowledge, this study is the first temperament typology study in a large population of Japanese children. This study identified four clinically interpretable clusters. The advantage of the person-centered typology is that it can help us to identify groups of children with particular temperament profiles.
We are grateful to all of the participants and to the members of the Institutional Review Board of the Kitamura Institute of Mental Health Tokyo, who provided ethics advice on the net-survey.
This study was supported by JSPS KAKENHI Grant Number JP16K12170 (PI: Yukiko Ohashi).
 Aksan, N., Goldsmith, H. H., Smider, N. A., Essex, M. J., Clark, R., Hyde, J. S., Vandell, D. L. et al. (1999). Derivation and Prediction of Temperamental Types among Pre-Schoolers. Developmental Psychology, 35, 958-971.
 Asendorpf, J. B., & van Aken, M. A. G. (1999). Resilient, Overcontrolled, and Undercontrolled Personality Prototypes in Childhood: Replicability, Predictive Power, and the Trait-Type Issue. Journal of Personality and Social Psychology, 77, 815-832.
 Caspi, A., & Silva, P. A. (1995). Temperamental Qualities at Age Three Predict Personality Traits in Young Adults: Longitudinal Evidence from a Birth Cohort. Child Development, 66, 486-498. https://doi.org/10.2307/1131592
 Funabiki, Y., & Murai, T. (2017). Standardization of a Japanese Version of Child Behaviour Checklist for Ages 1½-5 and the Caregiver Teacher Report Form. Japanese Journal of Child and Adolescent Psychiatry, 58, 713-729. (In Japanese)
 Goldsmith, H. H., Buss, A. H., Plomin, R., Rothbart, M. K., Thomas, A., Chess, S., Hinde, R. A., & McCall, R. B. (1987). Roundtable: What Is Temperament? Four Approaches. Child Development, 58, 505-529. https://doi.org/10.2307/1130527
 Iwata, N., Roberts, C. R., & Kawakami, N. (1995). Japan-U.S. Comparison of Responses to Depression Scale Items among Adult Workers. Psychiatry Research, 58, 237-245.
 Iwata, N., Umesue, M., Egashira, K., Hiro, H., Mizoue, T., Mishima, N., & Nagata, S. (1998). Can Positive Affect Items Be Used to Assess Depressive Disorders in the Japanese Population? Psychological Medicine, 28, 153-158.
 Kitamura, T., Ohashi, Y., Minatani, M., Haruna, M., Murakami, M., & Goto, Y. (2015). Disagreement between Parents on Assessment of Child Temperament Traits. Pediatrics International, 57, 1090-1096. https://doi.org/10.1111/ped.12728
 Leblanc, R., & Reynolds, C. R. (1989). Concordance of Mothers’ and Fathers’ Ratings of Children’s Behavior. Psychology in the Schools, 26, 225-229.
 Ohashi, Y., & Kitamura, T. (2017). Emotionality Activity Sociability and Impulsivity (EASI). In V. Zeigler-Hill, & T. Shackelford (Eds.), Encyclopedia of Personality and Individual Differences (pp. 1-3). Berlin: Springer.
 Ohashi, Y., & Kitamura, T. (2019). The EASI: Factor Structure and Measurement and Structural Invariance between the Parent’s Gender, the Child’s Age, and Two Measurement Time Points. Psychology, 10, 2177-2189.
 Prokasky, A., Rudasill, K., Molfese, V. J., Putnam, S., Gartstein, M., & Rothbart, M. (2017). Identifying Child Temperament Types Using Cluster Analysis in Three Samples. Journal of Research in Personality, 67, 190-201. https://doi.org/10.1016/j.jrp.2016.10.008
 Robins, R. W., John, O. P., Caspi, A., Moffitt, T. E., & Stouthamer-Loeber, M. (1996). Resilient, Overcontrolled, and Undercontrolled Boys: Three Replicable Personality Types. Journal of Personality and Social Psychology, 70, 157-171.
 Sanson, A., Letcher, P., Smart, D., Prior, M., Toumbourou, J. W., & Oberklaid, F. (2009). Associations between Early Childhood Temperament Clusters and Later Psychosocial Adjustment. Merrill-Palmer Quarterly, 55, 26-54. https://doi.org/10.1353/mpq.0.0015
 Sarstedt, M., & Mooi, E. (2014). A Concise Guide to Market Research: The Process, Data, and Methods Using IBM SPSS Statistics (pp. 273-324). Berlin: Springer.
 Satish, S. M., & Bharadhwaj, S. (2010). Information Search Behaviour among New Car Buyers: A Two-Step Cluster Analysis. IIMB Management Review, 22, 5-15.
 Schermerhorn, A. C., & Bates, J. E. (2012). Temperament, Parenting and Implications for Development. Encyclopedia on Early Childhood Development. 2012-2017 CEECD/ SKC-ECD.
 SPSS (2001). The SPSS Two Step Cluster Component: A Scalable Component Enabling More Efficient Customer Segmentation. Chicago, IL: SPSS.