1.1. Background of the Present Study
Visual information, and visualization itself, plays an important role in various fields; hence, much research has been devoted to its development. Visualization techniques are used to observe phenomena that are hard to see or invisible. Visualization algorithms are classified into scientific approaches and informational ones (MacKinlay, 2000). The former include visual expressions of data that have a physical component, such as natural or biological phenomena and molecular motion. In contrast, the latter concern numerical and text data. In this study, we focus on the latter type, that is, the analysis of text data and its visualization.
1.2. Previous Studies
1.2.1. Text mining with Verbal Analysis
Verbal or text data can have a complex and varied structure, so many kinds of analysis have been developed in text and data mining, e.g., for subjects’ narrations and descriptions. For instance, descriptive data in newspaper articles (Yatsuzuka, 2007), diaries (Cavicchiolo et al., 2015), and social media posts (Cohn et al., 2004) have all been used in text mining. In addition, verbal data acquired through interviews have been analyzed by Seale et al. (2006). These data include information regarding the mental states of participants, such as emotions, attitudes, and personalities.
1.2.2. Qualitative Data Analysis
Qualitative data such as interview data and free-description data are usually processed without numerical measurement for statistical methods. This practice rests on theoretical backgrounds such as the KJ method and grounded theory. For instance, nurses or clinical psychologists with field experience process such data subjectively, although it is desirable to construct theories through objective discussion. Although this is the typical approach, qualitative research is also coming to use statistical methods. For instance, Irving et al. (2014) employed grounded theory methodology to examine how health care professionals experienced mindfulness-based medical practice (MBMP) and how participation in MBMP was perceived to be beneficial. As an example of the KJ method, Ni et al. (2017) studied the calculation of the importance of customer needs. They adopted the KJ method to clarify the cigarette customer demand contained in the voices of customers and to classify those needs into five categories.
1.2.3. Visualization of Mental States
In recent decades, individual mental states have been analyzed using materials that subjects have written or spoken. Researchers now attempt to visualize the sequential transitions of the events described. For instance, regarding disasters, Mishne & de Rijke (2006) and Miura et al. (2015) analyzed and visualized both personal and group emotions extracted from posts on a social media website. In another case, Inami et al. (2013) presented a visualization of the storylines of literary works. Because the diversity inherent in current society must be taken into account, emotional interpretation is required, and such problems should be discussed individually because of their mental aspects. Therefore, an advanced visualization technique that makes emotion easy to interpret is needed.
1.2.4. Statistical Methods for Analysis
In psychology and the information sciences, statistical methods such as factor analysis, principal component analysis (PCA), and multi-dimensional scaling (MDS) are applied to quantitative data obtained with questionnaires. For instance, factor analysis was used to clarify the structure of personality, and the Big Five model was proposed as the five component factors of personality (Digman, 1990; McCrae & Costa, 1987). Furthermore, PCA was used in text categorization (Lam & Lee, 1999), and MDS was applied to the development of a knowledge base for sentiment analysis of multiword expressions by Bajpai et al. (2017).
1.3. Proposal of the Present Study
In light of these problems, data analysis and visualization techniques that avoid a loss of precision in multi-dimensional data are considered essential for accurately and effectively expressing the varying nature of the mental states that can be extracted from descriptions and narrations. With this in mind, in the previous study (Aoki et al., 2018) we proposed a method for visualizing the mental states and changes of individuals, as read from their words, on a two-dimensional plane using self-organizing maps (SOM; Kohonen, 1995; Kanaya et al., 2001) and fuzzy cluster analysis (Shinkai, 2008). SOM is known as a primary analysis method for nonlinearly mapping multi-dimensional data into a lower-dimensional space. With this method, multi-dimensional data can be mapped onto a two-dimensional plane for visualization while maintaining their spatial relationships in the multi-dimensional space. Furthermore, we can divide the two-dimensional plane by adopting fuzzy cluster analysis and find the evaluation axes of the mapped data on the plane. By combining these nonlinear analyses, the present study reveals the evaluation axes of emotion for the multi-dimensional data and the relations between them.
As described in the previous report, these successive studies chose the modal words indicating the subjects’ mental states from the interview articles and used the TF-IDF values calculated from their frequency of appearance. Each subject’s data were classified on the two-dimensional plane (map) by SOM according to their TF-IDF values. As a result, the distribution of the modal words on the map was clarified, and fuzzy cluster analysis was carried out on these results to divide the total area of the map into small areas corresponding to each cluster. The psychological evaluation axes were found by considering the psychological meaning of the vertical, horizontal, and diagonal coordinates on the map, so that the subjects’ mental states could be understood from the analysis results.
1.4. Purpose of the Present Study
The present study was carried out against the background of the research on visualizing psychological states from text data mentioned above. Its purpose was to clarify the tendency of coordinate transitions on the psychological evaluation axes shown by individuals and groups, that is, changes of mental states. This approach enables us to chart a subject’s mental state at a certain point from a visualization result and to recognize its change as a pattern that can be associated with a specific mental change. In this approach, each subject’s coordinate transition was represented quantitatively, and fuzzy cluster analysis was then carried out to classify the patterns of transition. After that, the patterns of mental change of the subject groups were visualized and discussed.
Extracting the potential factors behind people may contribute to clarifying the theories of the unconscious originated by Bleuler, Freud, and Jung, and may offer a clue about a part of the mind that does not appear in human consciousness. People have their own unconscious world, which cannot be measured consciously, as typified by dissociative disorder and schizophrenia. Not regarding such states as an evil but integrating consciousness and unconsciousness, as Jung said, enables people to grow and accomplish things. If the potential factors are extracted, they can give us clues about the human unconscious. The results of the present study may therefore play an important role in future studies of psychology.
The present study used the verbal data collected in the previous study (Aoki et al., 2018), which were athletes’ utterances during interviews gathered from Internet websites. There were two reasons for using these data. The first is that mental training in sports has recently become more important, considering the influence of psychological factors on sports performance. Psychological assessment of athletes is also considered important. The results of the present study can be expected to be used in athletes’ mental conditioning, not only by the athletes themselves but also by coaches. The second is that the present study showed the psychological evaluation axes and each individual’s transition on these axes by analyzing the athletes’ utterances. These results enabled us to understand each subject’s mental state and change intuitively. Therefore, common patterns of transition among the subjects, which indicate collective mental changes, can be extracted from these data.
2. Emotional Mapping
2.1. Data Source
The data source used in the present report is based on the materials collected in the previous study by Aoki et al. (2018). The modal words indicating the athletes’ mental states were chosen from those articles. The procedures for data collection and the selection of modal words are summarized briefly below.
Articles were compiled from athletes’ interviews on Internet websites. The athletes chosen were sixty top players of various sports, including Olympic and Paralympic athletes. The verbal data representing the statements of each athlete were divided into paragraphs, each of which was regarded as an utterance.
2.2. Selection of Modal Words
The set of utterances was analyzed by morphological analysis using the text-mining software Tiny Text Miner (TTM; Matsumura et al., 2009). The analysis yielded 4418 morphemes, from which suitable words indicating the athletes’ mental states were chosen. According to their frequency in the utterances, the top 20 words were selected as the modal words Mi, which constitute the data source in the following process.
Table 1. Modal words Mi.
2.3. SOM Analysis and Fuzzy Clustering
The TF-IDF value was calculated for each modal word, and the obtained values were used as the input vectors in the SOM analysis. Thus, we updated the values of each representative vector and acquired the classification of the input vectors in the output layer. In the succeeding process, the similarity between the neurons on the output layer was calculated, and fuzzy cluster analysis was carried out in accordance with the representative vectors distributed over the neurons of the SOM map. The output layer of the SOM was thereby divided into ten clusters through fuzzy cluster analysis. These results supply the psychological evaluation axes on the two-dimensional map.
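The SOM step described above can be illustrated with a minimal sketch. It is not the implementation used in the study: the 5 x 6 grid (chosen only because the map is said to have thirty neurons), the learning-rate and neighborhood schedules, and the function names `winner` and `train_som` are all assumptions made for illustration. The input vectors are assumed to be TF-IDF vectors that have already been computed.

```python
import math
import random


def winner(weights, x):
    """Return the grid position whose representative vector is closest to x."""
    return min(
        weights,
        key=lambda pos: sum((a - b) ** 2 for a, b in zip(weights[pos], x)),
    )


def train_som(data, rows, cols, epochs=30, lr0=0.5, seed=0):
    """Train a small rectangular SOM (hypothetical settings, for illustration)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One representative vector per neuron, randomly initialized.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood
        for x in data:
            win = winner(weights, x)
            for pos, w in weights.items():
                # Gaussian neighborhood around the winner neuron on the grid.
                d2 = (pos[0] - win[0]) ** 2 + (pos[1] - win[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Toy input vectors standing in for TF-IDF vectors of utterances.
toy_inputs = [[0.0, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0]]
som = train_som(toy_inputs, rows=5, cols=6)
winners = [winner(som, x) for x in toy_inputs]
```

After training, each input vector is assigned to the neuron returned by `winner`, which corresponds to the classification of the input vectors in the output layer.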
2.4. Extraction of Emotional Transition
The present study emphasizes emotional transition. Here, we discuss the movement of the input vectors, which were classified into specific neurons on the map. Cluster analysis was carried out on the athletes to represent the subjects’ mental changes visually and to express their tendencies.
As described above, we obtained ten clusters on the SOM. Following the previous paper, we named the ten clusters corresponding to each region CL1 to CL10, as shown in Figure 1.
Next, we count the input vectors contained in each cluster CLt, that is, the number of input vectors of the c-th athlete in CLt.
Our previous study (Aoki et al., 2018) carried out the fuzzy cluster analysis on the updated representative vectors of the SOM, condensing the thirty neurons of the map into ten clusters. Figure 2 shows the modal words whose TF-IDF values were relatively high among the representative vector components of each neuron.
Focusing on the distribution of the modal words, we considered the psychological evaluation axes in order to clarify the correspondence between coordinates on the map and specific mental states. As a result, the four axes indicating mental states (“displeasure,” “pleasure,” “spontaneity,” “fighting spirit”) and the three axes indicating mental changes (“activation,” “deactivation,” “ebullience”) were set as the vertical, horizontal, and diagonal axes on the map.
Figure 1. Ten clusters on map of SOM.
Figure 2. Distribution of modal words on map.
In order to classify the tendencies of coordinate transitions on the psychological evaluation axes indicated by individuals and groups, an input vector was defined for each athlete. This vector expresses the athlete’s track of coordinate transitions on the map. First, we review several concepts used in defining it. After that, we explain our methods for visualizing patterns of mental changes in subject groups and discuss them.
The input vectors calculated from each athlete’s utterances were each classified by the SOM into a neuron on the map, which is called a “winner neuron.” The winner neurons can in turn be classified into the clusters on the map. Therefore, the winner neurons corresponding to the input vectors of each athlete were identified and assigned to the clusters CLt. The clusters into which an athlete’s winner neurons were classified can be ordered in accordance with the order of the athlete’s utterances. Based on this order, the transitions between clusters were visualized on the map using arrows for each athlete. These arrows were considered to indicate the range of mental change in individuals. By combining the visualization results with the psychological evaluation axes, the athletes’ mental states and changes can be interpreted in psychological terms.
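The ordering of clusters by utterance can be sketched as below. This is an illustrative sketch only: the helper names `cluster_sequence` and `transitions`, the toy neuron-to-cluster table, and the choice to draw no arrow when two consecutive utterances fall in the same cluster are assumptions, not details taken from the study.

```python
def cluster_sequence(winner_neurons, neuron_to_cluster):
    """Map an athlete's winner neurons, in utterance order, to cluster labels."""
    return [neuron_to_cluster[n] for n in winner_neurons]


def transitions(sequence):
    """Return (from, to) arrows between consecutive clusters.

    Consecutive utterances that stay in the same cluster produce no arrow
    (an assumption made for this sketch).
    """
    return [(a, b) for a, b in zip(sequence, sequence[1:]) if a != b]


# Hypothetical lookup from neuron grid positions to cluster labels.
neuron_to_cluster = {(0, 0): "CL1", (0, 1): "CL1", (1, 0): "CL9"}

# Winner neurons of one athlete, listed in the order of the utterances.
athlete_winners = [(0, 0), (0, 1), (1, 0)]

sequence = cluster_sequence(athlete_winners, neuron_to_cluster)
arrows = transitions(sequence)
```

Each pair in `arrows` corresponds to one arrow drawn on the map for that athlete.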
In addition, we adopted the concept of TF-IDF, which is used in text mining, to classify the patterns of coordinate transitions representing mental states and changes in the two-dimensional space. The present study weighted the number of input vectors in each cluster by a TF-IDF-like definition. The weighted values wt are defined from the number of input vectors of each athlete in CLt, the total number of athletes, and the number of athletes whose input vectors were classified into CLt.
The vectors of weighted values were used for fuzzy cluster analysis of the sixty athletes. These vectors indicated the clusters into which the input vectors of each athlete were classified, and were also regarded as the range of mental change.
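A TF-IDF-like weighting of this kind can be sketched as follows. The formula in the sketch is an assumption modeled on standard TF-IDF, not the definition used in the study: it multiplies the share of an athlete’s input vectors that fall in a cluster by the logarithm of the total number of athletes divided by the number of athletes appearing in that cluster. The function name `weighted_vectors` is hypothetical.

```python
import math


def weighted_vectors(counts):
    """Weight per-athlete cluster counts in a TF-IDF-like way (assumed formula).

    counts[c][t] is the number of input vectors of athlete c in cluster t.
    """
    n_athletes = len(counts)
    n_clusters = len(counts[0])
    # Number of athletes with at least one input vector in each cluster.
    athletes_in = [
        sum(1 for row in counts if row[t] > 0) for t in range(n_clusters)
    ]
    result = []
    for row in counts:
        total = sum(row)
        vec = []
        for t, n in enumerate(row):
            share = n / total if total else 0.0  # TF-like term
            rarity = (
                math.log(n_athletes / athletes_in[t]) if athletes_in[t] else 0.0
            )  # IDF-like term
            vec.append(share * rarity)
        result.append(vec)
    return result


# Three athletes, two clusters (toy counts).
toy_counts = [[2, 0], [1, 1], [0, 3]]
weights = weighted_vectors(toy_counts)
```

Under this assumed formula, a cluster visited by every athlete receives weight zero, so the vectors emphasize clusters that distinguish one athlete from another.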
3. Results and Discussion
3.1. Results of Fuzzy Cluster Analysis
Fuzzy cluster analysis formed different numbers of clusters depending on the similarity d, which is the correlation between all pairs of the athletes’ vectors. The present study focuses on the clusters formed when d was 0.70, 0.81, or 0.90 (Table 2), which showed remarkable differences in the number of clusters.
Table 2 shows that the sixty athletes formed two clusters (clusters 1 and 2) with a d value of 0.70, and fifty-eight athletes belonged to cluster 1. Where d was 0.81, forty-five athletes formed three clusters (clusters 1, 2, and 7) and the other ten clusters (clusters 3 - 6 and 8 - 13) included only one or two athletes each. Where d was 0.90, the athletes formed forty-one clusters of one to three athletes each.
The athletes classified into three clusters with a d value of 0.81 accounted for 75% of the whole dataset, and each of these clusters included more than ten athletes. Because of that, this classification is considered to reflect typical patterns in the athletes’ mental changes.
Table 2. The number of clusters and athletes by each similarity.
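How the number of clusters can depend on a similarity threshold d is illustrated by the sketch below. It uses one common form of fuzzy cluster analysis: a similarity relation is built from pairwise correlations, its max-min transitive closure is taken, and the closure is cut at d. Whether the analysis in this paper follows exactly this procedure is an assumption, as are the clamping of negative correlations to zero and all function names.

```python
import math


def correlation(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def similarity_matrix(vectors):
    """Fuzzy similarity relation; negative correlations are clamped to 0."""
    n = len(vectors)
    return [
        [1.0 if i == j else max(correlation(vectors[i], vectors[j]), 0.0)
         for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(sim):
    """Max-min transitive closure (Floyd-Warshall style)."""
    n = len(sim)
    closed = [row[:] for row in sim]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = min(closed[i][k], closed[k][j])
                if via_k > closed[i][j]:
                    closed[i][j] = via_k
    return closed


def clusters_at(closed, d):
    """Group items whose closure similarity is at least d."""
    n = len(closed)
    assigned = [False] * n
    groups = []
    for i in range(n):
        if not assigned[i]:
            members = [j for j in range(n) if closed[i][j] >= d]
            for j in members:
                assigned[j] = True
            groups.append(members)
    return groups


# Toy vectors standing in for the athletes' weighted vectors.
toy_vectors = [[1, 2, 3], [2, 4, 6.1], [3, 1, 0], [0, 0, 5]]
closure = transitive_closure(similarity_matrix(toy_vectors))
coarse = clusters_at(closure, 0.80)
fine = clusters_at(closure, 0.90)
```

Raising d splits clusters into nested sub-clusters, which matches the way the clusters at d = 0.90 are described as sub-clusters of those at d = 0.81.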
3.2. Coordinate Changes of Athletes in Each Cluster
The average vector I was calculated from the vectors of the athletes belonging to cluster 1, in which d was 0.70 (Figure 3). In the same way, the average vector I was calculated from the vectors of those belonging to clusters 1, 2, and 7, wherein d was 0.81 (Figure 4). Only one or two athletes were classified into the other clusters, so we primarily analyzed and discussed the abovementioned four clusters.
When d was 0.70, the average vector components composing the vector I in cluster 1 were all less than 0.10, and there was no significant difference between these values.
Figure 3. Average vector components wt in cluster 1 (d = 0.70).
Figure 4. Average vector components wt in three clusters (d = 0.81).
When d was 0.81, the average vector components in clusters 1, 2, and 7 all showed different tendencies. In cluster 1, the value of w4 in CL4 was greater than the others. This result shows that the input vectors of athletes in cluster 1 were mainly classified into CL4 on the map. Several representative vector components showed higher values in CL4, corresponding to modal words such as “win,” “superior (M12),” “goal (M1),” and “happy (M6).” Therefore, the athletes in cluster 1 (d = 0.81) were considered to mention specific goals, such as victory in competition, more often during the interviews.
In cluster 2, the value of w9 in CL9 was high relative to the others. Several representative vector components showed large values in CL9, corresponding to modal words such as “enjoy (M5),” “confidence (M9),” “regret,” and “hard (M3).” Therefore, the athletes in cluster 2 (d = 0.81) were considered to focus on emotions related to sports, mainly taking pleasure in them.
In cluster 7, the values of w5 in CL5 and w8 in CL8 were greater than the others. Several representative vector components showed large values in CL5, corresponding to modal words such as “bad (M7),” “mistake (M14),” and “concentration (M18).” Similarly, several components showed large values in CL8, corresponding to modal words such as “hard (M3),” “bad (M7),” and “enjoy (M5).” Therefore, the athletes in cluster 7 (d = 0.81) were considered to focus on mistakes and difficulties in competition.
Furthermore, the average vectors I were calculated from the vectors of the athletes belonging to the following twelve clusters in which d was 0.90 (Figure 5, Figure 6). These were sub-clusters of clusters 1, 2, and 7 (where d was 0.81). Cluster 1 was composed of sub-clusters 1, 4, and 9; cluster 2 contained sub-clusters 2, 10, and 13; and cluster 7 comprised sub-clusters 8, 15, 19, 20, 22, and 24.
Focusing on clusters 1, 4, and 9 in Figure 5, w4 in CL4 is the largest value among these clusters. In the same way, w9 in CL9 is the largest value among clusters 2, 10, and 13. These clusters showed the same tendency as their parent clusters, clusters 1 and 2 at d = 0.81.
In the other clusters (clusters 8, 15, 19, 20, 22, and 24), w5 in CL5 and w8 in CL8 have relatively high values, as do w4 in CL4 and w7 in CL7. Therefore, the athletes in these clusters were considered to use a variety of modal words during the interviews, and their mental states and changes can be characterized by various factors.
3.3. Mental Changes Shown by the Athletes Belonging to Each Cluster
In Figures 7-10, we overlaid the visualization images of the cluster transitions of the athletes belonging to the same cluster in Table 2: cluster 1 where d was 0.70 (Figure 7), and clusters 1, 2, and 7 where d was 0.81 (Figures 8-10). These figures enable us to recognize the tendencies of transitions on the map shown by the athletes in these clusters, that is, common patterns of mental changes in the group. Furthermore, each figure is accompanied by the images of its sub-clusters to facilitate more detailed consideration of the patterns of mental changes.
Figure 5. Average vector components wt in six clusters (d = 0.90).
Figure 6. Average vector components wt in six clusters (d = 0.90).
Figure 7. Visualization of cluster 1 as d was 0.70. (a) Cluster 1 (d = 0.70); (b) Cluster 1 (d = 0.81); (c) Cluster 2 (d = 0.81); (d) Cluster 7 (d = 0.81).
Figure 8. Visualization of cluster 1 as d was 0.81. (a) Cluster 1 (d = 0.81); (b) Cluster 1 (d = 0.90); (c) Cluster 4 (d = 0.90); (d) Cluster 9 (d = 0.90).
Cluster 1 (d = 0.70) was divided into thirteen sub-clusters; clusters 1, 2, and 7 (d = 0.81) included more than ten athletes each. These clusters indicated different tendencies of coordinate transitions on the map. The transition on the activation/deactivation axis was mainly observed in cluster 1. This tendency was related to the athletes’ use of the modal words “goal,” “win,” and so on. The athletes in cluster 1 were considered to focus on goal achievement in competition, which indicated their interest in external objects and a conscious aspect of inner worlds.
The transition from CL1 to CL9 was mainly demonstrated by cluster 2, which was considered to indicate increased spontaneity and pleasure. This tendency was related to the athletes’ use of the modal words “enjoy,” “confidence,” and so on. The athletes in cluster 2 might pay attention to positive emotions in competitive life, which was considered to reflect their interest in their own inner worlds, especially the unconscious aspect.
Transition on the displeasure axis was mainly shown in cluster 7. This tendency was related to the athletes’ use of the modal words “hard,” “bad,” and so on. The athletes in cluster 7 were also considered to focus on their own inner worlds with an unconscious aspect.
Figure 9. Visualization of cluster 2 as d was 0.81. (a) Cluster 2 (d = 0.81); (b) Cluster 2 (d = 0.90); (c) Cluster 10 (d = 0.90); (d) Cluster 13 (d = 0.90).
Figure 10. Visualization of cluster 7 as d was 0.81. (a) Cluster 7 (d = 0.81); (b) Cluster 8 (d = 0.90); (c) Cluster 15 (d = 0.90); (d) Cluster 19 (d = 0.90); (e) Cluster 20 (d = 0.90); (f) Cluster 22 (d = 0.90); (g) Cluster 24 (d = 0.90).
Comparing the above Figures 8-10, clusters 1 and 2 showed the direction of coordinate transitions clearly. Their sub-clusters also showed transitions in the same direction. In contrast, cluster 7 and its sub-clusters indicated transitions in various directions, although the transition on the displeasure axis showed a common pattern. Such variation of the transitions in cluster 7 was considered to reflect the athletes’ mental instability related to feelings of displeasure.
As for the seven axes symbolizing one’s inner state, the axes of “pleasure” and “displeasure” represented unconsciousness relatively close to instinct, lying behind the axis of spontaneity, which was recognized as consciousness. The axes of “activation” and “deactivation,” representing mental energy, might correspond to the preconscious as proposed by Freud, and the axis of ebullience might be the trickster, representing the collective unconscious as proposed by Jung.
Considering the above results, the subjects’ transitions on the psychological evaluation axes can be classified into three types: transitions on the activation/deactivation axis, on the spontaneity and pleasure axes, and on the displeasure axis. In addition, these results show that the transition on the displeasure axis reflects the unconscious aspect of mental changes, while the others reflect the conscious or preconscious aspects. The knowledge gained from the present study can be used to consider mental support for the subjects.
This series of studies proposed an effective method for providing evaluation axes of emotion in two-dimensional space by combining nonlinear analyses, namely SOM and fuzzy cluster analysis, rather than using conventional statistical analysis. In the present paper, the modal patterns of emotional transition that the subjects showed on the map were extracted and visualized. In particular, we paid attention to the transitions on various evaluation axes, which led to a deeper understanding of the tendencies of mental changes among the subjects.
We showed the feasibility of presenting psychological axes for evaluating the mind, integrating consciousness and the immeasurable unconscious in the human mind. The results obtained with fuzzy cluster analysis demonstrated the mental states of the current groups, and we regarded the classified results as the ratio of consciousness to unconsciousness deep in the mind. The results indicate the fluctuation of the athletes’ mental states and the differences among them. For instance, cluster 1 (d = 0.81) focused on a conscious aspect of mental states and transitions. Cluster 2 (d = 0.81) indicated that the athletes focused on their own unconscious aspects, such as pleasure and displeasure.
Consequently, the present processing would be adaptable and useful for clarifying subjects’ unconscious minds by analyzing verbal and textual data collected in various fields. For instance, coaches of team sports can effectively grasp the tendencies of the mental states and changes of the athletes belonging to their team by analyzing their utterances during interviews, descriptions in daily reports, and so on.
Finally, we discuss future issues concerning the proposed method. This series of studies was carried out on specific athletes, so the results, namely the psychological evaluation axes and the patterns of transition on those axes, cannot necessarily be applied to other subjects. It is necessary to accumulate and compare analysis results of verbal data obtained from a wide range of subjects in order to extract universal evaluation axes and patterns of transition.
Conflicts of Interest
The authors declare no conflicts of interest regarding the publication of this paper.